fastavro

Fast iteration of AVRO files

The current Python avro package is packed with features but dog slow.

On a test case of about 10K records, it takes about 14 sec to iterate over all of them. In comparison, the Java avro SDK does it in about 1.9 sec.

fastavro is less feature complete than avro, but it’s much faster. It iterates over the same 10K records in 2.9 sec, and with PyPy it does so in 1.5 sec (to be fair, the Java benchmark is doing some extra JSON encoding/decoding).

If the optional C extension (generated by Cython) is available, fastavro is faster still: the same 10K records take about 1.7 sec.
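
Timings like these can be approximated with a small harness. The following is a minimal sketch, not the author's benchmark: time_iteration is a hypothetical helper that consumes any iterable and reports how many items it saw and how long that took. It uses a plain range as stand-in data so it runs without an Avro file.

```python
import time


def time_iteration(iterable):
    """Consume an iterable and return (item_count, elapsed_seconds).

    Hypothetical helper for illustration; not part of fastavro.
    """
    start = time.perf_counter()
    count = 0
    for _ in iterable:
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


# Stand-in data so the sketch runs without an Avro file.
# For a real measurement, replace the range with something like:
#     with open('weather.avro', 'rb') as fo:
#         count, elapsed = time_iteration(fastavro.reader(fo))
count, elapsed = time_iteration(range(10000))
print("%d records in %.3f sec" % (count, elapsed))
```

Results depend heavily on the machine, the file, and the interpreter, so numbers from a harness like this are only comparable to each other, not to the figures quoted above.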

Usage

Reading

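import fastavro as avro


def process_record(record):
    # Placeholder: replace with your own per-record handling.
    print(record)


with open('weather.avro', 'rb') as fo:
    reader = avro.reader(fo)
    schema = reader.schema  # schema stored in the file

    for record in reader:
        process_record(record)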

Writing

from fastavro import writer

schema = {
    'doc': 'A weather reading.',
    'name': 'Weather',
    'namespace': 'test',
    'type': 'record',
    'fields': [
        {'name': 'station', 'type': 'string'},
        {'name': 'time', 'type': 'long'},
        {'name': 'temp', 'type': 'int'},
    ],
}

records = [
    {u'station': u'011990-99999', u'temp': 0, u'time': 1433269388},
    {u'station': u'011990-99999', u'temp': 22, u'time': 1433270389},
    {u'station': u'011990-99999', u'temp': -11, u'time': 1433273379},
    {u'station': u'012650-99999', u'temp': 111, u'time': 1433275478},
]

with open('weather.avro', 'wb') as out:
    writer(out, schema, records)
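
The schema above constrains every record: station must be a string, time an Avro long (64-bit signed), and temp an Avro int (32-bit signed). As a minimal sketch, assuming a flat record schema with only these primitive types, a pre-write sanity check could look like the following. find_problems and CHECKS are hypothetical names; this is not fastavro's own validation and it does not cover the full Avro type system.

```python
# Hypothetical pre-write check; not part of fastavro, not full Avro validation.
INT_MIN, INT_MAX = -2**31, 2**31 - 1      # Avro 'int' is a 32-bit signed integer
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1    # Avro 'long' is a 64-bit signed integer


def _is_integer(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


CHECKS = {
    'string': lambda v: isinstance(v, str),
    'int': lambda v: _is_integer(v) and INT_MIN <= v <= INT_MAX,
    'long': lambda v: _is_integer(v) and LONG_MIN <= v <= LONG_MAX,
}


def find_problems(schema, record):
    """Return a list of problems; the list is empty if the record looks fine.

    Only handles flat record schemas whose fields use the types in CHECKS.
    """
    problems = []
    for field in schema['fields']:
        name, avro_type = field['name'], field['type']
        if name not in record:
            problems.append('missing field: %s' % name)
        elif not CHECKS[avro_type](record[name]):
            problems.append(
                'bad value for %s (%s): %r' % (name, avro_type, record[name]))
    return problems


schema = {
    'name': 'Weather',
    'type': 'record',
    'fields': [
        {'name': 'station', 'type': 'string'},
        {'name': 'time', 'type': 'long'},
        {'name': 'temp', 'type': 'int'},
    ],
}

good = {'station': '011990-99999', 'temp': 0, 'time': 1433269388}
bad = {'station': '011990-99999', 'temp': 2**31}  # no 'time', temp too large

print(find_problems(schema, good))
print(find_problems(schema, bad))
```

Running a check like this before calling writer gives a readable message per record instead of a failure partway through the output file.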

You can also use the fastavro script from the command line to dump avro files.

fastavro weather.avro

By default fastavro prints one JSON object per line; use the --pretty flag to pretty-print the output instead.
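
Because the default output is one JSON object per line, it can be consumed with the standard json module. The sketch below assumes that format; parse_json_lines is a hypothetical helper, and sample_output is illustrative text shaped like the records from the Writing example rather than captured output from the tool.

```python
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON into a list, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Illustrative input only; real text would come from running
# `fastavro weather.avro` and capturing its standard output.
sample_output = (
    '{"station": "011990-99999", "time": 1433269388, "temp": 0}\n'
    '{"station": "011990-99999", "time": 1433270389, "temp": 22}\n'
)

records = parse_json_lines(sample_output)
print(records[1]["temp"])
```

Pretty-printed output spans several lines per record, so a line-by-line parser like this one only applies to the default mode.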

You can also dump the avro schema:

fastavro --schema weather.avro

Here’s the full command line help:

usage: fastavro [-h] [--schema] [--codecs] [--version] [-p] [file [file ...]]

iter over avro file, emit records as JSON

positional arguments:
  file          file(s) to parse

optional arguments:
  -h, --help    show this help message and exit
  --schema      dump schema instead of records
  --codecs      print supported codecs
  --version     show program's version number and exit
  -p, --pretty  pretty print json

Limitations

No reader schema
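
Without a reader schema, records presumably come back shaped by the schema stored in the file, so any reshaping (dropping fields, supplying defaults for missing ones) has to happen after decoding. A minimal sketch of that post-processing follows. project is a hypothetical helper that works on plain dicts; it is not Avro schema resolution and does no type promotion.

```python
def project(record, fields, defaults=None):
    """Return a new dict with only `fields`, filling gaps from `defaults`.

    Hypothetical helper; plain dict manipulation, not Avro schema resolution.
    """
    defaults = defaults or {}
    projected = {}
    for name in fields:
        if name in record:
            projected[name] = record[name]
        elif name in defaults:
            projected[name] = defaults[name]
        else:
            raise KeyError('no value or default for field %r' % name)
    return projected


record = {'station': '011990-99999', 'temp': 22, 'time': 1433270389}

# Keep a subset of the decoded fields.
print(project(record, ['station', 'temp']))

# Ask for a field the file does not have and supply a default for it.
# 'humidity' is an invented field name used only for this illustration.
print(project(record, ['station', 'humidity'], defaults={'humidity': None}))
```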

Hacking

As recommended by Cython, the generated C files are distributed with the package. The advantage is that end users do not need Cython installed. The cost is that every time you change fastavro/pyfastavro.py you need to run make.

For make to succeed you need both python and python3 installed, with Cython available to each. For ./test-install.sh you’ll need virtualenv.
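
An optional compiled extension is commonly wired in with an import fallback: try the fast module first, and use the pure-Python one if it is not available. The README does not show how fastavro itself does this, so the following is only a sketch of the general pattern. load_implementation and the module name _hypothetical_speedups are invented for illustration, and json stands in for a pure-Python fallback so the example runs anywhere.

```python
import importlib


def load_implementation(fast_name, fallback_name):
    """Import `fast_name` if possible, otherwise import `fallback_name`.

    Returns (module, used_fast).
    """
    try:
        return importlib.import_module(fast_name), True
    except ImportError:
        return importlib.import_module(fallback_name), False


# '_hypothetical_speedups' stands in for a compiled extension that is not
# installed; 'json' stands in for the pure-Python fallback module.
module, used_fast = load_implementation('_hypothetical_speedups', 'json')
print(module.__name__, used_fast)
```

With this pattern, callers use the same names either way, and only the speed differs.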

Builds

We’re currently using Travis CI.

Changes

See the ChangeLog

Contact

Miki Tebeka <miki.tebeka@gmail.com> https://bitbucket.org/tebeka/fastavro

Release history

1.9.4

Feb 13, 2024

1.9.3

Jan 9, 2024

1.9.2

Dec 21, 2023

1.9.1

Dec 6, 2023

1.9.0

Oct 28, 2023

1.8.4

Oct 3, 2023

1.8.3

Sep 7, 2023

1.8.2

Jul 19, 2023

1.8.1

Jul 17, 2023

1.8.0

Jul 6, 2023

1.7.4

May 4, 2023

1.7.3

Mar 8, 2023

1.7.2

Feb 22, 2023

1.7.1

Jan 27, 2023

1.7.0

Oct 26, 2022

1.6.1

Sep 9, 2022

1.6.0

Aug 15, 2022

1.5.4

Jul 29, 2022

1.5.3

Jul 19, 2022

1.5.2

Jun 27, 2022

1.5.1

Jun 8, 2022

1.5.0

Jun 7, 2022

1.4.12

May 18, 2022

1.4.11

Apr 27, 2022

1.4.10

Mar 4, 2022

1.4.9

Jan 8, 2022

1.4.8

Dec 27, 2021

1.4.7

Oct 29, 2021

1.4.6

Oct 23, 2021

1.4.5

Sep 23, 2021

1.4.4

Jul 22, 2021

1.4.3

Jul 16, 2021

1.4.2

Jun 28, 2021

1.4.1

May 18, 2021

1.4.0

Apr 16, 2021

1.3.5

Mar 31, 2021

1.3.4

Mar 20, 2021

1.3.3

Mar 13, 2021

1.3.2

Feb 14, 2021

1.3.1

Feb 6, 2021

1.3.0

Jan 21, 2021

1.2.4

Jan 17, 2021

1.2.3

Dec 24, 2020

1.2.2

Dec 23, 2020

1.2.1

Dec 2, 2020

1.2.0

Nov 19, 2020

1.1.1

Nov 17, 2020

1.1.0

Oct 30, 2020

1.0.0.post1

Aug 26, 2020

1.0.0 yanked

Aug 23, 2020

Reason this release was yanked:

Missing python_requires to prevent Python 2 users from installing

0.24.2

Aug 18, 2020

0.24.1

Aug 16, 2020

0.24.0

Jul 30, 2020

0.23.6

Jul 12, 2020

0.23.5

Jun 22, 2020

0.23.4

May 15, 2020

0.23.3

Apr 29, 2020

0.23.2

Apr 18, 2020

0.23.1

Apr 3, 2020

0.23.0

Mar 23, 2020

0.22.13

Mar 3, 2020

0.22.12

Feb 27, 2020

0.22.11

Feb 26, 2020

0.22.10

Feb 24, 2020

0.22.9

Dec 20, 2019

0.22.8

Dec 16, 2019

0.22.7

Nov 6, 2019

0.22.6

Nov 3, 2019

0.22.5

Sep 19, 2019

0.22.4

Aug 26, 2019

0.22.3

Jul 12, 2019

0.22.2

Jun 28, 2019

0.22.1

Jun 14, 2019

0.22.0

Jun 13, 2019

0.21.24

May 29, 2019

0.21.23

May 7, 2019

0.21.22

Apr 27, 2019

0.21.21

Apr 19, 2019

0.21.20

Apr 3, 2019

0.21.19

Mar 3, 2019

0.21.18

Feb 14, 2019

0.21.17

Jan 22, 2019

0.21.16

Dec 22, 2018

0.21.15

Dec 10, 2018

0.21.14

Nov 17, 2018

0.21.13

Nov 12, 2018

0.21.12

Nov 1, 2018

0.21.11

Oct 30, 2018

0.21.10

Oct 25, 2018

0.21.9

Oct 9, 2018

0.21.8

Sep 25, 2018

0.21.7

Sep 17, 2018

0.21.6

Sep 16, 2018

0.21.5

Sep 4, 2018

0.21.4

Jul 25, 2018

0.21.3

Jul 12, 2018

0.21.2

Jul 12, 2018

0.21.1

Jul 10, 2018

0.21.0

Jul 9, 2018

0.21.0rc1 pre-release

Jul 6, 2018

0.20.0

Jul 3, 2018

0.19.9

Jun 29, 2018

0.19.8

Jun 27, 2018

0.19.7

Jun 13, 2018

0.19.6

May 31, 2018

0.19.5

May 29, 2018

0.19.4

May 22, 2018

0.19.3

May 20, 2018

0.19.2

May 18, 2018

0.19.1

May 16, 2018

0.19.0

May 14, 2018

0.18.2

May 8, 2018

0.18.1

May 2, 2018

0.18.0

Apr 27, 2018

0.17.10

Mar 30, 2018

0.17.9

Mar 1, 2018

0.17.8

Feb 12, 2018

0.17.7

Feb 1, 2018

0.17.6

Feb 1, 2018

0.17.5

Jan 24, 2018

0.17.4

Jan 22, 2018

0.17.3

Jan 19, 2018

0.17.2

Jan 18, 2018

0.17.1

Dec 27, 2017

0.17.0

Dec 26, 2017

0.16.7

Dec 24, 2017

0.16.6

Dec 13, 2017

0.16.5

Dec 12, 2017

0.16.4

Dec 5, 2017

0.16.3

Nov 29, 2017

0.16.1

Nov 27, 2017

0.14.11

Nov 7, 2017

0.14.10

Sep 18, 2017

0.14.9

Sep 10, 2017

0.14.8

Sep 2, 2017

0.14.7

Aug 8, 2017

0.14.6

Aug 2, 2017

0.14.5

Jul 15, 2017

0.14.3

Jun 24, 2017

0.14.2

Jun 8, 2017

0.14.1

Jun 7, 2017

0.14.0

Jun 3, 2017

0.13.0

May 2, 2017

0.12.2

Apr 19, 2017

0.12.1

Dec 9, 2016

0.12.0

Dec 8, 2016

0.11.1

Nov 25, 2016

0.11.0

Oct 20, 2016

0.10.2

Aug 1, 2016

0.10.1

Jul 3, 2016

0.9.11

Jun 12, 2016

0.9.10

Jun 7, 2016

0.9.9

Feb 13, 2016

0.9.8

Jan 15, 2016

0.9.7

Dec 27, 2015

0.9.6

Oct 14, 2015

0.9.5

Oct 4, 2015

0.9.4

Oct 1, 2015

0.9.3

Sep 1, 2015

0.9.2

Aug 25, 2015

0.9.1

Aug 21, 2015

0.9.0

Aug 20, 2015

0.8.8

Aug 18, 2015

0.8.7

Aug 15, 2015

0.8.6

Aug 13, 2015

0.8.5

Aug 5, 2015

0.8.4

Aug 3, 2015

0.8.3

Jul 16, 2015

0.8.2

Jul 14, 2015

This version

0.8.1

Jun 2, 2015

0.8.0

May 4, 2015

0.7.10

Apr 28, 2015

0.7.9

Aug 28, 2014

0.7.8

May 20, 2014

0.7.7

Mar 27, 2013

0.7.6

Mar 27, 2013

0.7.5

Mar 23, 2013

0.7.4

Mar 2, 2013

0.7.3

Feb 19, 2013

0.7.2

Jan 8, 2013

0.7.1

Dec 11, 2012

0.7.0

Dec 11, 2012

0.6.10

Oct 6, 2012

0.6.9

Oct 6, 2012

0.6.8

Jul 14, 2012

0.6.7

Apr 29, 2012

0.6.6

Apr 29, 2012

0.6.5

Apr 25, 2012

0.6.4

Apr 14, 2012

0.6.3

Mar 16, 2012

0.6.2

Mar 13, 2012

0.6.1

Mar 12, 2012

0.6.0

Mar 10, 2012

0.5.0

Feb 23, 2012

0.4.2

Jan 30, 2012

0.4.1

Jan 25, 2012

0.4.0

Jan 25, 2012

0.3.2

Jan 24, 2012

0.3.1

Jan 24, 2012

0.3.0

Jan 23, 2012

0.2.2

Jan 11, 2012

0.2.1

Jan 11, 2012

0.2.0

Jan 11, 2012

0.1.0

Jan 5, 2012

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fastavro-0.8.1.tar.gz (168.1 kB)

Uploaded Jun 2, 2015 Source

Built Distributions

fastavro-0.8.1-py3.4-linux-x86_64.egg (599.4 kB)

Uploaded Jun 2, 2015 Source

fastavro-0.8.1-py2.7-linux-x86_64.egg (193.2 kB)

Uploaded Jun 2, 2015 Source

Hashes for fastavro-0.8.1.tar.gz

Algorithm	Hash digest
SHA256	`fa30ba0c00e8882162b5b40533b2d007046adb34797c6e52d6835d987cd234c7`
MD5	`d816aba439f8f085956392617b996571`
BLAKE2b-256	`a5ed1baca57d2d32e1bc17284f96d3818f4cab43c7c24dd096be1782a07db955`

Hashes for fastavro-0.8.1-py3.4-linux-x86_64.egg

Algorithm	Hash digest
SHA256	`733bbe9efa707fb077cdac7109bcb833e1e3fc26ab88a70be8805f951b009cdc`
MD5	`fd9a7153a38325c9a8623010328837f1`
BLAKE2b-256	`c2b909c9c2012d88794681e3ad0fe0b2f26807b18d2b053dbf80c6dc21926015`

Hashes for fastavro-0.8.1-py2.7-linux-x86_64.egg

Algorithm	Hash digest
SHA256	`18cc09b698e8f4866d3f1d3fbd03298e313caf1f667c5ab0b0b5ce496377e8a5`
MD5	`d10c0b5acf1b400921afc6cb6ecdb2ca`
BLAKE2b-256	`db89737f72b07a1ec1ab58076878134558a7b653e01408d4f38e68d81d48beb2`