Avro reader for Dask.
Dask-Avro
Free software: MIT license
Documentation: https://dask-avro.readthedocs.org.
Python versions: 2.7, 3.5+
Features
This project provides an Avro format reader for Dask. It offers a convenient function to read one or more Avro files and partition them arbitrarily.
Quickstart
Usage:
    import dask.bag
    import dask_avro

    delayeds = dask_avro.read_avro("data-*.avro", blocksize=2**26)
    data = dask.bag.from_delayed(delayeds)
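To give a feel for what the blocksize argument in the usage above means, here is a minimal sketch of plain byte-range arithmetic. The helper name block_ranges and the 150 MiB file size are hypothetical and chosen only for illustration. This is not dask_avro's implementation, and how the library aligns partitions with Avro block boundaries is not described here.

```python
def block_ranges(file_size, blocksize=2**26):
    """Return (offset, length) pairs that cover a file of file_size bytes.

    Hypothetical helper for illustration only; it is not part of dask_avro.
    """
    if file_size < 0 or blocksize <= 0:
        raise ValueError("file_size must be >= 0 and blocksize must be > 0")
    return [
        (offset, min(blocksize, file_size - offset))
        for offset in range(0, file_size, blocksize)
    ]


# 2**26 bytes is 64 MiB, the value passed as blocksize in the usage example.
print(2**26 // 2**20)  # 64

# An assumed 150 MiB file is covered by three ranges: 64 MiB, 64 MiB, 22 MiB.
for offset, length in block_ranges(150 * 2**20):
    print(offset, length)
```

Under these assumptions, a smaller blocksize yields more, smaller partitions per file, and a larger one yields fewer, larger partitions.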
Credits
This package was created with Cookiecutter and the rmax/cookiecutter-pypackage project template.
History
0.3.0 (2018-06-16)
Fixed support for latest fastavro release.
Require fastavro>=0.17.
0.2.1 (2018-06-15)
Pin fastavro version to <=0.19.6 due to breaking changes.
0.2.0 (2018-02-12)
Added support for fastavro 0.16+.
0.1.2 (2018-02-12)
Fix compatibility with dask 0.17.0.
0.1.1 (2018-01-18)
Pin fastavro version to <0.16, as the latest versions do not allow use of the internal C-based _iter_avro function.
0.1.0 (2017-02-02)
First release on PyPI.