Python Driver for Apache Drill.
Project description
pydrill
Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage
Free software: MIT license
Documentation: https://pydrill.readthedocs.org.
Features
Python 2/3 compatibility,
Support for all REST API calls, including profiles/options/metrics (the docs have the full list),
Mapping Results to internal python types,
Compatibility with Pandas data frame,
Drill Authentication using PAM.
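For readers curious what a REST call to Drill looks like underneath a driver, here is a minimal sketch using only the standard library. It builds (but does not send) a request for Drill's `/query.json` endpoint. The helper name `build_query_request` is hypothetical and is not part of pydrill; this is an illustration of the REST API, not pydrill's actual implementation.

```python
import json
from urllib.request import Request


def build_query_request(sql, host="localhost", port=8047):
    """Build (but do not send) a POST request for Drill's /query.json endpoint.

    Hypothetical helper for illustration only; not part of pydrill.
    """
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return Request(
        url="http://%s:%d/query.json" % (host, port),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request("SELECT * FROM sys.version")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With a Drill instance running, passing the request to `urllib.request.urlopen` would submit the query; the driver handles this, plus result mapping, for you.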
Installation
Version from https://pypi.python.org/pypi/pydrill:
$ pip install pydrill
Latest version from git:
$ pip install git+https://github.com/PythonicNinja/pydrill.git
Sample usage
from pydrill.client import PyDrill
from pydrill.exceptions import ImproperlyConfigured

# Connect to the Drill REST endpoint
drill = PyDrill(host='localhost', port=8047)

# Fail early if Drill is not reachable
if not drill.is_active():
    raise ImproperlyConfigured('Please run Drill first')

yelp_reviews = drill.query('''
  SELECT * FROM
  `dfs.root`.`./Users/macbookair/Downloads/yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_review.json`
  LIMIT 5
''')

# Each result is a mapping of column name to value
for result in yelp_reviews:
    print("%s: %s" % (result['type'], result['date']))

# pandas dataframe
df = yelp_reviews.to_dataframe()
print(df[df['stars'] > 3])
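To experiment with the result-handling part of the sample without a running Drill instance, the sketch below works on a hand-written response body. It assumes the response has `columns` and `rows` keys and that values arrive as strings; the sample rows are invented for illustration, and `convert_rows` is a hypothetical helper, not pydrill's API. It mimics the idea of passing a dtype when converting results, then applies the same `stars > 3` filter in plain Python.

```python
import json

# Hypothetical response body, shaped like a Drill REST query result;
# the values are invented for illustration.
SAMPLE_RESPONSE = '''
{
  "columns": ["type", "date", "stars"],
  "rows": [
    {"type": "review", "date": "2011-01-26", "stars": "5"},
    {"type": "review", "date": "2011-07-27", "stars": "2"},
    {"type": "review", "date": "2012-06-14", "stars": "4"}
  ]
}
'''


def convert_rows(rows, dtype):
    """Return copies of rows with the columns named in dtype converted."""
    converted = []
    for row in rows:
        new_row = dict(row)
        for column, caster in dtype.items():
            new_row[column] = caster(new_row[column])
        converted.append(new_row)
    return converted


data = json.loads(SAMPLE_RESPONSE)
rows = convert_rows(data["rows"], dtype={"stars": int})

# Same loop as in the sample usage
for result in rows:
    print("%s: %s" % (result["type"], result["date"]))

# Plain-Python equivalent of df[df['stars'] > 3]
high_rated = [row for row in rows if row["stars"] > 3]
print(high_rated)
```

Without the conversion step, comparing the string `"5"` with the integer `3` would raise a `TypeError` in Python 3, which is why converting column types before filtering matters.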
History
0.3.4 (2017-04-24)
Updated pypi listing long_description
0.3.3 (2017-04-24)
Fix pypi installation
0.3.2 (2017-04-18)
Support for dtype on to_dataframe
0.3.1 (2017-03-06)
Support for Drill Authentication using PAM
0.3 (2017-02-15)
requests response encoding (utf-8)
Python 3.6 support
0.1.1 (2016-05-21)
Anaconda requirements fixed
0.1.0 (2016-05-19)
First minor release
Updated docs
0.0.2 (2016-04-24)
First release on PyPI.
Implementation of metrics/storage/options/stats
Builds are tested in a Docker container with Apache Drill running
Support for pandas with ResultQuery.to_dataframe
0.0.1 (2015-12-28)
Project start