Skip to main content

Bearsql aadds sql syntax on pandas dataframe. It uses duckdb to speedup the pandas processing and as the sql engine

Project description

bearsql

https://img.shields.io/pypi/v/bearsql.svg Documentation Status

Bearsql adds sql syntax on pandas dataframe. It uses duckdb to speedup the pandas processing and as the sql engine

Basic Usage

To use bearsql in a project:

from bearsql import SqlContext
import pandas as pd

sc = SqlContext()
# The above statement will create duckdb instance in memory. Once the session ends, the database will be erased and not be persisted
# To persist the database, you can instantiate sqlcontext like:
# sc = SqlContext(database='<YOUR_DATABASE_NAME>.db'

df = pd.DataFrame([{'name': 'John Doe', 'city': 'New York', 'age': 24}, {'name': 'Jane Doe', 'city': 'Chicago', 'age': 27}])

# Create table from pandas dataframe
sc.register_table(df, 'testable') # <YOUR_TABLENAME> instead of 'testable'

# Query table and output to pandas dataframe
results = sc.sql('select * from testable', output='df')
output_df = next(results)
print(output_df)

# Query table and output to pyarrow table
results = sc.sql('select * from testable', output='arrow')
output_arrow_table = next(results)
print(output_arrow_table)

# Query table and output raw tuples
results = sc.sql('select * from testable', output='any')
output_rows = next(results)
print(output_rows)

Create a relational table from dataframe and apply some operations:

rel = sc.relation(df, 'new_relation') # <YOUR_RELATION_NAME> instead of new_relation

print(rel.filter('age > 24'))

# OR convert to df:

rel.filter('age > 24').df()

Export the data to filesystem:

result = sc.sql('EXPORT DATABASE \'<OUTPUT_FOLDER>\' (FORMAT PARQUET);') # format can either be PARQUET or CSV
list(result)

For more examples, please visit https://github.com/duckdb/duckdb/blob/master/examples/python/duckdb-python.py

Features

  • TODO

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2021-06-30)

  • First release on PyPI.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

bearsql-0.1.0-py2.py3-none-any.whl (6.8 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page