Welcome to s3iotools Documentation
Usage
import boto3
import pandas as pd
from s3iotools.io.dataframe import S3Dataframe
session = boto3.Session(profile_name="xxx")
s3 = session.resource("s3")
bucket_name = "my-bucket"
s3df = S3Dataframe(s3_resource=s3, bucket_name=bucket_name)
s3df.df = pd.DataFrame(...)
s3df.to_csv(key="data.csv") # write the dataframe to S3 as plain CSV
s3df.to_csv(key="data.csv.gz", gzip_compressed=True) # write it gzip-compressed
s3df_new = S3Dataframe(s3_resource=s3, bucket_name=bucket_name, key="data.csv")
s3df_new.read_csv()
s3df_new.df # access data
s3df_new = S3Dataframe(s3_resource=s3, bucket_name=bucket_name, key="data.csv.gz")
s3df_new.read_csv(gzip_compressed=True)
s3df_new.df # access data
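In the CSV examples, the gzip_compressed flag passed to read_csv has to agree with how the object was written. One way to keep the two in sync is to derive the flag from the key suffix. The helper below is a minimal sketch: is_gzip_key is a hypothetical name, not part of s3iotools, and it assumes that compressed objects are always stored under keys ending in ".gz", as in the examples.

```python
def is_gzip_key(key: str) -> bool:
    """Return True if the S3 key looks like a gzip-compressed object.

    Assumption: compressed objects use a ".gz" suffix, as in the examples.
    """
    return key.lower().endswith(".gz")


# Hypothetical usage with the objects from the examples (not run here,
# because it needs AWS credentials and an existing bucket):
#   key = "data.csv.gz"
#   s3df_new = S3Dataframe(s3_resource=s3, bucket_name=bucket_name, key=key)
#   s3df_new.read_csv(gzip_compressed=is_gzip_key(key))
```

The same check would work for the JSON keys, since it only looks at the key name.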
JSON IO is similar:
s3df = S3Dataframe(s3_resource=s3, bucket_name=bucket_name)
s3df.df = pd.DataFrame(...)
s3df.to_json(key="data.json.gz", gzip_compressed=True)
s3df_new = S3Dataframe(s3_resource=s3, bucket_name=bucket_name, key="data.json.gz")
s3df_new.read_json(gzip_compressed=True)
s3df_new.df # access data
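To make the gzip_compressed option more concrete, the sketch below shows the general idea of a gzip round trip on serialized text, using only the Python standard library. It is not the s3iotools implementation, which may work differently. The function names rows_to_gzip_csv and gzip_csv_to_rows are made up for this illustration, and plain lists stand in for a dataframe.

```python
import csv
import gzip
import io


def rows_to_gzip_csv(rows):
    """Serialize rows (a list of lists) to CSV text, then gzip the bytes."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


def gzip_csv_to_rows(payload):
    """Reverse of rows_to_gzip_csv: gunzip the bytes, then parse the CSV."""
    text = gzip.decompress(payload).decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))


# Illustrative data only; every CSV field comes back as a string.
rows = [["id", "name"], ["1", "alice"], ["2", "bob"]]
payload = rows_to_gzip_csv(rows)    # bytes that could be uploaded as "data.csv.gz"
restored = gzip_csv_to_rows(payload)
```

Reading such an object back without decompressing it first would hand raw gzip bytes to the CSV or JSON parser, which is why the read call needs the same flag as the write call.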
Parquet is a columnar storage format that is very efficient for OLAP queries. You can put Parquet files on S3 and then query them with AWS Athena. Parquet IO in s3iotools is straightforward:
s3df = S3Dataframe(s3_resource=s3, bucket_name=bucket_name)
s3df.df = pd.DataFrame(...)
s3df.to_parquet(key="data.parquet", compression="gzip")
s3df_new = S3Dataframe(s3_resource=s3, bucket_name=bucket_name, key="data.parquet")
s3df_new.read_parquet()
s3df_new.df # access data
s3iotools doesn’t automatically install pyarrow; you can install it with pip install pyarrow.
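Because pyarrow is not pulled in automatically, it can help to check for it before calling the Parquet methods. The snippet below is a small sketch using the standard library; has_module is a hypothetical helper name, not something s3iotools provides.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


# Warn early instead of failing in the middle of a Parquet read or write.
if not has_module("pyarrow"):
    print("pyarrow is not installed; run: pip install pyarrow")
```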
Install
s3iotools is released on PyPI, so all you need is:
$ pip install s3iotools
To upgrade to the latest version:
$ pip install --upgrade s3iotools