Python Pachyderm Client
Pachyderm's Python SDK
Official Python client/SDK for Pachyderm, and the successor to https://github.com/pachyderm/python-pachyderm.
This library provides the autogenerated gRPC/protobuf code for Pachyderm, generated using a fork of the betterproto package, along with higher-level functionality.
Installation
pip install pachyderm_sdk
A Small Taste
Here's an example that creates a repo and adds a file:
from pachyderm_sdk import Client
from pachyderm_sdk.api import pfs
# Connects to a pachyderm cluster using your local config
# at ~/.pachyderm/config.json
client = Client.from_config()
# Creates a pachyderm repo called `test`
repo = pfs.Repo(name="test")
client.pfs.create_repo(repo=repo)
# Create a new commit in `test@master` and upload a file.
branch = pfs.Branch.from_uri("test@master")
with client.pfs.commit(branch=branch) as commit:
    file = commit.put_file_from_bytes(path="/data/file.dat", data=b"DATA")
# Retrieve the uploaded file.
with client.pfs.pfs_file(file) as f:
    print(f.readall())
How to load a CSV file into a pandas DataFrame
from pachyderm_sdk import Client
from pachyderm_sdk.api import pfs
import pandas as pd
client = Client.from_config()
file = pfs.File.from_uri("test@master:/path/to/data.csv")
with client.pfs.pfs_file(file) as f:
    df = pd.read_csv(f)
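The strings passed to from_uri in the examples above pack a repo, a branch, and optionally a file path into one value. As a rough illustration of the two shapes used in this README ("test@master" and "test@master:/path/to/data.csv"), here is a standard-library sketch that splits them apart. ParsedUri and parse_uri are hypothetical names made up for this example; this is not the SDK's parser, which may accept additional forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedUri:
    """Hypothetical container for the pieces of a resource URI."""
    repo: str
    branch: str
    path: Optional[str] = None


def parse_uri(uri: str) -> ParsedUri:
    """Split 'repo@branch' or 'repo@branch:/path' into its parts.

    Illustration only: covers just the two shapes shown in this README.
    """
    # Everything before the first '@' is treated as the repo name.
    repo, _, rest = uri.partition("@")
    # An optional ':' separates the branch from the file path.
    branch, _, path = rest.partition(":")
    if not repo or not branch:
        raise ValueError(f"expected 'repo@branch[:/path]', got {uri!r}")
    # An absent or empty path is reported as None.
    return ParsedUri(repo=repo, branch=branch, path=path or None)


print(parse_uri("test@master"))
print(parse_uri("test@master:/path/to/data.csv"))
```

In real code, prefer pfs.Branch.from_uri and pfs.File.from_uri as shown above; the sketch only makes the string layout visible.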
Changes from Python-Pachyderm
This package is a successor to the python-pachyderm package. Listed below are some of the notable changes:
- Organization of the API
- Methods and Message objects are now organized according to the service they are associated with, e.g. auth, pfs (Pachyderm File System), pps (Pachyderm Pipeline System).
- Message objects can be found within their respective submodule of the pachyderm_sdk.api module, e.g. pachyderm_sdk.api.pfs.
- Methods can be found within their respective attribute of the Client class, e.g. client.pps.create_pipeline.
- Some methods have been renamed to remove the redundancy this organization would otherwise create, e.g. python_pachyderm.Client.get_enterprise_state -> pachyderm_sdk.Client.enterprise.get_state.
- The autogenerated code is generated using a fork of the betterproto compiler.
- Messages are now Python dataclasses.
- Methods require keyword arguments.
- Pachyderm resources are specified using types.
- python-pachyderm (old): client.create_repo("test")
- pachyderm_sdk (new): client.pfs.create_repo(repo=pfs.Repo(name="test"))
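To make the calling convention in the list above concrete, the following standard-library sketch mimics it: a message defined as a dataclass, and a service method that accepts keyword arguments only. Repo, FakePfs, and its create_repo method are stand-ins written for this example; they do not reproduce the SDK's generated code and do not talk to a cluster.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Repo:
    """Stand-in for a message type; the real one is generated by the SDK."""
    name: str = ""


@dataclass
class FakePfs:
    """Stand-in service that records created repos in memory."""
    repos: Dict[str, Repo] = field(default_factory=dict)

    def create_repo(self, *, repo: Repo) -> None:
        # The bare `*` makes `repo` keyword-only, so positional calls fail.
        self.repos[repo.name] = repo


service = FakePfs()

# New style: the resource is a typed message passed by keyword.
service.create_repo(repo=Repo(name="test"))

# Old style: a positional string is rejected with a TypeError.
try:
    service.create_repo("test")
except TypeError as err:
    print(f"rejected: {err}")
```

Because messages are dataclasses, they also get field-based equality and a readable repr, which can help when comparing or logging them.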
Contributing
Please see the contributing guide for more information, including testing instructions.
Developer Guide
Generate python APIs from protobuf:
./generate-protos.sh
Generate HTML documentation (writes to docs/pachyderm_sdk):
make docs
Running Tests:
pytest -vvv tests
Built Distribution
Hashes for pachyderm_sdk-2.10.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 16fd612f90c8b894ac469319526dbe11b37ab01e4e7150c4c9473484fceae1a4
MD5 | 1521c3a08980bed48b979f36ced1ce25
BLAKE2b-256 | daf68377639091f3b8a1c314ba9b8c7006cd0cbb2db8585b6a80565bee536c86