
chkpt

A tiny pipeline builder

What

chkpt is a zero-dependency, 100-line library that makes it easy to define and execute checkpointed pipelines.

It features...

  • Fluent pipeline construction
  • Transparent caching of expensive operations
  • JSON serialization

How

Defining a Stage

Stages are the atomic units of work in chkpt and correspond to single Python functions. An existing function needs only the @chkpt.Stage.wrap() decorator to become a Stage:

@chkpt.Stage.wrap()
def stage1():
  return "123"

# stage1 is now a Stage instance
assert isinstance(stage1, chkpt.Stage)

# but the original function is still accessible
assert stage1.func() == "123"
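The wrapping pattern itself is ordinary Python. As a rough illustration (a simplified sketch, not chkpt's actual source), a decorator factory can return a class instance that delegates calls and keeps the original function on a `.func` attribute:

```python
import functools


class MiniStage:
    """Simplified stand-in for chkpt.Stage: wraps a function, keeps it on .func."""

    def __init__(self, func):
        functools.update_wrapper(self, func)  # preserve name/docstring
        self.func = func

    def __call__(self, *args, **kwargs):
        # A real Stage would add checkpointing here; this sketch just delegates.
        return self.func(*args, **kwargs)


def wrap():
    def decorator(func):
        return MiniStage(func)
    return decorator


@wrap()
def stage1():
    return "123"

assert isinstance(stage1, MiniStage)
assert stage1.func() == "123"
```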

Stages can also accept parameters to be provided by other Stages in the final Pipeline:

@chkpt.Stage.wrap()
def stage2(stage1_input):
  return [stage1_input, "456"]

Defining a Pipeline

Pipelines define the execution graph of Stages to be run. Stages are combined with the shift operators (<< and >>) to direct the dataflow:

# Each defines a pipeline calculating `stage1` and passing its output to `stage2`.
pipeline = stage1 >> stage2
pipeline = stage2 << stage1
pipeline = stage2 << (stage1,)
pipeline = (stage1,) >> stage2
pipeline = () >> stage1 >> stage2 
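The operator syntax above is standard Python operator overloading. A minimal sketch of the assumed mechanics (illustrative only, not chkpt's implementation) shows how `>>`, `<<`, and the tuple forms can all build the same dependency edge, with the tuple cases handled by the reflected operator:

```python
class Node:
    """Minimal stage-like node supporting >> and << chaining."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = tuple(inputs)

    def __rshift__(self, other):   # self >> other: self feeds other
        return Node(other.name, (self,))

    def __lshift__(self, other):   # self << other: other (or a tuple) feeds self
        upstream = other if isinstance(other, tuple) else (other,)
        return Node(self.name, upstream)

    def __rrshift__(self, other):  # (a, b) >> self: tuples lack __rshift__,
        upstream = other if isinstance(other, tuple) else (other,)
        return Node(self.name, upstream)  # so Python falls back to this


s1, s2 = Node("stage1"), Node("stage2")

# Each of these yields a stage2 node fed by stage1:
assert (s1 >> s2).inputs[0].name == "stage1"
assert (s2 << s1).inputs[0].name == "stage1"
assert ((s1,) >> s2).inputs[0].name == "stage1"
```

Because tuples define no `__rshift__` for custom objects, `(stage1,) >> stage2` dispatches to the right-hand operand's reflected method, which is what makes the multi-input forms work.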

More complex pipelines are best built up from their leaf Stages:

result1 = (stage1, stage2) >> stage3
result2 = (result1, stage1) >> stage4
pipeline = result2 >> stage5

Executing a Pipeline

Pipelines can be executed directly, which uses the default Config settings:

result = pipeline()

The defaults can be configured by passing a Config instance:

# Will store all stage results and attempt to load already-stored results, if present.
result = pipeline(chkpt.Config(store=True, load=True, dir='/tmp'))
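Under the hood, store/load checkpointing reduces to a check-then-compute-then-save cycle against JSON files. A hedged sketch of that pattern (illustrative only; chkpt's actual file layout and naming are assumptions here):

```python
import json
import os
import tempfile


def run_checkpointed(func, name, dir, store=True, load=True):
    """Run func, loading a prior JSON result if present and storing a new one."""
    path = os.path.join(dir, f"{name}.json")
    if load and os.path.exists(path):
        with open(path) as f:
            return json.load(f)          # cache hit: skip the computation
    result = func()
    if store:
        with open(path, "w") as f:
            json.dump(result, f)         # cache the JSON-serializable result
    return result


calls = []

def expensive():
    calls.append(1)
    return [1, 2, 3]

with tempfile.TemporaryDirectory() as d:
    first = run_checkpointed(expensive, "expensive", d)
    second = run_checkpointed(expensive, "expensive", d)

assert first == second == [1, 2, 3]
assert len(calls) == 1  # the second run loaded from disk
```

JSON serialization is what keeps the cached results portable and human-inspectable, at the cost of restricting stage outputs to JSON-serializable values.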

Examples

For detailed usage, see the examples/ directory.

The following is a brief example pipeline:

import chkpt


@chkpt.Stage.wrap()
def make_dataset1():
  ...

@chkpt.Stage.wrap()
def big_download2():
  ...

@chkpt.Stage.wrap()
def work_in_progress_analysis(dataset1, dataset2):
  ...

pipeline = (make_dataset1, big_download2) >> work_in_progress_analysis
# Work-intensive inputs only run once, caching on reruns.
result = pipeline(chkpt.Config(load=[make_dataset1, big_download2]))

