Checkpoint

ediblepickle is an Apache 2.0 licensed checkpointing utility. The simplest use case is to checkpoint an expensive computation so that it need not be repeated every time the program runs:

import string
import time
from ediblepickle import checkpoint

# A checkpointed expensive function. refresh=False reuses an existing checkpoint;
# refresh=True would recompute on every call.
@checkpoint(key=string.Template('m{0}_n{1}_${iterations}_$stride.csv'), work_dir='/tmp/intermediate_results', refresh=False)
def expensive_computation(m, n, iterations=4, stride=1):
    for i in range(iterations):
        time.sleep(1)
    return range(m, n, stride)

# First call, evaluates the function and saves the results
begin = time.time()
expensive_computation(-100, 200, iterations=4, stride=2)
time_taken = time.time() - begin

print(time_taken)

# Second call, since the checkpoint exists, the result is loaded from that file and returned.
begin = time.time()
expensive_computation(-100, 200, iterations=4, stride=2)
time_taken = time.time() - begin

print(time_taken)
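
The key in the example above mixes two placeholder styles. The following is a minimal sketch of how such a key could resolve to a file name. It assumes that $-style placeholders are filled from the keyword arguments and {0}, {1} from the positional arguments. That reading is inferred from the example and is not confirmed here as ediblepickle's actual behaviour. Only the standard library is used:

```python
import string

key = string.Template('m{0}_n{1}_${iterations}_$stride.csv')

# Assumed step 1: $-style placeholders come from the keyword arguments.
partial = key.substitute(iterations=4, stride=2)

# Assumed step 2: {N}-style placeholders come from the positional arguments.
filename = partial.format(-100, 200)

print(filename)  # m-100_n200_4_2.csv
```

Under this reading, calls with different arguments map to different checkpoint files, so each combination of inputs is cached separately.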

Features

  • Generic Decorator API

  • Checkpoint expensive functions to avoid having to re-compute while developing

  • Configurable computation cache storage format (e.g., use human-friendly keys and data instead of binary pickle data)

  • Set refresh=True to discard the cached result and recompute

  • Specify your own serialize/de-serialize (save/load) functions

  • Python logging: configure your own logger to see its messages
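
To make the checkpoint and refresh behaviour in the list above concrete, here is a minimal sketch of the general technique. It is not ediblepickle's implementation: simple_checkpoint, its path argument, and the squares function are hypothetical names used only for illustration, and the sketch caches a single result per file regardless of the arguments.

```python
import os
import pickle
import tempfile
from functools import wraps

def simple_checkpoint(path, refresh=False):
    """Hypothetical stand-in (not ediblepickle): cache one result at `path`."""
    def decorator(func):
        @wraps(func)
        def wrapped(*args, **kwargs):
            # Reuse the saved result unless a refresh was requested.
            if not refresh and os.path.exists(path):
                with open(path, "rb") as f:
                    return pickle.load(f)
            # Otherwise compute, save, and return.
            result = func(*args, **kwargs)
            with open(path, "wb") as f:
                pickle.dump(result, f)
            return result
        return wrapped
    return decorator

calls = []
cache_file = os.path.join(tempfile.mkdtemp(), "squares.pkl")

@simple_checkpoint(cache_file)
def squares(n):
    calls.append(n)  # records every real evaluation
    return [i * i for i in range(n)]

first = squares(5)   # computed and written to cache_file
second = squares(5)  # loaded from cache_file
print(first == second, len(calls))  # True 1
```

Because this sketch uses one fixed path, a call with different arguments would return the stale result. A key derived from the arguments avoids that problem.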

Installation

To install ediblepickle, simply:

$ pip install ediblepickle

Or:

$ easy_install ediblepickle

Examples

Another useful feature is the ability to define your own serializers and deserializers so that the checkpoint files are human readable. For instance, you can use numpy/scipy utilities to save matrices, or write CSV files that are easy to inspect while debugging:

import string
import time
from ediblepickle import checkpoint

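def my_pickler(integers, f):
    # Write one integer per line so the checkpoint file is human readable.
    print(integers)
    for i in integers:
        f.write(str(i))
        f.write('\n')

def my_unpickler(f):
    # Parse the lines back into integers, skipping the empty trailing entry.
    return [int(line) for line in f.read().split('\n') if line]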

@checkpoint(key=string.Template('m{0}_n{1}_${iterations}_$stride.csv'),
            pickler=my_pickler,
            unpickler=my_unpickler,
            refresh=False)
def expensive_computation(m, n, iterations=4, stride=1):
    for i in range(iterations):
        time.sleep(1)
    return range(m, n, stride)

begin = time.time()
print(expensive_computation(-100, 200, iterations=4, stride=2))
time_taken = time.time() - begin

print(time_taken)

begin = time.time()
print(expensive_computation(-100, 200, iterations=4, stride=2))
time_taken = time.time() - begin

print(time_taken)
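
A custom pickler/unpickler pair is easiest to trust when it round-trips, that is, when loading returns what was saved. The sketch below checks this for a one-integer-per-line format using an in-memory text buffer instead of a real checkpoint file. The names save_lines and load_lines are hypothetical, and the sketch assumes the functions receive a text-mode file object:

```python
import io

def save_lines(integers, f):
    # One integer per line keeps the file readable in any text editor.
    for i in integers:
        f.write("%d\n" % i)

def load_lines(f):
    # Convert each non-empty line back to an int.
    return [int(line) for line in f.read().splitlines() if line]

buf = io.StringIO()
save_lines(range(-4, 4, 2), buf)
buf.seek(0)  # rewind before reading back
restored = load_lines(buf)

print(restored)  # [-4, -2, 0, 2]
```

Checking the pair in isolation like this catches mismatches, such as integers coming back as strings, before they are hidden behind a cache.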

Contribute

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.

  2. Fork the repository on GitHub to start making your changes to the master branch (or branch off of it).

  3. Write a test which shows that the bug was fixed or that the feature works as expected.

  4. Send a pull request and bug the maintainer until it gets merged and published. :) Make sure to add yourself to AUTHORS.


Source distribution: ediblepickle-1.0.1.tar.gz (5.1 kB)
