batchpy

A package to efficiently run batches of similar calculations

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

A Python package to run large or small batches of similar calculations and store the results, so that no calculation is performed twice.

The example below should explain the workflow.

Installation

requires:

numpy

To install run:

pip install batchpy

Example

First import batchpy:

import batchpy

Create a run class by subclassing batchpy.Run. Its run method performs the required calculations when called and returns a result dictionary. Pass all parameters as named (keyword) parameters, so that each one has a default value and its name is available:

class example_run(batchpy.Run):
    """
    An example run class
    """
    def run(self, A=0, B=[1, 2, 3], operator=max):
        """
        An example computation function
        """
        # show the parameters used for this run
        print(self.parameters)

        # multiply A by the result of applying the operator to B
        res = {'val': self.parameters['A']*self.parameters['operator'](self.parameters['B'])}

        return res
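To picture why named parameters with defaults are useful, the sketch below reads the defaults from a function signature by introspection and merges them with the values supplied for one run. It uses only the standard library. The helper resolve_parameters and the stand-in function run are hypothetical names for illustration; they are not part of batchpy, whose own mechanism may differ.

```python
import inspect


def resolve_parameters(func, supplied):
    """Merge the keyword defaults of `func` with explicitly supplied values.

    Hypothetical helper for illustration only; not part of batchpy.
    """
    defaults = {
        name: p.default
        for name, p in inspect.signature(func).parameters.items()
        if p.default is not inspect.Parameter.empty
    }
    unknown = set(supplied) - set(defaults)
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    return {**defaults, **supplied}


def run(A=0, B=(1, 2, 3), operator=max):
    """Stand-in with the same parameters as example_run.run (without self)."""
    return {'val': A * operator(B)}


# 'operator' is not supplied, so its default (max) is filled in
params = resolve_parameters(run, {'A': 10, 'B': [3, 2, 4, 3, 8]})
print(sorted(params))        # ['A', 'B', 'operator']
print(run(**params)['val'])  # 80
```

Because every parameter has a name and a default, a run can be described by a small dictionary that lists only the values that differ from the defaults.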

Now define a batch using batchpy.Batch and give it a name. The name is used to identify the result files.

batch = batchpy.Batch('my_batch')

Result files are saved to and retrieved from a subdirectory “_res” of the base path. If this directory doesn’t exist, it will be created.

Next we can add runs to our batch. This can be done one run at a time:

batch.add_run( example_run,{'A':10,'B':[3,2,4,3,8]})

Or from a full factorial design:

batch.add_factorial_runs( example_run,
                     {'A': [1,2,3,4,5],
                      'B': [[2,5,8],[1,9,6,3,9],[6,4,0,9,4,1]],
                      'operator': [min,max,sum,len]})
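As a rough picture of what a full factorial design means, the sketch below expands the same dictionary into one parameter set per combination of the listed values, using itertools.product from the standard library. This assumes the usual meaning of "full factorial"; the order in which batchpy itself creates the runs is not stated above and may differ.

```python
import itertools

design = {
    'A': [1, 2, 3, 4, 5],
    'B': [[2, 5, 8], [1, 9, 6, 3, 9], [6, 4, 0, 9, 4, 1]],
    'operator': [min, max, sum, len],
}

names = list(design)

# one dictionary of parameters per combination of the listed values
combinations = [
    dict(zip(names, values))
    for values in itertools.product(*(design[name] for name in names))
]

print(len(combinations))  # 5 * 3 * 4 = 60
first = combinations[0]
print(first['A'], first['B'], first['operator'].__name__)  # 1 [2, 5, 8] min
```

Each entry has the same shape as the parameter dictionary passed for a single run, so the design above describes 60 runs.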

All calculations can be executed by calling the batch:

batch()

Results are retrieved by loading them. This is required because they are not kept in memory, which allows large batches to be run:

res = batch.run[0].load()

Results are stored in the _res folder in .npy format.

When a file containing a batch definition is rerun, the calculations that have already run (those whose ids are present in the saved file) will not be rerun. This makes the class useful for runs with long computation times. We can for instance extend the batch with an additional run:

batch.add_run( example_run,{'A':8,'operator':min})

Using the attribute done, we can check which runs are done and which need to be executed:

print([run.done for run in batch.run])

Calling the batch again will execute only those runs which have not been run yet:

batch()

Try closing and restarting Python, then rerun the above code. You will notice that no new calculations are performed; all results are loaded from the previously saved file. You can also try changing one parameter in a run definition; then only the changed runs will be rerun.
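The behaviour described here can be sketched with a small standalone example: derive an id from the parameters of a run, store the result under that id, and skip the computation when a stored result already exists. This is a minimal sketch of the general idea and not batchpy's implementation. The functions run_id, run_once and compute are hypothetical, and the sketch stores results with pickle in a temporary directory rather than as .npy files.

```python
import hashlib
import os
import pickle
import tempfile


def run_id(parameters):
    """Derive a stable id from parameter names and values (illustrative scheme)."""
    # functions are represented by their name so the id is the same in every session
    items = sorted((k, getattr(v, '__name__', v)) for k, v in parameters.items())
    return hashlib.sha1(repr(items).encode()).hexdigest()


def run_once(parameters, compute, res_dir):
    """Return (result, computed); compute only if no stored result exists."""
    path = os.path.join(res_dir, run_id(parameters) + '.pkl')
    if os.path.exists(path):
        with open(path, 'rb') as f:
            return pickle.load(f), False
    result = compute(**parameters)
    os.makedirs(res_dir, exist_ok=True)
    with open(path, 'wb') as f:
        pickle.dump(result, f)
    return result, True


def compute(A=0, B=(1, 2, 3), operator=max):
    """Stand-in for the calculation of a single run."""
    return {'val': A * operator(B)}


res_dir = os.path.join(tempfile.mkdtemp(), '_res')
params = {'A': 8, 'B': [1, 2, 3], 'operator': min}

first, computed_first = run_once(params, compute, res_dir)
second, computed_second = run_once(params, compute, res_dir)
changed, computed_changed = run_once({**params, 'A': 9}, compute, res_dir)

print(computed_first, computed_second, computed_changed)  # True False True
```

The second call finds the stored result and does not recompute. Changing A produces a different id, so only that run is executed again.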

Release history

1.3.0

Oct 13, 2017

1.2.2

Oct 8, 2017

1.2.1

Oct 8, 2017

1.2.0

Oct 7, 2017

1.1.4

Oct 3, 2017

1.1.3

Oct 3, 2017

1.1.2

Sep 18, 2017

1.1.1

Sep 18, 2017

1.1.0

Feb 22, 2017

1.0.1

Feb 20, 2017

This version

1.0.0

Feb 20, 2017

0.0.1

Feb 20, 2017

Download files

Source Distribution

batchpy-1.0.0.zip (25.8 kB)

Uploaded Feb 20, 2017 Source

Hashes for batchpy-1.0.0.zip
Algorithm	Hash digest
SHA256	`da5fdcdd626e036f80b279ee513130337db7dd9b55028ce0e9df9833a7d36eef`
MD5	`e418f320c72ffe2729d9c63b18aa8a6b`
BLAKE2b-256	`74634721890dc1c0f412b03d4278cdcfccbaaefc86a402d0932ff53a6cad9214`