An ORM for hdf5 using h5py for basic IO

Project description

Introduction

H5pom is an object mapper that keeps a Python object hierarchy synchronized with an HDF5 data file. Think of it as an ORM for HDF5-based storage. Under the hood, H5pom uses h5py to communicate with the data store, but this dependency is intended to be entirely abstracted away by the object mapper layer.

The requirements for saving and restoring objects with H5pom are modest and in keeping with those of other ORMs such as SQLAlchemy and Django. You only need to derive your domain-specific classes from h5pom.Object and declare their attributes as class attributes. Here is an example:

>>> import h5pom
>>> class Person(h5pom.Object):
...     name = h5pom.Scalar()
...     age = h5pom.Scalar()
...
>>> f = h5pom.open(h5pom.IN_MEMORY, [Person])
>>> p = Person(name='Joe', age=25)
>>> f['joe'] = p
>>> assert f['joe'].name == 'Joe'
>>> p.age = 23
>>> assert f['joe'].age == 23
>>> f.close()

No file is created on disk in that example because h5pom.IN_MEMORY is passed as the file name, but the code is otherwise complete and representative of saving Person objects to an HDF5 file.
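The write-through behaviour in the example (assigning p.age and then reading the new value back through f['joe']) can be pictured with a small descriptor sketch. This is a hypothetical illustration, not H5pom's implementation: the Scalar and Object classes below are stand-ins written for this sketch, a plain dict named _store (an assumption) plays the role of the HDF5 group, and __set_name__ requires Python 3.6 or later.

```python
# Hypothetical sketch, NOT H5pom's implementation: a descriptor that writes
# every attribute assignment through to a backing dict. The dict "_store"
# stands in for the HDF5 group that would hold the values.

class Scalar(object):
    def __set_name__(self, owner, name):
        # Called once per class attribute (Python 3.6+); remember the name.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return instance._store[self.name]

    def __set__(self, instance, value):
        # Every assignment lands in the backing store immediately.
        instance._store[self.name] = value


class Object(object):
    def __init__(self, **values):
        self._store = {}
        for name, value in values.items():
            setattr(self, name, value)


class Person(Object):
    name = Scalar()
    age = Scalar()


p = Person(name="Joe", age=25)
p.age = 23  # the backing store now holds 23 as well
```

Because reads and writes both go through the store, any other handle onto the same store sees the updated value, which is the effect the example above demonstrates.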

Goals of H5pom

H5pom aims to make it easy to store and synchronize Python object graphs, which may or may not contain numeric array data, in an HDF5 file. For arrays with a known NumPy dtype, H5pom aims to expose the full performance of HDF5 and h5py as appropriate.

H5pom does not aim for the full generality of reading arbitrary HDF5 files. The h5py project already has an excellent Pythonic API for accessing groups, subgroups and attributes. Instead, H5pom emphasizes a schema: a list of the class types that may appear in the HDF5 file, with the attributes of those objects known at the time the Python model is written.
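One way to picture a schema that is simply a list of classes is a lookup table from class name to class, consulted when an object is restored. The sketch below is an assumption-laden illustration: the Schema class, its restore method, and the "type_name" marker are hypothetical names invented here, and the source does not say how H5pom records which class a stored object belongs to.

```python
# Hypothetical sketch, NOT H5pom API: a schema built from a list of classes,
# used to look up the right class by name when restoring stored data.

class Schema(object):
    def __init__(self, classes):
        # Index the permitted classes by their names.
        self._by_name = dict((cls.__name__, cls) for cls in classes)

    def restore(self, stored):
        """Rebuild an object from a dict with 'type_name' and 'attrs' keys."""
        type_name = stored["type_name"]
        if type_name not in self._by_name:
            raise KeyError("class %r is not part of the schema" % type_name)
        cls = self._by_name[type_name]
        obj = cls.__new__(cls)  # create the instance without calling __init__
        obj.__dict__.update(stored["attrs"])
        return obj


class Person(object):
    pass


schema = Schema([Person])
joe = schema.restore({"type_name": "Person",
                      "attrs": {"name": "Joe", "age": 25}})
```

Restricting restoration to a known list of classes means a file containing an unexpected type fails loudly instead of producing an object of the wrong kind.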

Object Mapping Basics

H5pom relies heavily on the HDF5 notions of groups and attributes. The following general correspondences illustrate the design:

  • Python objects <-> groups

  • scalar attributes of Python objects <-> group attributes

  • object references of Python objects <-> subgroups within the parent object group

  • array-typed attributes of Python objects <-> datasets within the parent object group

  • Python lists as attributes of Python objects <-> families of attributes with prefixed names within the parent object group
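The correspondences above can be sketched with nested dicts standing in for HDF5 groups. This is a toy model rather than H5pom code: the to_group function, the Address and Person classes, and the "<name>_<index>" naming for list elements are all assumptions made for illustration, and array.array from the standard library stands in for a numeric array type.

```python
# Toy illustration only: nested dicts stand in for HDF5 groups. None of these
# names come from H5pom, and the "<name>_<index>" prefix scheme for list
# elements is an assumption made for this sketch.
from array import array


def to_group(obj):
    """Lay out a plain Python object as a dict shaped like an HDF5 group."""
    group = {"attrs": {}, "groups": {}, "datasets": {}}
    for name, value in vars(obj).items():
        if isinstance(value, (str, int, float, bool)):
            # scalar attribute -> group attribute
            group["attrs"][name] = value
        elif isinstance(value, list):
            # list -> family of attributes sharing a name prefix
            for index, item in enumerate(value):
                group["attrs"]["%s_%d" % (name, index)] = item
        elif isinstance(value, array):
            # array -> dataset within the parent group
            group["datasets"][name] = value
        else:
            # object reference -> subgroup within the parent group
            group["groups"][name] = to_group(value)
    return group


class Address(object):
    def __init__(self, city):
        self.city = city


class Person(object):
    def __init__(self, name, nicknames, samples, address):
        self.name = name
        self.nicknames = nicknames
        self.samples = samples
        self.address = address


layout = to_group(Person("Joe", ["JJ", "Joey"],
                         array("d", [1.0, 2.0]), Address("Oslo")))
```

In this layout, layout["attrs"] holds name, nicknames_0 and nicknames_1, layout["datasets"] holds samples, and layout["groups"]["address"] is itself a group with its own attributes.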

TODO

  • SubObjectDict saves dictionaries mapping alphanumeric names to Objects, but there are no persistent dictionaries mapping alphanumeric names to Scalars. Consider adding this.

Project details


This version

0.1

Download files

Source Distribution

h5pom-0.1.tar.gz (9.7 kB)

Built Distribution

h5pom-0.1-py2.7.egg (20.6 kB)