Skip to main content

Worm tracker Commons Object Notation

Project description

See https://github.com/openworm/tracker-commons
# Python implementation of Tracker Commons

This project contains source code in Python that implements the WCON data format for tracking data.

*When the Python specification and the behavior of the Scala implementation differ, and it is not obvious which one is correct, the behavior of the Scala implementation should be presumed to be authoritative.*

### Installation

`wcon` is [registered with the Python Package Index](https://pypi.python.org/pypi/wcon/), so just enter this command from any shell:

```
pip install wcon
```

Any problems? Visit the more detailed [installation guide](INSTALL.md).

### Usage

```
import wcon

# Open a worm file, convert it to canonical form, and save it
# (actually it's automatically converted to canon before
# saving, but here we do so explicitly)
w = wcon.WCONWorms.load_from_file('tests/minimax.wcon')
canon_w = w.to_canon
canon_w.save_to_file('test.wcon', pretty_print=True)

# From a string literal:
from io import StringIO
w2 = WCONWorms.load(StringIO('{"units":{"t":"s","x":"mm","y":"mm"}, '
'"data":[]}'))

# WCONWorms.load_from_file accepts any valid WCON, but .save_to_file
# output is always "canonical" WCON, which makes specific choices about
# how to arrange and format the WCON file. This way the functional
# equality of any two WCON files can be tested by this:

w1 = WCONWorms.load_from_file('file1.wcon')
w2 = WCONWorms.load_from_file('file2.wcon')

assert(w1 == w2)

# or:

w1.save_to_file('file1.wcon')
w2.save_to_file('file2.wcon')

import filecmp
assert(filecmp.cmp('file1.wcon', file2.wcon'))

w3 = w1 + w2 # Merge the two. An exception is raised if the data clashes
```

### Formal API

- Class `WCONWorms`
- methods
- `load_from_file`
- [class method]
- parameters:
- `JSON_path`, a `str`, the path of the file
- `save_to_file`
- parameters:
- `JSON_path`, a `str`, the path of the file
- `compress_file`, a boolean, whether to compress the file
- `pretty_print`, a boolean, whether to render the output on multiple lines
- `to_canon`
- [property]
- returns: a copy of this object but in canonical form
- `__add__`
- [use `+`]
- Merges WCONWorms objects together. If the worm IDs or time periods are disjoint, or if the data agrees, this method works. If not, an exception is thrown.
- `__eq__`
- [use `==`]
- Return a boolean indicating whether two WCONWorms objects are the same, after conversion of all quantities to canonical units. Compares both data and metadata.
- attributes
- `units`: dict
- May be empty, but is never None since 'units' is required
to be specified.
- `metadata`: dict
- If 'metadata' was not specified, metadata is None.
- The values in this dict might be nested into further dicts or other
data types.
- `data`: Pandas DataFrame or None
- If 'data' was not specified, data is None.
- All the worms are merged into one DataFrame
- Expensive and memory-intensive for objects containing many worms, so try to avoid using this. Use `data_as_odict` instead.
- [Note: `files`, if present in the input, is not persisted unless the `.load` factory method is used.]
- `num_worms`: int
- `worm_ids`: list
- `data_as_odict`: OrderedDict of pandas DataFrames, keyed by worm ID
- This is the native representation of the data in this object, and thus the fastest to load. Try to use this instead of `data`.
- Class `MeasurementUnit`
- Note: this class does not need to be used publicly, but it can be if desired.
- consequently it can be ommitted from a public API
- methods
- `create`
- [class method]
- Factory method
- parameter: `unit_string`, a `str`, the unit expression (e.g. `"mm"` or `"cm/s"` or `"C"`)
- returns: an instance of this class
- `to_canon`
- transforms `v` from original units to canonical units
- parameter: `v` (a `float`)
- returns: float
- `from_canon`
- the inverse of `to_canon`
- attributes
- `unit_string`: str
- The original string (e.g. `"m/s^2"`)
- `canonical_unit_string`: str
- The canonical form for all units within the original string (e.g. `"mm/s^2"`)

### Custom WCON objects

Any top-level key other than the basic:

- files
- units
- metadata
- data

...are ignored. It is convenient, but not required, to follow the convention of beginning custom fields with the prefix `"@"`. Handling custom objects requires subclassing `WCONWorms`.


### WCON parser: proof of concept

Thanks to the Python libraries `json` and `jsonschema`, it is relatively trivial to parse and validate a WCON file. Here's an example of how one might accomplish this, without even using the `wcon` package:

import json, jsonschema

# The WCON schema
with open("wcon/wcon_schema.json", "r") as wcon_schema_file:
schema = json.loads(wcon_schema_file.read())

# Our example WCON file
JSON_path = '../../tests/minimax.wcon'
with open(JSON_path, 'r') as infile:
serialized_data = infile.read()

# Load the whole JSON file into a nested dict.
w = json.loads(serialized_data)

# Validate the raw file against the WCON schema
jsonschema.validate(w, schema)

With the above code we end up with a nested dictionary `w` containing everything that was serialized in the `minimax.wcon` file.

### The `wcon` Python package

Using this `wcon` Python package, something similar can be accomplished:

import wcon

w = wcon.WCONWorms.load_from_file('../../tests/minimax.wcon')

Here, instead of being a nested dictionary, `w` is a `WCONWorms` object that is more powerful. Here are some of the additional things that can be accomplished with the `WCONWorms` object:

- The WCON file is validated not just against the WCON schema, but also to ensure units are valid, that every data key has a corresponding unit, and that every data segment has "aspects" of the same length. (e.g. if a skeleton at time `1.3` has 45 `x`-coordinates, it should also have 45 `y`-coordinates. This condition is not expressible in a JSON schema but it is validated programatically by the WCONWorms initializer.
- Units are expressed as `MeasurementUnit` objects, which can be compared with other such objects, to verify that "mm" and "millimetres" refer to the same units, for instance. (see the below section for more details)
- WCONWorms objects can have their data be converted into canonical units, and then saved again.
- WCONWorms objects can be loaded from multiple files and combined together, via the `"files"` object.
- Worm data recorded in multiple "tracks", or elements, in the `"data"` object, can have such tracks merged.
- Worm data can be extracted in a Pandas DataFrame format for easier downstream processing, since the dimensions of the data have been placed into one two-dimensional array, rather than in a nested array.
- WCONWorms can be subclassed by labs implementing "special features", in two places: ("type 1") top-level objects starting with `"@"` or ("type 2") objects within individual `"data"` array items starting with `"@"`.


### MeasurementUnit

The WCON format requires a `"units"` object, where you specify in what units your quantities are being measured. `WCONParser` represents these units internally as `MeasurementUnit` objects. With `MeasurementUnit`, you can convert from any supported unit expression to the canonical one:

>>> MeasurementUnit.create('m')
MeasurementUnit, original form: 'm' canonical form: 'mm'
>>> u = MeasurementUnit.create('m')
>>> u.to_canon(1)
1000.0
>>> u.from_canon(100)
0.1
>>> u = MeasurementUnit.create('F')
>>> u.to_canon(72)
22.222222222222221
>>> u = MeasurementUnit.create('m/min')
>>> u.canonical_unit_string
'mm/s'
>>> u.to_canon(5)
83.33333333333334
>>> u = MeasurementUnit.create('m^2')
>>> u.to_canon(1)
1000000.0

You can also check the equality of various unit expressions. For example, all of these expressions will evaluate to `True`:

MeasurementUnit.create('mm') == MeasurementUnit.create('millimetre')
MeasurementUnit.create('Mm') == MeasurementUnit.create('megametre')
MeasurementUnit.create('mm') != MeasurementUnit.create('Mm')

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wcon-1.1.0.tar.gz (29.1 kB view hashes)

Uploaded Source

Built Distribution

wcon-1.1.0-py3-none-any.whl (34.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page