Skip to main content

Extended arrays for working with scientific datasets in Python

Project description

xray is a Python package for working with aligned sets of homogeneous, n-dimensional arrays. It implements flexible array operations and dataset manipulation for in-memory datasets within the Common Data Model widely used for self-describing scientific data (e.g., the NetCDF file format).

Why xray?

Adding dimensions names and coordinate values to numpy’s ndarray makes many powerful array operations possible:

  • Apply operations over dimensions by name: x.sum('time').

  • Select values by label instead of integer location: x.loc['2014-01-01'] or x.labeled(time='2014-01-01').

  • Mathematical operations (e.g., x - y) vectorize across multiple dimensions (known in numpy as “broadcasting”) based on dimension names, regardless of their original order.

  • Flexible split-apply-combine operations with groupby: x.groupby('time.dayofyear').mean().

  • Database like aligment based on coordinate labels that smoothly handles missing values: x, y = xray.align(x, y, join='outer').

  • Keep track of arbitrary metadata in the form of a Python dictionary: x.attrs.

xray aims to provide a data analysis toolkit as powerful as pandas but designed for working with homogeneous N-dimensional arrays instead of tabular data. Indeed, much of its design and internal functionality (in particular, fast indexing) is shamelessly borrowed from pandas.

Because xray implements the same data model as the NetCDF file format, xray datasets have a natural and portable serialization format. But it’s also easy to robustly convert an xray DataArray to and from a numpy ndarray or a pandas DataFrame or Series, providing compatibility with the full PyData ecosystem.

For more about xray, see the project’s GitHub page and documentation

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xray-0.1.0.tar.gz (66.1 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page