
A library for computing samplings in arbitrary dimensions

Project description

License: BSD 3-Clause. The code is formatted with black, and its imports are sorted with isort.

A Python library for generating space-filling sample sets of low- to moderate-dimensional data from domains including:

  • Euclidean space

  • Grassmannian atlas

  • Surface of an n-Sphere

samply is a collection of space-filling sampling designs for arbitrary dimensions. The API is structured so that each top-level package represents the shape of the domain you want to sample:

  • ball - The n-dimensional solid unit ball

  • directional - The space of unit-length directions in n-dimensional space. You can also think of this as sampling the boundary of the n-dimensional unit ball.

  • hypercube - The n-dimensional solid unit hypercube \(x \in [0,1]^n\).

  • subspace - Sampling an (n-1)-dimensional subspace orthogonal to a unit vector, or sampling the Grassmannian atlas of projections from dimension n to a lower dimension m.

  • shape - A collection of (n-1)-manifold and non-manifold shapes embedded in an n-dimensional space. For now, these can only be sampled using a uniform distribution.
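To make the difference between the ball and directional domains concrete, here is a minimal standard-library sketch of uniform sampling on each. It is an illustration only, not samply's implementation; the helper names `uniform_direction` and `uniform_ball` are hypothetical. It assumes the common approach of normalizing a Gaussian vector to get a direction, then scaling the radius by U^(1/n) to fill the solid ball uniformly.

```python
import math
import random


def uniform_direction(n, rng):
    """Return one unit vector drawn uniformly from the boundary of the n-ball."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:  # guard against the (vanishingly unlikely) zero vector
            return [x / norm for x in v]


def uniform_ball(n, rng):
    """Return one point drawn uniformly from the solid unit n-ball."""
    direction = uniform_direction(n, rng)
    # Scaling the radius by U**(1/n) keeps the density uniform in volume.
    radius = rng.random() ** (1.0 / n)
    return [radius * x for x in direction]


# Hypothetical helpers, not part of samply's API.
rng = random.Random(0)
directions = [uniform_direction(3, rng) for _ in range(100)]
ball_points = [uniform_ball(3, rng) for _ in range(100)]
```

Every entry of `directions` has unit length, while the entries of `ball_points` lie anywhere inside the unit ball.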

Each module provides one or more methods for filling its domain with samples. Not every method listed below is available in every module. The methods are:

  • Uniform - a random, uniform distribution of points (available for ball, directional, hypercube, subspace, and shape)

  • Normal - a Gaussian distribution of points (available for hypercube)

  • Multimodal - a mixture of Gaussian distributions of points (available for hypercube)

  • CVT - an approximate centroidal Voronoi tessellation of the points constrained to the given space (available for hypercube and directional)

  • LHS - a Latin hypercube sampling design of points constrained to the space (available for hypercube)
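As an illustration of what a Latin hypercube design guarantees, the following standard-library sketch builds one by hand. It is not samply's code, and the function name `latin_hypercube` is hypothetical. Each axis is cut into as many equal strata as there are points, and every stratum on every axis receives exactly one point.

```python
import random


def latin_hypercube(count, dimension, rng):
    """Return `count` points in [0, 1)^dimension, one per stratum per axis."""
    columns = []
    for _ in range(dimension):
        # One jittered value inside each of the `count` equal-width strata.
        strata = [(i + rng.random()) / count for i in range(count)]
        rng.shuffle(strata)  # decouple the stratum ordering across axes
        columns.append(strata)
    # Transpose from per-axis columns to per-point rows.
    return [list(point) for point in zip(*columns)]


# Hypothetical helper, not part of samply's API.
rng = random.Random(0)
points = latin_hypercube(5, 2, rng)
```

Projecting `points` onto either axis yields one value in each of the five intervals of width 0.2, which is the property that distinguishes LHS from plain uniform sampling.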

The Python CVT code is adapted from a C++ implementation provided by Carlos Correa. The Grassmannian sampler is adapted from code by Shusen Liu.
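For readers unfamiliar with centroidal Voronoi tessellations, here is a minimal sketch of one way to approximate a CVT on the unit hypercube. It uses MacQueen's probabilistic method, which may differ from the adapted C++ algorithm that samply actually uses; the function name `macqueen_cvt` and its parameters are hypothetical.

```python
import random


def macqueen_cvt(count, dimension, iterations, rng):
    """Approximate a CVT of [0, 1]^dimension with MacQueen's method."""
    generators = [[rng.random() for _ in range(dimension)] for _ in range(count)]
    hits = [1] * count  # the start position counts as each generator's first sample
    for _ in range(iterations):
        probe = [rng.random() for _ in range(dimension)]
        # Find the generator closest to the random probe point.
        nearest = min(
            range(count),
            key=lambda i: sum((g - p) ** 2 for g, p in zip(generators[i], probe)),
        )
        # Move that generator to the running mean of its start and the probes it won.
        weight = hits[nearest]
        generators[nearest] = [
            (weight * g + p) / (weight + 1) for g, p in zip(generators[nearest], probe)
        ]
        hits[nearest] += 1
    return generators


# Hypothetical helper, not part of samply's API.
rng = random.Random(0)
cvt_points = macqueen_cvt(8, 2, 2000, rng)
```

Each iteration scans every generator for the nearest one, so the cost grows with both the number of points and the number of iterations.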

Installation

A preliminary version is available on PyPI:

pip install samply

Alternatively, clone the repository to get the latest additions:

git clone https://github.com/maljovec/samply.git
cd samply
python setup.py [build|develop|install]

Usage

You can use the library from Python as in the examples below:

import samply

direction_samples = samply.directional.uniform(10000, 2)
ball_samples = samply.ball.uniform(10000, 2)
scvt_samples = samply.directional.cvt(10000, 2)
cvt_samples = samply.hypercube.cvt(10000, 2)

projection_samples = samply.subspace.grassmannian(10000, 3, 2)

The *_samples variables will be NxD matrices, where N is the number of samples requested and D is the dimensionality of the sampler or the requested dimensionality.
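The subspace samplers deal in projections from a dimension n to a lower dimension m. As a rough illustration of what a single such projection looks like, the sketch below draws m orthonormal vectors in n dimensions by applying Gram-Schmidt to Gaussian vectors. This is an assumption-laden stand-in, not the Grassmannian sampler shipped with samply, and `random_projection` is a hypothetical name.

```python
import math
import random


def random_projection(n, m, rng):
    """Return m orthonormal vectors of length n (rows of an m x n projection)."""
    basis = []
    while len(basis) < m:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the vectors already accepted.
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:  # retry in the unlikely event of a degenerate draw
            basis.append([x / norm for x in v])
    return basis


# Hypothetical helper, not part of samply's API.
rng = random.Random(0)
projection = random_projection(3, 2, rng)
```

Multiplying a 3-dimensional point by the two rows of `projection` gives its coordinates in the sampled 2-dimensional subspace.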

Testing

The test suite can be run through the setup script:

python setup.py test

Example

To test-drive a subset of the samplers, check out this little web app hosted on the Google Cloud Platform, which uses samply under the covers. Note that the CVT is still rather inefficient for larger sample sizes.

What’s Next

Forthcoming:
  • Improved documentation

