Maximum mean discrepancy comparisons of single cell profiles

Project description

There is a classical problem in statistics known as the two-sample problem. In this setting, you are given a finite set of observations from each of two distributions and asked to determine whether the distributions differ significantly. A special univariate case of this problem is familiar to many biologists as a “difference in means” comparison, as performed using Student’s t-test.

This problem becomes more complex in high dimensions, as differences between distributions may manifest not only in the mean location, but also in the covariance structure and modality of the data. One approach to comparing distributions in this setting uses kernel similarity functions to compute the maximum mean discrepancy (MMD; Gretton et al. 2012, JMLR): the largest difference in the means of the two distributions under a flexible transformation.
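
For reference, the maximum mean discrepancy can be written out as follows. This is the standard formulation from the kernel two-sample literature, not notation taken from this package; the symbols $p$, $q$, $\mathcal{F}$, and $k$ are introduced here for illustration. When $\mathcal{F}$ is the unit ball of a reproducing kernel Hilbert space with kernel $k$, the squared discrepancy reduces to expectations of kernel evaluations, which is what makes it practical to estimate from samples.

```latex
% Definition: largest difference in means over a function class F
\mathrm{MMD}[\mathcal{F}, p, q]
  = \sup_{f \in \mathcal{F}}
    \left( \mathbb{E}_{x \sim p}[f(x)] - \mathbb{E}_{y \sim q}[f(y)] \right)

% Kernel form, with x, x' drawn independently from p and y, y' from q
\mathrm{MMD}^2[\mathcal{F}, p, q]
  = \mathbb{E}_{x, x'}[k(x, x')]
  - 2\,\mathbb{E}_{x, y}[k(x, y)]
  + \mathbb{E}_{y, y'}[k(y, y')]
```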

Here, we adapt the MMD method to compare cell populations in single cell measurement data. Framed as a two-sample problem, each cell population of interest is treated as a distribution and each cell as a single observation drawn from that distribution. We use the MMD to compute (1) a metric of the magnitude of difference between two cell populations, and (2) a p-value for the significance of this difference.
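
To make the two outputs concrete, here is a minimal NumPy sketch of an MMD comparison between two cell populations. It is not the scmmd API: the function names, the RBF kernel, the median-heuristic bandwidth, and the permutation test used for the p-value are all assumptions chosen for illustration, and the package may compute these quantities differently. Each input is assumed to be a cells-by-features array.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :]
    sq -= 2.0 * (a @ b.T)
    return np.maximum(sq, 0.0)  # clip tiny negatives from round-off


def mmd2_from_gram(K, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a Gram matrix."""
    k_xx = K[np.ix_(idx_x, idx_x)].mean()
    k_yy = K[np.ix_(idx_y, idx_y)].mean()
    k_xy = K[np.ix_(idx_x, idx_y)].mean()
    return k_xx + k_yy - 2.0 * k_xy


def mmd_permutation_test(x, y, n_perm=500, gamma=None, seed=0):
    """Return (squared MMD, permutation p-value) for samples x and y.

    Hypothetical helper: RBF kernel exp(-gamma * ||a - b||^2), with gamma
    set by the median heuristic when not supplied (an assumption).
    """
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    n_x = len(x)

    sq = pairwise_sq_dists(z, z)
    if gamma is None:
        positive = sq[sq > 0]
        gamma = 1.0 / np.median(positive) if positive.size else 1.0
    K = np.exp(-gamma * sq)

    idx = np.arange(len(z))
    observed = mmd2_from_gram(K, idx[:n_x], idx[n_x:])

    # Null distribution: shuffle population labels, reuse the Gram matrix.
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(len(z))
        null[i] = mmd2_from_gram(K, perm[:n_x], perm[n_x:])

    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    pop_a = rng.normal(0.0, 1.0, size=(100, 10))  # synthetic "cells"
    pop_b = rng.normal(0.5, 1.0, size=(100, 10))  # shifted population
    stat, p = mmd_permutation_test(pop_a, pop_b)
    print(f"MMD^2 = {stat:.4f}, p = {p:.4f}")
```

The Gram matrix is computed once on the pooled cells and only re-indexed for each permutation, so the cost of the test is dominated by a single pairwise distance computation. For large populations, subsampling cells before the comparison would keep memory use manageable.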

Download files

Source distribution: scmmd-0.1.0.tar.gz (7.7 kB)

Built distribution: scmmd-0.1.0-py3-none-any.whl (10.3 kB, Python 3)
