A Python package for kernel methods in Statistics/ML.
PyRKHSstats
A Python package implementing a variety of statistical/machine learning methods that rely on kernels (e.g. HSIC for independence testing).
Overview
- Independence testing with HSIC (Hilbert-Schmidt Independence Criterion), as introduced in *A Kernel Statistical Test of Independence*, A. Gretton, K. Fukumizu, C. Hui Teo, L. Song, B. Schölkopf, and A. Smola (NIPS 2007).
- Measurement of conditional independence with HSCIC (Hilbert-Schmidt Conditional Independence Criterion), as introduced in *A Measure-Theoretic Approach to Kernel Conditional Mean Embeddings*, J. Park and K. Muandet (NeurIPS 2020).
- Conditional independence testing with KCIT (Kernel-based Conditional Independence Test), as introduced in *Kernel-based Conditional Independence Test and Application in Causal Discovery*, K. Zhang, J. Peters, D. Janzing, and B. Schölkopf (UAI 2011).
- Two-sample testing (also known as homogeneity testing) with the MMD (Maximum Mean Discrepancy), as presented in *A Fast, Consistent Kernel Two-Sample Test*, A. Gretton, K. Fukumizu, Z. Harchaoui, and B. K. Sriperumbudur (NIPS 2009), and in *A Kernel Two-Sample Test*, A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola (JMLR, volume 13, 2012).
Resource | Description |
---|---|
HSIC | For independence testing |
HSCIC | For the measurement of conditional independence |
KCIT | For conditional independence testing |
MMD | For two-sample testing |
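
To make the quantities in the table concrete, the following is a minimal NumPy sketch of two of the underlying statistics: the biased empirical HSIC, trace(KHLH)/n², and the biased empirical squared MMD. It is an illustration only and does not use or reproduce the PyRKHSstats API; the function names (`gaussian_kernel`, `biased_hsic`, `biased_mmd2`), the Gaussian kernel, and the fixed bandwidth `sigma=1.0` are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of `a` and the rows of `b`.

    Hypothetical helper for this sketch; not part of PyRKHSstats.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a.reshape(a.shape[0], -1)  # treat 1-D input as n samples of dimension 1
    b = b.reshape(b.shape[0], -1)
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def biased_hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centring matrix."""
    K = gaussian_kernel(x, x, sigma)
    L = gaussian_kernel(y, y, sigma)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n ** 2


def biased_mmd2(x, y, sigma=1.0):
    """Biased empirical squared MMD between the samples `x` and `y`."""
    return (
        gaussian_kernel(x, x, sigma).mean()
        + gaussian_kernel(y, y, sigma).mean()
        - 2.0 * gaussian_kernel(x, y, sigma).mean()
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y_dependent = x + 0.1 * rng.normal(size=200)
    y_independent = rng.normal(size=200)
    print("HSIC, dependent pair:  ", biased_hsic(x, y_dependent))
    print("HSIC, independent pair:", biased_hsic(x, y_independent))
    print("MMD^2, shifted sample: ", biased_mmd2(x, y_independent + 2.0))
```

Larger values indicate stronger evidence of dependence (HSIC) or of a difference between the two distributions (MMD). Turning either statistic into a test requires a null distribution, which is what the implementation schemes listed further down provide.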
Implementations available
The following table details the implementation schemes for the different resources available in the package.
Resource | Implementation scheme | NumPy-based available | PyTorch-based available |
---|---|---|---|
HSIC | Resampling (permuting the x_i's but leaving the y_i's unchanged) | Yes | No |
HSIC | Gamma approximation | Yes | No |
HSCIC | N/A | Yes | Yes |
KCIT | Gamma approximation | Yes | No |
KCIT | Monte Carlo simulation (weighted sum of χ² random variables) | Yes | No |
MMD | Gram matrix spectrum | Yes | No |
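
As an illustration of the resampling scheme in the first row of the table, here is a minimal NumPy sketch of a permutation test for HSIC in which the x_i's are permuted and the y_i's are left unchanged. It is a sketch under stated assumptions rather than the package's implementation: the names `gaussian_gram` and `hsic_permutation_test`, the Gaussian kernel with bandwidth 1, and the default of 500 permutations are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_gram(a, sigma=1.0):
    """Gaussian (RBF) Gram matrix of the rows of `a` (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    a = a.reshape(a.shape[0], -1)  # treat 1-D input as n samples of dimension 1
    sq_norms = np.sum(a ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * a @ a.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def hsic_permutation_test(x, y, n_permutations=500, seed=0):
    """Return (observed biased HSIC, permutation p-value).

    The null distribution is simulated by permuting the x sample while
    keeping the y sample fixed, which breaks any dependence between them.
    """
    rng = np.random.default_rng(seed)
    K = gaussian_gram(x)
    L = gaussian_gram(y)
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    HLH = H @ L @ H  # computed once; only K changes under permutation

    def statistic(K_perm):
        # trace(K H L H) equals the elementwise sum of K * (H L H),
        # because both matrices are symmetric.
        return np.sum(K_perm * HLH) / n ** 2

    observed = statistic(K)
    null_stats = np.empty(n_permutations)
    for b in range(n_permutations):
        perm = rng.permutation(n)
        # Permuting the x_i's amounts to permuting rows and columns of K.
        null_stats[b] = statistic(K[np.ix_(perm, perm)])

    # The +1 terms count the observed statistic as one of the permutations.
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_permutations)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=100)
    y = x + 0.1 * rng.normal(size=100)
    stat, p = hsic_permutation_test(x, y)
    print(f"HSIC = {stat:.4f}, p-value = {p:.4f}")
```

The Gamma approximation listed in the table avoids this loop by fitting a Gamma distribution to the null distribution of the statistic, trading some accuracy for speed.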
In development
- Joint independence testing with dHSIC.
- Goodness-of-fit testing.
- Methods for time series models.
- Bayesian statistical kernel methods.
- Regression by independence maximisation.