mofax

Load and interpret MOFA models

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

MOFA+ in Python

Work with trained factor models in Python.

This library provides convenience functions to load and visualize factor models trained with MOFA+ in Python. For more information on the multi-omics factor analysis v2 framework please see this GitHub repository.

Getting started

Installation

pip install git+https://github.com/gtca/mofax

Training a factor model

Please see the MOFA+ GitHub repository for more information on training the factor models with MOFA+.

Loading the model

Import the module and create a connection to the HDF5 file with the trained model:

import mofax as mfx

model = mfx.mofa_model("trained_mofaplus_model.hdf5")

The connection is created in the readonly mode by default and can be terminated by calling the close() method on the model object at the end of the working session:

model.close()

Model object

Model object is an instance of a mofa_model class that wraps around the HDF5 connection and provides a simple way to address the parts of the trained model such as expectations for factors and for their loadings (weights) eliminating the need to traverse the HDF5 file manually.

Model methods

Simple data structures (e.g. lists or dictionaries) are typically returned upon calling the properties of the mofa model, e.g. model.shape:

model.shape
# returns (10138, 1124)
#         cells^  ^features

More complex structures are typically returned when using methods such as model.get_cells() to get cell -> group assignment as a pandas.DataFrame while also providing the way to only get this information for specific groups or views of the model.

model.get_cells().head()
# returns a pandas.DataFrame object:
# 	group	cell
# 0	T_CD4	AATCCTGCACATCGCC-1
# 1	T_CD4	AAGACGTGTGATGCCC-1
# 2	T_CD4	AAGGAGCGTCGGCATG-1
# 3	T_CD4	AATCCGTCACGAGACG-1
# 4	T_CD4	ACACCGAGGAGGTTGA-1

Use model.metadata to get the metadata table — it's a shorhand for samples_metadata, there's also features_metadata available.

model.metadata.head()
# returns a pandas.DataFrame object:
#                     group  n_genes
# sample
# AATCCTGCACATCGCC-1  T_CD4     1087
# AAGACGTGTGATGCCC-1  T_CD4     1836
# AAGGAGCGTCGGCATG-1  T_CD4     2216
# AATCCGTCACGAGACG-1  T_CD4     1615
# ACACCGAGGAGGTTGA-1  T_CD4     1800

To get expectations of W (weights) and Z (factors) matrices, use get_weights() and get_factors(), respectively. There's also a df=True option to get expectations as a Pandas data frame rather than a NumPy 2D array.

model.get_factors(factors=range(3), df=True).head()
#                      Factor1   Factor2   Factor3
# AATCCTGCACATCGCC-1  0.012582 -0.093512 -0.011228
# AAGACGTGTGATGCCC-1  0.001091 -0.027217 -0.011331
# AAGGAGCGTCGGCATG-1 -0.015097  0.093493 -0.010593
# AATCCGTCACGAGACG-1 -0.046222  0.225920  0.010083
# ACACCGAGGAGGTTGA-1  0.011766 -0.055964 -0.011298

Utility functions

A few utility functions such as calculate_factor_r2 to calculate the variance explained by a factor are provided as well.

Plotting functions

A few basic plots can be constructed with plotting functions provided such as plot_factors and plot_weights. They rely on and limited by plotting functionality of Seaborn.

Please check the notebooks for detailed examples. Some of the implemented plots are demonstrated below.

mofax plots

Contributions

In case you work with MOFA+ models in Python, you might find mofax useful. Please consider contributing to this module by suggesting the missing functionality to be implemented in the form of issues and in the form of pull requests.

Project details

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Release history Release notifications | RSS feed

0.3.6

Aug 3, 2021

This version

0.3.2

Sep 22, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mofax-0.3.2.tar.gz (22.4 kB view hashes)

Uploaded Sep 22, 2020 Source

Built Distribution

mofax-0.3.2-py3-none-any.whl (21.2 kB view hashes)

Uploaded Sep 22, 2020 Python 3

Hashes for mofax-0.3.2.tar.gz

Hashes for mofax-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`d9e1efca133ae788199e1953e5b9ab62d9aa6d4e381f998363e1c78f63beec71`
MD5	`6cbb62b21a91caede00f553fda9d5899`
BLAKE2b-256	`da065fc5b5385af660171d7d75ec6a9c9d5c1c5747e8822f9ec27155138122a2`

Hashes for mofax-0.3.2-py3-none-any.whl

Hashes for mofax-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d003ae5c11818eba5b89e6470dcb88bc9708d71d41211ad3fd4df0f24c8b9c54`
MD5	`68c6195ba0f3bdd54cfc4f38621411fc`
BLAKE2b-256	`eeef4c96a651c6fd968a4cbfbb855faf06f3d9bf3095066b4540673f8d6ec0a5`