mhcflurry

MHC Binding Predictor

Development Status
- 3 - Alpha
Environment
- Console
Intended Audience
- Science/Research
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

mhcflurry

Open source neural network models for peptide-MHC binding affinity prediction

The adaptive immune system depends on the presentation of protein fragments by MHC molecules. Machine learning models of this interaction are used in studies of infectious diseases, autoimmune diseases, vaccine development, and cancer immunotherapy.

MHCflurry currently supports peptide / MHC class I affinity prediction using one model per MHC allele. The predictors may be trained on data that has been augmented with data imputed based on other alleles (see Rubinsteyn 2016). We anticipate adding additional models, including pan-allele and class II predictors.

You can fit MHCflurry models to your own data or download trained models that we provide. Our models are trained on data from IEDB and Kim 2014. Details on the training data preparation, and the steps we use to train predictors on this data (including hyperparameter selection using cross validation), are available in the project repository.

The MHCflurry predictors are implemented in Python using Keras.

Setup

Install the package:

pip install mhcflurry

Then download our datasets and trained models:

mhcflurry-downloads fetch

From a checkout you can run the unit tests with:

nosetests .

Making predictions

from mhcflurry import predict
predict(alleles=['A0201'], peptides=['SIINFEKL'])

  Allele   Peptide  Prediction
0  A0201  SIINFEKL  10672.347656
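
The result above can be post-processed like any table. The sketch below reads the Prediction column as a predicted binding affinity (IC50, in nM), where lower values mean tighter binding. That reading, the 500 nM cutoff (a common convention for calling binders), and the helper name `label_binders` are assumptions for illustration and not part of MHCflurry. It works on plain rows, such as those a pandas DataFrame yields via `to_dict("records")`:

```python
# Hypothetical post-processing helper; not part of the mhcflurry API.
# Assumes "Prediction" is an IC50-like affinity in nM (lower = stronger binding).


def label_binders(rows, threshold_nm=500.0):
    """Return copies of the rows with an added boolean "Binder" field."""
    labeled = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are left untouched
        new_row["Binder"] = float(row["Prediction"]) <= threshold_nm
        labeled.append(new_row)
    return labeled


# Row copied from the example output above.
example_rows = [
    {"Allele": "A0201", "Peptide": "SIINFEKL", "Prediction": 10672.347656},
]
labeled_rows = label_binders(example_rows)
for row in labeled_rows:
    print(row["Allele"], row["Peptide"], row["Binder"])  # A0201 SIINFEKL False
```

Under this assumed cutoff the example peptide is not called a binder, since 10672 nM is well above 500 nM.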

Training your own models

A unit test in the repository gives a simple example of how to train a predictor in Python. There is also a script called mhcflurry-class1-allele-specific-cv-and-train that performs cross validation and model selection given a CSV file of training data. Run mhcflurry-class1-allele-specific-cv-and-train --help for details.

Details on the downloaded class I allele-specific models

Besides the actual model weights, the data downloaded with mhcflurry-downloads fetch also includes a CSV file giving the hyperparameters used for each predictor. Another CSV gives the cross validation results used to select these hyperparameters.

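To see the hyperparameters for the production models, run the following (open is the macOS command; on other systems, pass the same path to a CSV viewer of your choice):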

open "$(mhcflurry-downloads path models_class1_allele_specific_single)/production.csv"

To see the cross validation results:

open "$(mhcflurry-downloads path models_class1_allele_specific_single)/cv.csv"
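
On systems without the `open` command, the same files can be inspected from Python. This is a minimal sketch: it assumes only that `mhcflurry-downloads path <name>` prints the directory (as the shell commands above rely on) and that the files are ordinary CSVs with a header row. The helper names are hypothetical, and no particular column names are assumed.

```python
import csv
import os
import subprocess


def downloads_path(name):
    """Ask the mhcflurry-downloads command line tool where a download lives."""
    output = subprocess.check_output(["mhcflurry-downloads", "path", name])
    return output.decode().strip()


def read_csv_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def show_production_hyperparameters(models_dir, limit=5):
    """Print the first few rows of production.csv found in models_dir."""
    rows = read_csv_rows(os.path.join(models_dir, "production.csv"))
    for row in rows[:limit]:
        print(dict(row))
    return rows
```

For example, `show_production_hyperparameters(downloads_path("models_class1_allele_specific_single"))` prints the first few production rows. Passing the path of `cv.csv` in the same directory to `read_csv_rows` loads the cross validation results.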

Environment variables

The path where MHCflurry looks for model weights and data can be set with the MHCFLURRY_DOWNLOADS_DIR environment variable. This directory should contain subdirectories like “models_class1_allele_specific_single”. Setting this variable overrides the other environment variables described below.

If you only want to change the version of the released data used, you can set MHCFLURRY_DOWNLOADS_CURRENT_RELEASE. If you want to change the base directory used for all releases, set MHCFLURRY_DATA_DIR.

By default, MHCFLURRY_DATA_DIR is a platform-specific application storage directory, MHCFLURRY_DOWNLOADS_CURRENT_RELEASE is the latest release, and MHCFLURRY_DOWNLOADS_DIR is set to $MHCFLURRY_DATA_DIR/$MHCFLURRY_DOWNLOADS_CURRENT_RELEASE.
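
The precedence among the three variables can be sketched as follows. This is an illustration and not MHCflurry's own code: the function name `resolve_downloads_dir` is hypothetical, the two default values are placeholders supplied by the caller, and the sketch assumes the base directory (MHCFLURRY_DATA_DIR) is the one that defaults to a platform-specific location.

```python
import os


def resolve_downloads_dir(environ, default_data_dir, latest_release):
    """Resolve the downloads directory from a mapping of environment variables.

    default_data_dir and latest_release are placeholders supplied by the
    caller; the real defaults are chosen by MHCflurry and not reproduced here.
    """
    # An explicit MHCFLURRY_DOWNLOADS_DIR overrides the other two variables.
    explicit = environ.get("MHCFLURRY_DOWNLOADS_DIR")
    if explicit:
        return explicit
    # Otherwise combine the base directory with the release name.
    data_dir = environ.get("MHCFLURRY_DATA_DIR", default_data_dir)
    release = environ.get("MHCFLURRY_DOWNLOADS_CURRENT_RELEASE", latest_release)
    return os.path.join(data_dir, release)


# Hypothetical values, for illustration only.
print(resolve_downloads_dir(
    {"MHCFLURRY_DATA_DIR": "/data/mhcflurry"},
    default_data_dir="/tmp/mhcflurry-default",
    latest_release="example-release",
))
```

Passing `os.environ` as the first argument applies the same logic to the current process environment.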

Release history
- 2.1.1: Mar 15, 2024
- 2.1.0: Oct 18, 2023
- 2.0.6: Jun 8, 2022
- 2.0.5: Nov 30, 2021
- 2.0.4: Sep 24, 2021
- 2.0.3: Sep 24, 2021
- 2.0.2: Jun 5, 2021
- 2.0.1: Jul 20, 2020
- 2.0.0: Jul 13, 2020
- 1.6.1: May 1, 2020
- 1.6.0: Mar 23, 2020
- 1.4.3: Nov 11, 2019
- 1.4.2: Oct 30, 2019
- 1.4.1: Oct 29, 2019
- 1.4.0: Oct 5, 2019
- 1.3.1: Sep 10, 2019
- 1.3.0: Sep 10, 2019
- 1.2.4: Apr 10, 2019
- 1.2.3: Feb 15, 2019
- 1.2.2: May 21, 2018
- 1.2.1: Mar 19, 2018
- 1.2.0: Feb 26, 2018
- 1.1.0: Feb 6, 2018
- 1.0.0: Dec 22, 2017
- 0.9.2: Oct 13, 2017
- 0.9.1: Jul 31, 2017
- 0.9.0: May 25, 2017
- 0.2.0: Mar 24, 2017
- 0.0.8: Sep 17, 2016 (this version)

Download files

Source Distribution

mhcflurry-0.0.8.tar.gz (48.4 kB)

Uploaded Sep 17, 2016 Source

Hashes for mhcflurry-0.0.8.tar.gz

Algorithm	Hash digest
SHA256	`200afb6038cab6b151cdf19a183186b134df614f6607ca48f70f25ab24d65d03`
MD5	`63c24e07765b87ad140ec295fb449e3e`
BLAKE2b-256	`e5cb3fa7a07b33cf3df2797a7579428262307c08b553fdedfa14daa21ba4c789`