Skip to main content

A useful module for alligning cell lines to tumors

Project description

Celligner

Celligner is a computational project to align multiple cancer datasets across sequencing modalities, tissue conditions (media, perturbations..) and format (CL/tumor/organoids/spheroids)

See our latest paper on aligning CCLE cell lines with TCGA tumors: 2020 paper

Just want a quick look?

Go here for the production version: https://depmap.org/portal/celligner/

Go here for some usage examples: https://raw.githack.com/broadinstitute/celligner/master/docs/example.html

Install

TO see the old R package installation instruction, see the R/folder.

Before running pip, make sure that you have R installed.

pip install celligner

Even with R, some platform might not have all the required packages already installed (thanks R for being so easy to work with!)

In that case, please refer to our docker image:s

A dockerized version is available at jkobject:pycelligner

For developers

see CONTRIBUTING.md

Use Celligner

See docs/Celligner_demo.[html|pdf] for an example of usage. (view here)

celligner works like most scikit learn tool.

A user fits a dataset (e.g. CCLE tpm expression),

from celligner import Celligner

my_alligner = Celligner(make_plots=True)
my_alligner.fit(CCLE_expression, CCLE_annotation)

and then transforms another one based on this fitted dataset

my_alligner.method = "mnn_marioni"
my_alligner.mnn_kwargs = {'k1': 5, 'k2': 50, 'cosine_norm': True, "fk":5}
transformed_TCGA = my_alligner.transform(TCGA_expression, TCGA_annotation)

my_alligner.plot(color_column="tissue_type", colortable=TISSUE_COLOR, umap_kwargs={'n_neighbors': 15,'min_dist': 0.2, 'metric': 'cosine'})

Users can access other methods such as save(), load(), addToFit(), etc, as well as many data in values: pca_transform, transform_clusters, differential_genes_names, mnn_pairs, etc.

Please have a look at docs/Celligner_demo.[html|pdf] for an example of usage. (view here)

Computational complexity

Depending on the dataset, Celligner can be quite memory hungry. for TCGA, expect at least 50-60Gb of memory being used. You might need a powerfull commputer, lots of swap and to increase R's default maximum allowed memory.

You can also use the low_memory=True option to reduce the memory used by celligner in the memory intensive PCA & cPCA methods.

Add your own data to a pretrained model

If you want to see your dataset in celligner, you can decide to use our own prefitted version.

! curl https://storage.googleapis.com/celligner/model.pkl -output temp/model.pkl

from celligner import Celligner

my_alligner = Celligner()
my_alligner.load('temp/model.pkl')

We fit the model with CCLE and then transform TCGA. But you can decide differently.

For example: If you want to see how some of your newly sequenced tumors mapped to the CCLE (and TCGA) dataset, just load the model as displayed above and then run:

my_alligner.addTotransform(your_tpm, your_annotations)
my_alligner.plot()

This way you will not rerun the entire model.

See docs/Celligner_demo.[html|pdf] for other examples of usage. (view here)

Multidataset alignment

Dee docs/Celligner_demo.[html|pdf] for an example of usage. (view here)

One can use addToFit(), addToPredict() depending on whether they want to align their dataset to another or align another dataset to theirs.

If you have a very small dataset and want to align to CCLE or CGA, use the parameter doAdd=True in order to not rerun the entire pipeline and use cached information.

R Celligner

For the original R version of celligner, please check the R/README.md file here: https://github.com/broadinstitute.org/celligner/tree/master/R/README.md


Please use github issues for any problem related to the tool.

Maintainers:

Jérémie Kalfon @jkobject

Javad Noorbak @jnoorbak

James McFarland

Created by:

Allie Warren @awarren

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

celligner-1.0.0.tar.gz (21.9 kB view hashes)

Uploaded Source

Built Distribution

celligner-1.0.0-py3-none-any.whl (19.2 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page