Tool for motif conservation analysis

MoCA: Tool for MOtif Conservation Analysis

LICENSE

ISC

Installation

Requirements

  • pybedtools

  • biopython

  • pandas

  • scipy

  • statsmodels

  • pybigwig

  • seaborn

  • MEME==4.10.2

NOTE: MoCA also relies on the version of fasta-shuffle-letters introduced in MEME 4.11.0. If you are using MEME 4.10.2, make sure your fasta-shuffle-letters is the updated one.

For a sample installation script, see travis/install_meme.sh.

Using Conda

moca works best when installed in a conda environment.

$ conda config --add channels bioconda
$ conda install moca

Using pip

$ pip install moca

For development

$ git clone https://github.com/saketkc/moca.git
$ cd moca
$ conda env create -f environment.yml python=2.7
$ source activate mocadev
$ python setup.py install

Workflow

MoCA uses PhyloP, PhastCons, or GERP scores to assess the quality of a motif. The hypothesis is that a 'true motif' evolves more slowly than its surrounding (flanking) sequences.
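
The hypothesis above can be illustrated with a small, self-contained sketch. It is not MoCA's implementation: the score profile is made up, the helper names are hypothetical, and Welch's t-statistic is used here only as one plausible way to compare motif positions against flanking positions.

```python
from math import sqrt
from statistics import mean, variance


def split_motif_and_flanks(scores, flank):
    """Split a per-base score profile into motif and flanking positions.

    `flank` is the number of flanking bases on each side of the motif.
    """
    end = len(scores) - flank
    motif = scores[flank:end]
    flanks = scores[:flank] + scores[end:]
    return motif, flanks


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Hypothetical per-base conservation scores: 3 flank + 4 motif + 3 flank bases.
profile = [0.1, 0.2, 0.0, 1.8, 2.1, 1.9, 2.2, 0.1, -0.1, 0.3]
motif, flanks = split_motif_and_flanks(profile, flank=3)

# A positive statistic means the motif positions score higher (are more
# conserved) than the flanks in this toy profile.
t_stat = welch_t(motif, flanks)
print("motif mean:", mean(motif), "flank mean:", mean(flanks))
print("Welch t:", round(t_stat, 2))
```

A large positive statistic is what the hypothesis predicts for a 'true motif'; a value near zero suggests the motif is no more conserved than its surroundings.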

https://raw.githubusercontent.com/saketkc/moca_web/master/docs/abstract/workflow.png

Usage

$ moca
Usage: moca [OPTIONS] COMMAND [ARGS]...

  moca: Motif Conservation Analysis

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  find_motifs  Run meme to locate motifs and create...
  plot         Create stacked conservation plots

Motif analysis using MEME

Given a BED file of ChIP-Seq peaks, MoCA can perform motif analysis for you.

Genome builds and MEME binary locations are specified through a configuration file. A sample configuration file is available at tests/data/application.cfg and should be self-explanatory.
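
To check what a configuration file defines before running moca, a sketch like the following can list its sections and keys. It assumes the file is INI-style (suggested by the .cfg extension, not confirmed here), and the section and key names in the inline example are placeholders, not the actual layout of tests/data/application.cfg.

```python
import configparser


def summarize_config(text):
    """Return {section: {key: value}} for INI-style configuration text."""
    # Interpolation is disabled so that '%' characters in paths are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser.items(section))
            for section in parser.sections()}


# Hypothetical contents: section and key names are placeholders only.
example = """
[genome:hg19]
fasta = /path/to/hg19.fa

[binaries]
meme = /path/to/meme
"""

summary = summarize_config(example)
for section, options in summary.items():
    print(section, sorted(options))
```

For a real file, pass the result of `open("tests/data/application.cfg").read()` to `summarize_config` and confirm that the genome build key you intend to give to `-g` is present.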

moca find_motifs

$ moca find_motifs -h
Usage: moca find_motifs [OPTIONS]

  Run meme to locate motifs and create conservation stacked plots

Options:
  -i, --bedfile TEXT            Bed file input  [required]
  -o, --oc TEXT                 Output Directory  [required]
  -c, --configuration TEXT      Configuration file  [required]
  --slop-length INTEGER         Flanking sequence length  [required]
  --flank-motif INTEGER         Length of sequence flanking motif  [required]
  --n-motif INTEGER             Number of motifs
  -t, --cores INTEGER           Number of parallel MEME jobs  [required]
  -g, -gb, --genome-build TEXT  Key denoting genome build to use in
                                configuration file  [required]
  --show-progress               Print progress
  -h, --help                    Show this message and exit.

moca plot

$ moca plot -h
Usage: moca plot [OPTIONS]

  Create stacked conservation plots

Options:
  --meme-dir, --meme_dir TEXT     MEME output directory  [required]
  --centrimo-dir, --centrimo_dir TEXT
                                  Centrimo output directory  [required]
  --fimo-dir-sample, --fimo_dir_sample TEXT
                                  Sample fimo.txt  [required]
  --fimo-dir-control, --fimo_dir_control TEXT
                                  Control fimo.txt  [required]
  --name TEXT                     Plot title
  --flank-motif INTEGER           Length of sequence flanking motif
                                  [required]
  --motif INTEGER                 Motif number
  -o, --oc TEXT                   Output Directory  [required]
  -c, --configuration TEXT        Configuration file  [required]
  --show-progress                 Print progress
  -g, -gb, --genome-build TEXT    Key denoting genome build to use in
                                  configuration file  [required]
  -h, --help                      Show this message and exit.

Example

Most users will only need the command-line interface:

$ moca find_motifs -i encode_test_data/ENCFF002DAR.bed\
    -c tests/data/application.cfg -g hg19 --show-progress
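
The help output for `moca find_motifs` marks several more options as required (`-o`, `--slop-length`, `--flank-motif`, `-t`). The sketch below assembles a full argument list in Python using only flags listed in that help output. The function name is hypothetical, and the output directory and numeric values are illustrative placeholders, not recommended settings.

```python
import shlex


def build_find_motifs_cmd(bedfile, outdir, config, genome_build,
                          slop_length, flank_motif, cores,
                          show_progress=False):
    """Build the argument list for `moca find_motifs` (nothing is executed)."""
    cmd = [
        "moca", "find_motifs",
        "-i", bedfile,
        "-o", outdir,
        "-c", config,
        "-g", genome_build,
        "--slop-length", str(slop_length),
        "--flank-motif", str(flank_motif),
        "-t", str(cores),
    ]
    if show_progress:
        cmd.append("--show-progress")
    return cmd


# Placeholder values: adjust them to your data and machine.
cmd = build_find_motifs_cmd(
    bedfile="encode_test_data/ENCFF002DAR.bed",
    outdir="moca_output",
    config="tests/data/application.cfg",
    genome_build="hg19",
    slop_length=100,
    flank_motif=5,
    cores=4,
    show_progress=True,
)
print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once moca is installed; building it as a list avoids shell quoting problems with paths that contain spaces.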

To create plots when you have already run MEME and Centrimo:

$ moca plot -c tests/data/application.cfg -g hg19\
    --meme-dir moca_output/meme_out\
    --centrimo-dir moca_output/centrimo_out\
    --fimo-dir-sample moca_output/meme_out/fimo_out_1\
    --fimo-dir-control moca_output/meme_out/fimo_random_1\
    --name ENCODEID
http://www.saket-choudhary.me/moca/_static/img/ENCFF002CEL.png
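
When several motifs were found, the plot command can be generated once per motif. The sketch below assumes the output layout follows the pattern seen in the example above (`meme_out`, `centrimo_out`, `fimo_out_N`, `fimo_random_N`); that generalisation to other motif numbers is an assumption, as are the function name and the placeholder values. Only flags from the `moca plot` help output are used.

```python
import posixpath


def build_plot_cmd(moca_output, motif, config, genome_build,
                   flank_motif, outdir, name=None):
    """Build the argument list for `moca plot` for one motif number."""
    meme_dir = posixpath.join(moca_output, "meme_out")
    cmd = [
        "moca", "plot",
        "-c", config,
        "-g", genome_build,
        "--meme-dir", meme_dir,
        "--centrimo-dir", posixpath.join(moca_output, "centrimo_out"),
        # Assumed naming pattern: fimo_out_<N> and fimo_random_<N>.
        "--fimo-dir-sample", posixpath.join(meme_dir, "fimo_out_%d" % motif),
        "--fimo-dir-control", posixpath.join(meme_dir, "fimo_random_%d" % motif),
        "--flank-motif", str(flank_motif),
        "--motif", str(motif),
        "-o", outdir,
    ]
    if name is not None:
        cmd += ["--name", name]
    return cmd


# Placeholder values: three motifs, flank length 5, plots written to "plots".
commands = [
    build_plot_cmd("moca_output", n, "tests/data/application.cfg", "hg19",
                   flank_motif=5, outdir="plots", name="ENCODEID")
    for n in (1, 2, 3)
]
for command in commands:
    print(" ".join(command))
```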

A structured API is also available, although examples and documentation may be missing in places.

API Documentation

http://saketkc.github.io/moca/

Tests

Most of moca is covered by tests. See code-coverage.

Run tests locally

$ ./runtests.sh

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.4.1 (2017-02-16)

  • Added $delta$ options for sample vs control t-test

  • Removed unused requirements: ipaddress

  • Fixed meme runner to plot

0.3.3 (2016-10-03)

  • Removed pycairo dependency

0.2.9 (2016-05-31)

  • Do not fail silently on MEME failing

  • Support --cores to run parallel threads

0.2.8 (2016-05-30)

  • Fixed MEME pipeline missing from mocacli

0.2.7 (2016-05-30)

  • Fixed bug where missing wig keys were not handled in mocacli

0.2.4 (2016-05-29)

  • Cleaned up unused scripts under scripts directory

  • Add configuration file example

0.2.3 (2016-05-29)

  • Include package_dir in setup.py

  • Include requirements.txt in MANIFEST

0.2.0 (2016-05-29)

  • First release on PyPI.
