A bioinformatic classifier of Rab GTPases

Rabifier is an automated bioinformatic pipeline for the prediction and classification of Rab GTPases. For a more detailed description of the pipeline, see the references below. If you would rather browse Rab GTPases in all sequenced eukaryotic genomes, visit rabdb.org.

Rabifier is freely distributed under the GNU General Public License; see the LICENCE file for details.

Please cite our papers if you use Rabifier in your projects.

  • Rabifier2: an improved bioinformatic classifier of Rab GTPases. Surkont J, et al.

  • Thousands of Rab GTPases for the Cell Biologist. Diekmann Y, et al. PLoS Comput Biol 7(10): e1002217. doi:10.1371/journal.pcbi.1002217

Installation

To install Rabifier, run:

pip install rabifier

Python requirements, third party packages and other dependencies

Rabifier supports Python 2.7 and Python 3.4. It has been tested only on GNU/Linux, and we do not plan to support other platforms.

Rabifier depends on third-party Python libraries:

  • biopython (>=1.66)

  • numpy (>=1.10.1)

  • scipy (>=0.16.1)

Rabifier uses several bioinformatic tools, which are required for most of the classification stages. Ensure that the following programs (or links pointing to them) are available in the system path.

  • HMMER (3.1b1): phmmer, hmmbuild, hmmpress, hmmscan

  • BLAST+ (2.2.30): blastp

  • MEME4 (4.10.2): meme, mast

  • Superfamily (>=1.75): superfamily (NOTE: this is a folder containing several Superfamily database files and scripts, see below)
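
A quick way to confirm that the programs listed above are reachable is to ask the shell for each one. The following is only a sketch and is not part of Rabifier; the helper name report_missing is hypothetical, and the check says nothing about whether the installed versions match those listed.

```shell
# Print the name of every command in the argument list that is not on PATH.
# report_missing is a hypothetical helper written for this illustration.
report_missing() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names are taken from the list above. Superfamily is a directory,
# not a single executable, so it is not checked here.
missing=$(report_missing phmmer hmmbuild hmmpress hmmscan blastp meme mast)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All required tools were found."
fi
```

If any name is reported, install the corresponding package or add a link to the program in a directory that is already in the system path.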

If you have cloned this repository, you need to compile the HMMs of the Rab subfamilies using hmmpress, i.e. run:

hmmpress rabifier/data/rab_subfamily.hmm

Rabifier requires a seed database for Rab classification. A precomputed database is part of this repository. You can also create the database by running rabifier-mkdb on the raw, manually curated data sets, which are available in a separate repository at https://github.com/evocell/rabifier-data. The build process requires additional software.

To install the Superfamily database, follow the instructions below (based on the Superfamily website).

# Register at the Superfamily website to get your username and password

# Download files
mkdir superfamily
cd superfamily
wget --http-user USERNAME --http-password PASSWORD -r -np -nd -e robots=off \
    -R 'index.html*' 'http://supfam.org/SUPERFAMILY/downloads/license/supfam-local-1.75/'
wget http://scop.mrc-lmb.cam.ac.uk/scop/parse/dir.cla.scop.txt_1.75 -O dir.cla.scop.txt
wget http://scop.mrc-lmb.cam.ac.uk/scop/parse/dir.des.scop.txt_1.75 -O dir.des.scop.txt

# Uncompress files
gzip -d *.gz
mv hmmlib_1.75 hmmlib

# Make Perl scripts executable
chmod u+x *.pl

# Build the HMM library
hmmpress hmmlib

# Create a symbolic link pointing to the database directory e.g. ln -s superfamily $HOME/bin/
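
After completing the steps above, it can be useful to check that the database directory contains the files those steps produce. The sketch below makes some assumptions: the helper name missing_superfamily_files and the SUPERFAMILY_DIR variable are hypothetical, and the list covers only the renamed HMM library, the four index files that hmmpress writes next to it, and the two SCOP files downloaded above. It does not check the other files shipped with Superfamily.

```shell
# List the files that the steps above should have produced but that are
# absent from the given directory. Prints nothing when all are present.
# missing_superfamily_files is a hypothetical helper for this illustration.
missing_superfamily_files() {
    dir=$1
    for name in hmmlib hmmlib.h3f hmmlib.h3i hmmlib.h3m hmmlib.h3p \
                dir.cla.scop.txt dir.des.scop.txt; do
        [ -f "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# SUPERFAMILY_DIR is a hypothetical variable used only in this sketch;
# it falls back to a directory named "superfamily" in the current location.
missing=$(missing_superfamily_files "${SUPERFAMILY_DIR:-superfamily}")
if [ -n "$missing" ]; then
    echo "Missing from the Superfamily directory:" $missing
else
    echo "Superfamily directory looks complete."
fi
```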

Usage

To run Rab prediction on protein sequences, save sequences in the FASTA format and run:

rabifier sequences.fa
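
As an illustration of the expected input, the sketch below writes a small FASTA file to a scratch directory and counts its records. The two sequences and their identifiers are invented placeholders, not real Rab proteins, and the variable names are hypothetical.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
fasta="$workdir/sequences.fa"

# Two placeholder records; the identifiers and residues are invented for
# illustration and are not real Rab sequences.
cat > "$fasta" <<'EOF'
>example_protein_1
MSTAKLVVLGDSGVGKSSLLLRFTDDKFS
>example_protein_2
MAEQKYLIIGNAGTGKTCLLHQFIENRFD
EOF

# Each FASTA record starts with a '>' header line, so counting those
# lines gives the number of sequences in the file.
record_count=$(grep -c '^>' "$fasta")
echo "Wrote $record_count sequences to $fasta"

# The file can then be passed to Rabifier as described above:
# rabifier "$fasta"
```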

For more options controlling Rabifier's behaviour, type:

rabifier -h

Bug reports and contributing

Please use the issue tracker to report bugs and suggest improvements.

Download files

Source Distribution

rabifier-2.0.2.tar.gz (6.7 MB)

Built Distribution

rabifier-2.0.2-py2.py3-none-any.whl (6.7 MB)
