
A pipeline for binning metagenomic datasets from metaHiC data.


metaTOR

License: GPLv3. Code style: black.

Metagenomic Tridimensional Organisation-based Reassembly - A set of scripts that streamlines the processing and binning of metagenomic metaHiC datasets.

Installation

Requirements:

  • Python 3.6 or later is required.
  • The following libraries are required but will be installed automatically by pip: numpy, scipy, sklearn, pandas, docopt, networkx, biopython, pyfastx and pysam.
  • The following software should be installed separately if you use the pip installation:

Using pip:

   pip3 install metator

or, to use the latest version:

   pip3 install -e git+https://github.com/koszullab/metator.git@master#egg=metator
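Before installing, it can help to confirm that the interpreter meets the Python 3.6 requirement listed above. The following is a minimal sketch; the function names version_ok and check_python are hypothetical helpers written for this illustration and are not part of metaTOR.

```shell
# Hypothetical helpers (not part of metaTOR).

# Return 0 if a "major.minor[.patch]" version string is at least 3.6.
version_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text up to the next dot, if any
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]; }
}

# Report whether the python3 found in PATH is recent enough.
check_python() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found in PATH" >&2
        return 1
    fi
    # "python3 --version" prints a line such as "Python 3.10.4".
    version=$(python3 --version 2>&1 | awk '{print $2}')
    if version_ok "$version"; then
        echo "Python $version is recent enough"
    else
        echo "Python $version is too old; 3.6 or later is required" >&2
        return 1
    fi
}
```

Running check_python before pip3 install metator gives an early, readable error instead of a failure partway through the installation.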

To use Louvain or Leiden, set the environment variable LOUVAIN_PATH or LEIDEN_PATH, depending on which algorithm you want to use, to the absolute path of the corresponding executable.

For the Louvain algorithm, run the following in the directory that contains the archive file (available in the external directory of this repository):

YOUR_DIRECTORY=$(pwd)
tar -xvzf louvain-generic.tar.gz
cd gen-louvain
make
export LOUVAIN_PATH=$YOUR_DIRECTORY/gen-louvain/

For the Leiden algorithm, clone the networkanalysis repository from GitHub and build the Java project. Then export the path to the resulting jar file:

export LEIDEN_PATH=/networkanalysis_repository_path/build/libs/networkanalysis-1.2.0.jar
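Since a missing or relative path is an easy mistake to make here, a quick sanity check can be run before starting a long job. This is only a sketch: check_clustering_path is a hypothetical helper, not something metaTOR provides, and it only verifies that the variable is set, absolute, and points to something that exists.

```shell
# Hypothetical helper (not part of metaTOR): check the variable that
# corresponds to the chosen algorithm ("louvain" or "leiden").
check_clustering_path() {
    algorithm=$1
    case "$algorithm" in
        louvain) name=LOUVAIN_PATH; path=${LOUVAIN_PATH:-} ;;
        leiden)  name=LEIDEN_PATH;  path=${LEIDEN_PATH:-} ;;
        *) echo "unknown algorithm: $algorithm" >&2; return 2 ;;
    esac
    if [ -z "$path" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    # The variable is expected to hold an absolute path.
    case "$path" in
        /*) ;;
        *) echo "$name must be an absolute path: $path" >&2; return 1 ;;
    esac
    if [ ! -e "$path" ]; then
        echo "$name points to a missing path: $path" >&2
        return 1
    fi
    echo "$name=$path"
}
```

For example, check_clustering_path louvain prints the value of LOUVAIN_PATH when it looks usable and returns a non-zero status otherwise.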

Using docker container:

A Dockerfile is also available. You can fetch the image by running:

    docker pull koszullab/metator

Usage

    metator {network|partition|validation|pipeline} [parameters]

A metaTOR command takes the form metator action --param1 arg1 --param2 arg2 #etc.

There are three actions/steps in the metaTOR pipeline, which must be run in the following order:

  • network : Generate the metaHiC contig network from FASTQ reads or BAM files and normalize it.

  • partition : Run the Louvain or Leiden community detection algorithm many times to bin contigs together according to the metaHiC signal between them.

  • validation : Use CheckM to validate the bins, then run a recursive decontamination step.

After the last step is completed there should be a set of bins and a table with various descriptors of the bins.
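The ordering constraint above can be sketched as a small shell wrapper that stops at the first failing step. The function name run_steps_in_order and the variables METATOR_CMD, NETWORK_ARGS, PARTITION_ARGS and VALIDATION_ARGS are assumptions made for this illustration and are not part of metaTOR. The real parameters of each action are not listed here, so the per-step argument strings are left for the user to fill in.

```shell
# Hypothetical wrapper (not part of metaTOR): run the three steps in the
# required order and stop as soon as one of them fails.
# METATOR_CMD defaults to "metator"; per-step parameters are read from
# NETWORK_ARGS, PARTITION_ARGS and VALIDATION_ARGS (all assumed names).
run_steps_in_order() {
    for step in network partition validation; do
        case "$step" in
            network)    args=${NETWORK_ARGS:-} ;;
            partition)  args=${PARTITION_ARGS:-} ;;
            validation) args=${VALIDATION_ARGS:-} ;;
        esac
        echo "== $step ==" >&2
        # Word splitting of $METATOR_CMD and $args is intentional, so that
        # each variable can hold several space-separated words.
        ${METATOR_CMD:-metator} "$step" $args || {
            echo "step '$step' failed; stopping" >&2
            return 1
        }
    done
}
```

Setting METATOR_CMD="echo metator" prints the three commands instead of running them, which gives a simple dry run before launching the real jobs.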

There are a number of other, optional, miscellaneous actions:

  • pipeline : Run all three of the above actions sequentially or only some of them depending on the arguments given. This can take a while.

  • version : Display the current version number.

  • help : Display the help message.

References

Contact

Authors

Research lab

Spatial Regulation of Genomes (Institut Pasteur, Paris)
