Map genes and genomes to the Global Microbial Gene Catalog (GMGC)
GMGC-mapper
Command line tool to query the Global Microbial Gene Catalog (GMGC).
Install
GMGC-mapper runs on Python 3.6-3.8 and requires prodigal to be available for genome mode.
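The two requirements above can be checked before installing. The following is a minimal sketch, not part of GMGC-mapper itself: the helper names `python_supported` and `prodigal_available` are hypothetical, and it assumes the prodigal executable is named `prodigal` and is found through `PATH`.

```python
import shutil
import sys


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter is within the documented 3.6-3.8 range."""
    return (3, 6) <= tuple(version_info[:2]) <= (3, 8)


def prodigal_available(which=shutil.which):
    """Return True if an executable named 'prodigal' is found on PATH."""
    # Assumption: the binary is called 'prodigal'; adjust if your install differs.
    return which("prodigal") is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("prodigal found on PATH (needed for genome mode):", prodigal_available())
```

If prodigal is missing, the gene modes (`--aa-genes`, `--nt-genes`) should still be usable, since the README only requires prodigal for genome mode.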
Install from source
python setup.py install
Examples
- Input is a genome sequence.
gmgc-mapper -i input.fasta -o output
- Input is DNA/protein gene sequences.
gmgc-mapper --nt-genes genes.fna --aa-genes genes.faa -o output
The nucleotide input is optional (but should be used if available so that the quality of the hits can be refined):
gmgc-mapper --aa-genes genes.faa -o output
If your input is a metagenome, you can use NGLess for assembly and gene prediction. For more details, read the docs.
Output
The output folder will contain:
- Outputs of gene prediction (prodigal).
- Complete data table, listing all the hits in GMGC, per gene.
- Complete table, listing all the genome bins (MAGs) that are found in the results.
- Human readable summary.
For more details, read the docs. A description of the outputs is also written to the output folder for convenience.
Parameters
- -i/--input: path to the input genome file (.fasta/.gz/.bz2).
- -o/--output: output directory (will be created if non-existent).
- --nt-genes: path to the input DNA gene file (.fasta/.gz/.bz2).
- --aa-genes: path to the input protein gene file (.fasta/.gz/.bz2).
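When calling the tool from a script, the parameters above can be assembled into an argument list. This is a rough sketch only: `build_command` is a hypothetical helper, not part of GMGC-mapper, and it assumes that genome mode (`-i`) and gene mode (`--aa-genes`, optionally with `--nt-genes`) are not combined in one call, which the examples suggest but the README does not state explicitly.

```python
def build_command(output, genome=None, aa_genes=None, nt_genes=None):
    """Build a gmgc-mapper argument list using only the documented flags."""
    if genome is not None:
        # Genome mode: assumed here to be exclusive with the gene inputs.
        if aa_genes is not None or nt_genes is not None:
            raise ValueError("pass either a genome or gene files, not both")
        return ["gmgc-mapper", "-i", genome, "-o", output]

    # Gene mode: the examples always pass protein sequences;
    # the nucleotide file is documented as optional.
    if aa_genes is None:
        raise ValueError("gene mode needs aa_genes (or pass a genome instead)")
    cmd = ["gmgc-mapper"]
    if nt_genes is not None:
        cmd += ["--nt-genes", nt_genes]
    cmd += ["--aa-genes", aa_genes, "-o", output]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("output", genome="input.fasta")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual file names.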