VTAM - Validation and Taxonomic Assignation of Metabarcoding Data is a metabarcoding pipeline. The analysis starts from high throughput sequencing (HTS) data of amplicons of one or several metabarcoding markers and produces an amplicon sequence variant (ASV) table of validated variants assigned to taxonomic groups.

Project description

VTAM is a metabarcoding package whose commands process high-throughput sequencing (HTS) data of amplicons from one or several metabarcoding markers, supplied in FASTQ format, and produce a table of amplicon sequence variants (ASVs) assigned to taxonomic groups. If you use VTAM in scientific work, please cite the following article:

González, A., Dubut, V., Corse, E., Mekdad, R., Dechatre, T. and Meglécz, E. VTAM: A robust pipeline for processing metabarcoding data using internal controls. bioRxiv: 10.1101/2020.11.06.371187v1.

Commands for a quick installation:

conda create --name vtam python=3.9 -y
conda activate vtam

Then install the dependencies and VTAM itself:

python3 -m pip install cutadapt
conda install -c bioconda blast -y
conda install -c bioconda vsearch -y
python3 -m pip install vtam
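
After installation, it can be useful to confirm that the external tools are reachable from the activated environment. The following is a minimal sketch, not part of VTAM: the executable names (`cutadapt`, `blastn`, `vsearch`, `vtam`) are assumptions based on the packages installed above, and `find_tools` and `missing_tools` are hypothetical helper names.

```python
import shutil

# Executable names are assumptions based on the packages installed above;
# adjust them if your installation exposes different command names.
REQUIRED_TOOLS = ("cutadapt", "blastn", "vsearch", "vtam")


def find_tools(names=REQUIRED_TOOLS):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the names of the executables that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Run it inside the activated `vtam` environment; any tool reported as not found probably needs to be reinstalled with the commands above.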

Commands for a quick working example:

vtam example
cd example
snakemake --printshellcmds --resources db=1 --snakefile snakefile.yml --cores 4 --configfile asper1/user_input/snakeconfig_mfzr.yml --until asvtable_taxa

The resulting table of amplicon sequence variants (ASVs) is written to asper1/run1_mfzr/asvtable_default_taxa.tsv:

(vtam) user@host:~/vtam/example$ head -n4 asper1/run1_mfzr/asvtable_default_taxa.tsv
run marker  variant sequence_length read_count      tpos1_run1      tnegtag_run1    14ben01 14ben02 clusterid       clustersize     chimera_borderline      ltg_tax_id      ltg_tax_name    ltg_rank        identity        blast_db        phylum  class   order   family  genus   species sequence
run1        MFZR    25      181     478     478     0       0       0       25      1       False   131567  cellular organisms      no rank 80      coi_blast_db_20200420                                                   ACTATACCTTATCTTCGCAGTATTCTCAGGAATGCTAGGAACTGCTTTTAGTGTTCTTATTCGAATGGAACTAACATCTCCAGGTGTACAATACCTACAGGGAAACCACCAACTTTACAATGTAATCATTACAGCTCACGCATTCCTAATGATCTTTTTCATGGTTATGCCAGGACTTGTT
run1        MFZR    51      181     165     0       0       0       165     51      1       False                                   coi_blast_db_20200420           ACTATATTTAATTTTTGCTGCAATTTCTGGTGTAGCAGGAACTACGCTTTCATTGTTTATTAGAGCTACATTAGCGACACCAAATTCTGGTGTTTTAGATTATAATTACCATTTGTATAATGTTATAGTTACGGGTCATGCTTTTTTGATGATCTTTTTTTTAGTAATGCCTGCTTTATTG
run1        MFZR    88      175     640     640     0       0       0       88      1       False   1592914 Caenis pusilla  species 100     coi_blast_db_20200420   Arthropoda      Insecta Ephemeroptera   Caenidae        Caenis  Caenis pusilla  ACTATATTTTATTTTTGGGGCTTGATCCGGAATGCTGGGCACCTCTCTAAGCCTTCTAATTCGTGCCGAGCTGGGGCACCCGGGTTCTTTAATTGGCGACGATCAAATTTACAATGTAATCGTCACAGCCCATGCTTTTATTATGATTTTTTTCATGGTTATGCCTATTATAATC
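
The ASV table can also be inspected programmatically. The sketch below assumes the file is tab-separated (as the .tsv extension suggests) and that the column names match the header shown above; `load_asv_table` and `species_level` are hypothetical helper names, not VTAM functions.

```python
import csv
import os


def load_asv_table(path):
    """Read a tab-separated ASV table into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def species_level(rows):
    """Keep only the variants whose ltg_rank column equals 'species'."""
    return [row for row in rows if row.get("ltg_rank") == "species"]


if __name__ == "__main__":
    # Path taken from the quick working example above.
    path = "asper1/run1_mfzr/asvtable_default_taxa.tsv"
    if os.path.exists(path):
        for row in species_level(load_asv_table(path)):
            print(row["variant"], row["ltg_tax_name"], row["read_count"])
```

All values are read as strings, so convert columns such as `read_count` with `int()` before doing arithmetic on them.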

Intermediate data are stored in an SQLite database, asper1/db.sqlite:

(vtam) user@host:~/vtam/example$ sqlite3 asper1/db.sqlite '.tables'
FilterChimera                    Sample
FilterChimeraBorderline          SampleInformation
FilterCodonStop                  SortedReadFile
FilterIndel                      TaxAssign
FilterLFN                        Variant
FilterLFNreference               VariantReadCount
FilterMinReplicateNumber         wom_Execution
FilterMinReplicateNumber2        wom_FileInputOutputInformation
FilterMinReplicateNumber3        wom_Option
FilterPCRerror                   wom_TableInputOutputInformation
FilterRenkonen                   wom_TableModificationTime
Marker                           wom_ToolWrapper
ReadCountAverageOverReplicates   wom_TypeInputOrOutput
Run
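
The same database can be explored from Python with the standard-library `sqlite3` module. This is a minimal sketch that lists every table with its row count; it makes no assumption about the table schemas, and `table_row_counts` is a hypothetical helper name rather than part of VTAM.

```python
import os
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in an SQLite database."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master itself, so quoting them is enough.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__":
    # Path taken from the quick working example above.
    db_path = "asper1/db.sqlite"
    if os.path.exists(db_path):  # sqlite3.connect would otherwise create an empty file
        for name, count in table_row_counts(db_path).items():
            print(f"{name}\t{count}")
```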

The VTAM documentation is hosted on Read the Docs (https://vtam.readthedocs.io).

VTAM is maintained by Aitor González (aitor dot gonzalez at univ-amu dot fr) and Emese Meglécz (emese dot meglecz at univ-amu dot fr).

Download files

Source Distribution

vtam-0.2.0.tar.gz (87.8 kB)