FASTQ-to-analysis-ready-CRAM Workflow Executor for Human Genome Sequencing

ftarc

  • Input:
    • read1/read2 FASTQ files from Illumina DNA sequencers
  • Workflow:
    • Trim adapters
    • Map reads to a human reference genome
    • Mark duplicates
    • Apply BQSR (Base Quality Score Recalibration)
  • Output:
    • analysis-ready CRAM files

Installation

$ pip install -U https://github.com/dceoy/ftarc/archive/main.tar.gz

Required commands:

  • pigz
  • pbzip2
  • bgzip
  • tabix
  • samtools
  • java
  • gatk
  • cutadapt
  • fastqc
  • trim_galore
  • bwa or bwa-mem2
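
Before running the workflow, it can help to confirm that the commands listed above are installed. The following is a minimal sketch in POSIX shell; `missing_commands` and `check_ftarc_deps` are illustrative names and are not part of ftarc.

```shell
#!/bin/sh
# Sketch: report which of the commands listed above are missing from PATH.
# missing_commands and check_ftarc_deps are hypothetical helpers, not ftarc APIs.

# Print every argument that does not resolve to a command.
missing_commands() {
  for mc_cmd in "$@"; do
    command -v "$mc_cmd" >/dev/null 2>&1 || printf '%s\n' "$mc_cmd"
  done
}

# Return 0 only if every required command is available.
check_ftarc_deps() {
  cfd_status=0
  for cfd_cmd in $(missing_commands pigz pbzip2 bgzip tabix samtools java gatk \
                     cutadapt fastqc trim_galore); do
    echo "missing: $cfd_cmd" >&2
    cfd_status=1
  done
  # Either aligner satisfies the last requirement.
  if [ -n "$(missing_commands bwa)" ] && [ -n "$(missing_commands bwa-mem2)" ]; then
    echo "missing: bwa or bwa-mem2" >&2
    cfd_status=1
  fi
  return "$cfd_status"
}

check_ftarc_deps || echo "Install the missing commands before running ftarc." >&2
```

The check only tests that each name resolves on PATH; it does not verify versions.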

Docker image

Pull the image from Docker Hub.

$ docker image pull dceoy/ftarc

Usage

  1. Download hg38 resource data.

    $ ftarc download --dest-dir=/path/to/download/dir
    
  2. Write input file paths and configurations into ftarc.yml.

    $ ftarc init
    $ vi ftarc.yml  # => edit
    

    Example of ftarc.yml:

    ---
    reference_name: hs38DH
    adapter_removal: true
    metrics_collectors:
      fastqc: true
      picard: true
      samtools: true
    resources:  # These files can be downloaded with `ftarc download`.
      ref_fa: /path/to/GRCh38_full_analysis_set_plus_decoy_hla.fa
      known_sites_vcf:
        - /path/to/Homo_sapiens_assembly38.dbsnp138.vcf.gz
        - /path/to/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
        - /path/to/Homo_sapiens_assembly38.known_indels.vcf.gz
    runs:
      - fq:
          - /path/to/sample01.WGS.R1.fq.gz
          - /path/to/sample01.WGS.R2.fq.gz
      - fq:
          - /path/to/sample02.WGS.R1.fq.gz
          - /path/to/sample02.WGS.R2.fq.gz
      - fq:
          - /path/to/sample03.WGS.R1.fq.gz
          - /path/to/sample03.WGS.R2.fq.gz
        read_group:
          ID: FLOWCELL-1
          PU: UNIT-1
          SM: sample03
          PL: ILLUMINA
          LB: LIBRARY-1
    
  3. Create analysis-ready CRAM files from FASTQ files.

    $ ftarc run --yml=ftarc.yml --workers=2
    

Run ftarc --help for more information.
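
When there are many samples, the `runs` section of ftarc.yml can be generated rather than typed by hand. The sketch below assumes the FASTQ pairs follow the `*.R1.fq.gz` / `*.R2.fq.gz` naming used in the example paths above; `print_runs` is a hypothetical helper, not an ftarc command, and it does not emit the optional `read_group` block.

```shell
#!/bin/sh
# Sketch: print a `runs:` block for every R1/R2 FASTQ pair in a directory.
# Assumes files are named <sample>.R1.fq.gz and <sample>.R2.fq.gz.
print_runs() {
  pr_dir="$1"
  echo "runs:"
  for pr_r1 in "$pr_dir"/*.R1.fq.gz; do
    [ -e "$pr_r1" ] || continue               # the glob matched nothing
    pr_r2="${pr_r1%.R1.fq.gz}.R2.fq.gz"       # derive the mate file name
    if [ ! -e "$pr_r2" ]; then
      echo "skipped (no R2 mate): $pr_r1" >&2
      continue
    fi
    # Indentation matches the ftarc.yml example.
    printf '  - fq:\n      - %s\n      - %s\n' "$pr_r1" "$pr_r2"
  done
}
```

Calling `print_runs /path/to/fastq/dir` writes the block to standard output, which can then replace the `runs` section in ftarc.yml.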
