NanoStat

Calculate statistics for Oxford Nanopore sequencing data and alignments

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Calculate various statistics from a long read sequencing dataset in fastq, fasta, bam or albacore sequencing summary format.

INSTALLATION

pip install nanostat
or
conda install -c bioconda nanostat

USAGE

NanoStat [-h] [-v] [-o OUTDIR] [-p PREFIX] [-n NAME] [-t N]
                [--barcoded] [--readtype {1D,2D,1D2}]
                (--fastq file [file ...] | --fasta file [file ...] | --summary file [file ...] | --bam file [file ...])

Calculate statistics of long read sequencing dataset.

General options:
  -h, --help            show the help and exit
  -v, --version         Print version and exit.
  -o, --outdir OUTDIR   Specify directory in which output has to be created.
  -p, --prefix PREFIX   Specify an optional prefix to be used for the output file.
  -n, --name NAME       Specify a filename/path for the output, stdout is the default.
  -t, --threads N       Set the allowed number of threads to be used by the script.

Input options:
  --barcoded            Use if you want to split the summary file by barcode
  --readtype {1D,2D,1D2}
                        Which read type to extract information about from summary. Options are 1D, 2D,
                        1D2

Input data sources, one of these is required:
  --fastq file [file ...]
                        Data is in one or more (compressed) fastq file(s).
  --fasta file [file ...]
                        Data is in one or more (compressed) fasta file(s).
  --summary file [file ...]
                        Data is in one or more (compressed) summary file(s) generated by albacore.
  --bam file [file ...]
                        Data is in one or more sorted bam file(s).

EXAMPLES:
  NanoStat --fastq reads.fastq.gz --outdir statreports
  NanoStat --summary sequencing_summary1.txt sequencing_summary2.txt sequencing_summary3.txt --readtype 1D2
  NanoStat --bam alignment.bam alignment2.bam
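
For scripted use, the command lines above can be assembled from Python. The following is a minimal sketch, not part of NanoStat itself: `build_nanostat_command` and `run_nanostat` are hypothetical helper names, and only flags listed in the usage text (`--fastq`, `--outdir`, `--name`, `--threads`) are used.

```python
import shutil
import subprocess


def build_nanostat_command(fastq_files, outdir=None, name=None, threads=None):
    """Assemble a NanoStat argument list from options shown in the usage text.

    Hypothetical helper for illustration; it only builds the list and
    does not check that the input files exist.
    """
    if not fastq_files:
        raise ValueError("at least one fastq file is required")
    command = ["NanoStat", "--fastq", *fastq_files]
    if outdir is not None:
        command += ["--outdir", outdir]
    if name is not None:
        command += ["--name", name]
    if threads is not None:
        command += ["--threads", str(threads)]
    return command


def run_nanostat(command):
    """Run the command and return its stdout, or None if NanoStat is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Mirrors the first example above; printing only, nothing is executed here.
    cmd = build_nanostat_command(["reads.fastq.gz"], outdir="statreports")
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.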

Example output

General summary:
Number of reads:    3995
Total bases:    11418359
Median read length: 1221.0
Mean read length:   2858.2
Read length N50:    8676
Active channels:    933
Mean read quality:  10.2
Median read quality:    10.6
Top 5 longest reads and their mean basecall quality score
1:  36928 (10.8, [a9dbd2b5-718c-4d0c-afa8-a12a54a5a12a])
2:  32830 (10.2, [b87fc717-1cf8-4526-9f96-3042fda5b769])
3:  30474 (12.4, [ea3e43d8-6cbf-4687-95bd-66e6123512d4])
4:  27531 (12.5, [74c0e08c-eb94-4825-b93b-21d63e05cf14])
5:  26535 (10.4, [8e6ed505-8477-4462-9f0a-3a72783cbf60])
Top 5 highest mean basecall quality scores and their read lengths
1:  14.8 (1040, [acf6f90b-ea22-4960-8049-6e6e694a3f9a])
2:  14.7 (9603, [ec796da1-5c4a-4350-974b-6dabb8deb546])
3:  14.6 (680, [792c485a-81cb-4ef7-8f23-01f10f9c7c23])
4:  14.5 (2664, [d8092ffb-9919-42fb-ad41-34b1658f1bd5])
5:  14.5 (909, [d55d3bf6-0729-4b46-82cd-0cef00bcf849])
Number and percentage of reads above quality cutoffs
>Q5:    3559 (89.1%)
>Q7:    3429 (85.8%)
>Q10:   2705 (67.7%)
>Q12:   1072 (26.8%)
>Q15:   0 (0.0%)
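
To make the fields in the report above concrete, here is a small Python sketch of how such length and quality figures could be computed from per-read values. It is an illustration under stated assumptions, not NanoStat's own code: the function names are hypothetical, N50 is taken as the read length at which reads of that length or longer hold at least half of all bases, and ">Q" is read as strictly greater than the cutoff.

```python
import statistics


def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no read lengths given")


def reads_above_quality(qualities, cutoff):
    """Return (count, percentage) of reads with mean quality strictly above cutoff."""
    count = sum(1 for q in qualities if q > cutoff)
    return count, 100.0 * count / len(qualities)


def summarize(lengths, qualities, cutoffs=(5, 7, 10, 12, 15)):
    """Collect the general summary fields into a dictionary."""
    return {
        "number_of_reads": len(lengths),
        "total_bases": sum(lengths),
        "median_read_length": statistics.median(lengths),
        "mean_read_length": statistics.mean(lengths),
        "read_length_n50": n50(lengths),
        "above_cutoff": {c: reads_above_quality(qualities, c) for c in cutoffs},
    }


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy_lengths = [100, 200, 300, 400, 1000]
    toy_qualities = [6.0, 8.5, 11.0, 12.5, 9.0]
    for key, value in summarize(toy_lengths, toy_qualities).items():
        print(f"{key}: {value}")
```

With the toy data, the longest read alone holds half of the 2000 bases, so the N50 is 1000 even though the median length is 300. This is why N50 and median can differ widely in a real report.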

I welcome all suggestions, bug reports, feature requests and contributions. Please leave an issue or open a pull request. I will usually respond within a day, or rarely within a few days.

Release history

1.6.0    Nov 29, 2021
1.5.0    Nov 23, 2020
1.4.0    Aug 19, 2020
1.2.1    Jun 17, 2020
1.2.0    Jan 17, 2020
1.1.2    Jun 29, 2018
1.1.1    Jun 27, 2018
1.1.0    Mar 5, 2018 (this version)
1.0.0    Feb 15, 2018
0.9.1    Feb 14, 2018
0.9.0    Feb 6, 2018
0.8.1    Dec 22, 2017
0.8.0    Nov 24, 2017
0.7.1    Oct 28, 2017
0.6.1    Oct 20, 2017
0.5.1    Oct 15, 2017
0.5.0    Oct 15, 2017
0.4.0    Oct 9, 2017
0.3.1    Sep 22, 2017
0.3.0    Sep 21, 2017
0.2.1    Sep 21, 2017
0.2.0    Aug 18, 2017
0.1.5    Jul 28, 2017
0.1.4    Jul 28, 2017
0.1.3    Jul 27, 2017
0.1.2    Jun 26, 2017
0.1.1    Jun 23, 2017
0.1.0    Jun 23, 2017

Download files

Source Distribution

NanoStat-1.1.0.tar.gz (5.4 kB), uploaded Mar 5, 2018

Hashes for NanoStat-1.1.0.tar.gz

Algorithm	Hash digest
SHA256	`db1df1365c515a4dc8268dad65e655266a4e556833d598e9666a9bc55159e401`
MD5	`b8f1677168c9f2c7a901a9803ea7ada4`
BLAKE2b-256	`70da341a4168bbdb6e8e56e4ecd30eeda5c509cee4dc6de0279e95812393aa12`
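
A downloaded archive can be checked against the SHA256 digest listed above. The following Python sketch uses only the standard library; the helper names and the local file path are assumptions for illustration.

```python
import hashlib

# SHA256 digest for NanoStat-1.1.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "db1df1365c515a4dc8268dad65e655266a4e556833d598e9666a9bc55159e401"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    import sys

    # Usage (path is whatever the archive was saved as):
    #   python verify_hash.py NanoStat-1.1.0.tar.gz
    if len(sys.argv) == 2:
        print("OK" if matches_published_hash(sys.argv[1]) else "MISMATCH")
```

Reading in chunks keeps memory use constant regardless of file size, though for a 5.4 kB archive a single read would work equally well.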