DNA input and output library for Python and Cython. Includes reader and writer for FASTA and FASTQ files, support for samtools faidx files, and generators for solid and gapped q-grams (k-mers).
Dinopy - DNA input and output for Python
Dinopy's goal is to make files containing biological sequences easily and efficiently accessible to Python programmers, allowing them to focus on their application instead of file I/O.
    import dinopy

    fq_reader = dinopy.FastqReader("reads.fastq")
    for sequence, name, quality in fq_reader.reads(quality_values=True):
        if some_function(quality):
            analyze(sequence)
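For readers unfamiliar with the file layout that the reader hides, the following is a minimal pure-Python sketch of parsing four-line FASTQ records. It is not dinopy's implementation; the names fastq_records and phred33 are hypothetical, and the Phred+33 quality encoding is an assumption about the input file.

```python
import io


def fastq_records(handle):
    """Yield (sequence, name, quality) tuples from a binary file-like object.

    Assumes strict four-line records: '@name', sequence, '+', quality.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file reached
        sequence = handle.readline().rstrip(b"\r\n")
        handle.readline()  # '+' separator line, ignored in this sketch
        quality = handle.readline().rstrip(b"\r\n")
        name = header.rstrip(b"\r\n")[1:]  # drop the leading '@'
        yield sequence, name, quality


def phred33(quality):
    """Decode a quality line into integer scores, assuming Phred+33."""
    return [byte - 33 for byte in quality]


# Usage on an in-memory example instead of a real file.
demo = io.BytesIO(b"@read1\nACGT\n+\nIIII\n")
for sequence, name, quality in fastq_records(demo):
    print(name, sequence, phred33(quality))
```

The sketch does no validation and does not handle multi-line records or gzipped input, which is the kind of detail a dedicated reader takes care of.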
Features
Easy-to-use reader and writer for FASTA, FASTQ, and SAM files.
Specifiable data type and representation for return values (bytes, bytearrays, strings, and integers; see dtype for more information).
Works directly on gzipped files.
Iterators for q-grams of a sequence (also allowing shaped q-grams).
(Reverse) complement.
Chromosome selection from FASTA files.
Implemented in Cython for additional speedup.
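To illustrate what solid q-grams, shaped (gapped) q-grams, and the reverse complement are, here is a small pure-Python sketch. It does not use or reproduce dinopy's API; the function names and the shape notation ('#' keeps a position, '_' skips it) are assumptions made for this example only.

```python
def solid_qgrams(seq, q):
    """Yield every contiguous substring of length q."""
    for start in range(len(seq) - q + 1):
        yield seq[start:start + q]


def shaped_qgrams(seq, shape):
    """Yield gapped q-grams for a shape such as '#_#'.

    Hypothetical notation: '#' keeps the position, any other character skips it.
    """
    keep = [offset for offset, char in enumerate(shape) if char == "#"]
    for start in range(len(seq) - len(shape) + 1):
        yield "".join(seq[start + offset] for offset in keep)


# Translation table for the four DNA bases in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(list(solid_qgrams("ACGTA", 3)))       # ['ACG', 'CGT', 'GTA']
print(list(shaped_qgrams("ACGTA", "#_#")))  # ['AG', 'CT', 'GA']
print(reverse_complement("AACG"))           # CGTT
```

A compiled implementation avoids the per-q-gram string construction shown here, which is one reason a Cython-backed library can be faster on large inputs.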
Getting Started
Installation
Dinopy can be installed with pip:
$ pip install dinopy
or with conda:
$ conda install -c bioconda dinopy
Alternatively, dinopy can be downloaded from Bitbucket and compiled using its setup.py:
Download the source code from Bitbucket.
Install globally:
$ python setup.py install
or only for the current user:
$ python setup.py install --user
Use dinopy:
$ python
>>> import dinopy
Installation requirements
python >= 3.3
numpy >= 1.7
C and C++ compilers, for example from build-essential (Linux) or Xcode (OS X)
Optional: cython >= 0.20
We recommend using anaconda and the bioconda channel.
$ conda config --add channels r
$ conda config --add channels bioconda
$ conda create -n dinoenv dinopy
Platform support
Dinopy has been tested on Ubuntu, Arch Linux and OS X (Yosemite and El Capitan).
We do not officially support Windows. Dinopy will probably work, but there might be problems due to different line-break styles: we assume \n as the line separator, and files that use \r\n are more likely to be encountered on Windows.
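If a file with \r\n line endings does cause trouble, one possible workaround is to rewrite it with \n separators before reading it. The following is a minimal sketch using only the standard library; normalize_line_endings is a hypothetical helper, not part of dinopy.

```python
def normalize_line_endings(src_path, dst_path):
    """Copy src_path to dst_path, ending every line with a single '\\n'.

    Lines are split on '\\n', so files using bare '\\r' as the separator
    (old Mac style) are not handled by this sketch.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            # Strip any trailing '\r' and '\n' bytes, then add one '\n'.
            dst.write(line.rstrip(b"\r\n") + b"\n")
```

Calling normalize_line_endings("windows.fasta", "unix.fasta") writes a copy whose lines all end in \n; both file names are placeholders.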
Features in development
BAM reader / writer
Planned features
GFF3 parser
Bisulfite arrays
Quality trimming for the FASTQ parser
Contact
If you want to report a bug or suggest a new feature, feel free to do so on Bitbucket.
Email:
Henning Timm: henning.timm@tu-dortmund.de
Till Hartmann: till.hartmann@tu-dortmund.de
License
Dinopy is Open Source and licensed under the MIT License.