🔥 Pyrodigal

Python interface to Prodigal, an ORF finder for genomes, progenomes and metagenomes.

🗺️ Overview

📋 Features

The library now features everything needed to run Prodigal in metagenomic mode. It does not yet support single mode, which requires more configuration from the user but offers more flexibility.

Roadmap:

  • Metagenomic mode
  • Thread safety
  • Single mode
  • External training file support (-t flag)
  • Region masking (-m flag)

📋 Memory

Unlike the Prodigal command-line tool, Pyrodigal tries to be conservative about memory usage. Most allocations are lazy, and some functions reallocate their results to exact-sized arrays when possible. As a result, Pyrodigal uses about 30% less memory, at the cost of some extra overhead.
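
To make the trade-off above concrete, here is a minimal pure-Python sketch of the two ideas: allocating a buffer only on first use, and copying the result into an exact-sized array once it is complete. The `LazyBuffer` class and its methods are hypothetical names chosen for illustration; this is not Pyrodigal's internal code.

```python
from array import array


class LazyBuffer:
    """Hypothetical growable buffer: allocated on first use, trimmed on demand."""

    def __init__(self, typecode="d"):
        self._typecode = typecode
        self._data = None   # nothing is allocated until the first append
        self._length = 0    # number of slots actually in use

    @property
    def capacity(self):
        return 0 if self._data is None else len(self._data)

    def append(self, value):
        if self._data is None:
            # lazy allocation: reserve a small block only when it is needed
            self._data = array(self._typecode, [0]) * 8
        elif self._length == len(self._data):
            # buffer is full: double its capacity
            self._data.extend(array(self._typecode, [0]) * len(self._data))
        self._data[self._length] = value
        self._length += 1

    def to_exact(self):
        """Copy the used slots into a new array with no spare capacity."""
        if self._data is None:
            return array(self._typecode)
        return self._data[:self._length]
```

The final copy in `to_exact` is where the extra overhead comes from in this sketch: it costs one more pass over the data, but the spare capacity of the working buffer does not have to be kept alive with the result.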

🧶 Thread-safety

pyrodigal.Pyrodigal instances are not thread-safe: concurrent find_genes calls will overwrite the internal memory used for dynamic programming and could lead to unexpected crashes. One way to process sequences in parallel is to use a consumer/worker pattern, with one Pyrodigal instance in each worker. Using a pool that spawns Pyrodigal instances on the fly is also fine, but prevents recycling internal buffers:

import multiprocessing.pool
from pyrodigal import Pyrodigal
with multiprocessing.pool.ThreadPool() as pool:
    pool.map(lambda s: Pyrodigal(meta=True).find_genes(s), sequences)
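
The other option mentioned above, one instance per worker, could be sketched roughly as follows. This is an illustration under assumptions, not part of the library: `find_genes_parallel` and its `make_finder` parameter are hypothetical names, and the only thing assumed about a finder is that it exposes a `find_genes(sequence)` method.

```python
import queue
import threading


def find_genes_parallel(make_finder, sequences, n_workers=4):
    """Call find_genes on every sequence, with one finder per worker thread."""
    sequences = list(sequences)
    tasks = queue.Queue()
    results = [None] * len(sequences)

    def worker():
        # Each worker builds its own finder once and reuses it for every
        # sequence it handles, so no instance is shared between threads.
        finder = make_finder()
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                break
            index, sequence = item
            results[index] = finder.find_genes(sequence)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for item in enumerate(sequences):
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results
```

With Pyrodigal installed, the factory would presumably be `lambda: Pyrodigal(meta=True)`, so that each worker keeps a single instance for its whole lifetime and the results come back in the same order as the input sequences.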

💡 Example

Using Biopython, load a sequence from a GenBank file, use Prodigal to find all genes it contains, and print the proteins in FASTA format:

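import textwrap
import Bio.SeqIO
import pyrodigal

record = Bio.SeqIO.read("sequence.gbk", "genbank")
p = pyrodigal.Pyrodigal(meta=True)

for i, gene in enumerate(p.find_genes(str(record.seq))):
    # one FASTA entry per predicted gene: translate the gene, not the record
    print(f">{record.id}_{i+1}")
    print(textwrap.fill(gene.translate()))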

📜 License

This library, like the original Prodigal software, is provided under the GNU General Public License v3.0.

Download files

Source Distribution

pyrodigal-0.1.0.tar.gz (632.5 kB)

Uploaded Source

Built Distribution

pyrodigal-0.1.0-cp38-cp38-manylinux2010_x86_64.whl (1.0 MB)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64
