A comprehensive library for computational molecular biology

Project description

Biotite project

Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:

  • Searching and fetching data from biological databases

  • Reading and writing popular sequence/structure file formats

  • Analyzing and editing sequence/structure data

  • Visualizing sequence/structure data

  • Interfacing external applications for further analysis

Biotite internally stores most of the data as NumPy ndarray objects, enabling

  • fast C-accelerated analysis,

  • intuitive usability through NumPy-like indexing syntax,

  • extensibility through direct access to the internal NumPy arrays.

As a result, the user can skip writing code for basic functionality (like file parsers) and can focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.

If you use Biotite in a scientific publication, please cite:

Kunzmann, P. & Hamacher, K. BMC Bioinformatics (2018) 19:346.

Installation

Biotite requires the following packages:

  • numpy

  • requests

  • msgpack

  • networkx

Some functions require additional packages:

  • mdtraj - Required for trajectory file I/O operations.

  • matplotlib - Required for plotting purposes.

Biotite can be installed via Conda

$ conda install -c conda-forge biotite

… or pip

$ pip install biotite
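
To check which of the packages listed above are present in the current environment, something along these lines may help. It is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_versions` is hypothetical and not part of Biotite.

```python
from importlib import metadata

# Package names as listed in the installation section above
REQUIRED = ["numpy", "requests", "msgpack", "networkx"]
OPTIONAL = ["mdtraj", "matplotlib"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        for name, version in installed_versions(names).items():
            status = version if version is not None else "not installed"
            print(f"{label:8s} {name:12s} {status}")
```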

Usage

Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:

import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez

# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)

# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()

# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])

Output:

MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA

TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT

DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
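
To give a feeling for what `gap_penalty=(-10, -1)` and `terminal_penalty=False` mean in the call above, the following pure-Python sketch scores an already gapped pair of rows. It is not Biotite's implementation. It assumes the common affine convention that the first column of a gap costs the opening penalty and every further column the extension penalty, and it swaps the BLOSUM62 matrix for a simple match/mismatch score. The function `affine_gap_score` and its parameters are hypothetical.

```python
def affine_gap_score(row_a, row_b, match=1, mismatch=-1,
                     gap_open=-10, gap_extend=-1, terminal_penalty=True):
    """Score two equally long gapped alignment rows ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("Alignment rows must have equal length")
    columns = list(zip(row_a, row_b))

    if not terminal_penalty:
        # Ignore gap columns at the very start and end of the alignment
        start = 0
        while start < len(columns) and "-" in columns[start]:
            start += 1
        end = len(columns)
        while end > start and "-" in columns[end - 1]:
            end -= 1
        columns = columns[start:end]

    score = 0
    gap_row = None  # index of the row that currently carries a gap, if any
    for a, b in columns:
        if a == "-" or b == "-":
            current = 0 if a == "-" else 1
            # Continuing a gap in the same row is cheaper than opening one
            score += gap_extend if current == gap_row else gap_open
            gap_row = current
        else:
            score += match if a == b else mismatch
            gap_row = None
    return score


# Four matches (+4) and one internal gap of length two (-10 - 1)
print(affine_gap_score("ACG--T", "ACGTTT"))
# The same gap at the start is free when terminal gaps are not penalized
print(affine_gap_score("--ACGT", "TTACGT", terminal_penalty=False))
```

Under this scheme one long gap is much cheaper than several short ones. Leaving terminal gaps unpenalized allows overhanging ends, such as the leading and trailing gap stretches in the alignment shown above, without lowering the score.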

More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.

Contribution

Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • biotite-0.39.0.tar.gz (33.2 MB): source

Built Distributions

  • biotite-0.39.0-cp312-cp312-win_amd64.whl (36.5 MB): CPython 3.12, Windows x86-64

  • biotite-0.39.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (53.9 MB): CPython 3.12, manylinux (glibc 2.17+) x86-64

  • biotite-0.39.0-cp312-cp312-macosx_11_0_arm64.whl (36.8 MB): CPython 3.12, macOS 11.0+ ARM64

  • biotite-0.39.0-cp312-cp312-macosx_10_9_x86_64.whl (37.2 MB): CPython 3.12, macOS 10.9+ x86-64

  • biotite-0.39.0-cp311-cp311-win_amd64.whl (36.5 MB): CPython 3.11, Windows x86-64

  • biotite-0.39.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (54.3 MB): CPython 3.11, manylinux (glibc 2.17+) x86-64

  • biotite-0.39.0-cp311-cp311-macosx_11_0_arm64.whl (36.9 MB): CPython 3.11, macOS 11.0+ ARM64

  • biotite-0.39.0-cp311-cp311-macosx_10_9_x86_64.whl (37.3 MB): CPython 3.11, macOS 10.9+ x86-64

  • biotite-0.39.0-cp310-cp310-win_amd64.whl (36.5 MB): CPython 3.10, Windows x86-64

  • biotite-0.39.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (52.8 MB): CPython 3.10, manylinux (glibc 2.17+) x86-64

  • biotite-0.39.0-cp310-cp310-macosx_11_0_arm64.whl (36.9 MB): CPython 3.10, macOS 11.0+ ARM64

  • biotite-0.39.0-cp310-cp310-macosx_10_9_x86_64.whl (37.3 MB): CPython 3.10, macOS 10.9+ x86-64
