PyO3 bindings and Python interface to lightmotif, a library for platform-accelerated biological motif scanning using position weight matrices.
🎼🧬 lightmotif
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
🗺️ Overview
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. It can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often visualized as sequence logos.
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:
- Compile-time definition of alphabets and matrix dimensions.
- Sequence symbol encoding for fast table look-ups, as implemented in HMMER[1] or MEME[2]
- Striped sequence matrices to process several positions in parallel, inspired by Michael Farrar[3].
- Vectorized matrix row look-up using permute instructions of AVX2.
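The symbol encoding and striped layout listed above can be illustrated with a minimal pure-Python sketch. It is not lightmotif's implementation: the alphabet order, the padding symbol, the helper names `encode` and `stripe`, and the number of columns are all assumptions chosen for illustration. The idea is that each row of the striped matrix holds positions that are far apart in the sequence, so a single SIMD instruction can score them independently.

```python
# Illustrative sketch only; names and layout details are assumptions,
# not lightmotif's actual internals.
ALPHABET = "ACGTN"  # assumed symbol order, with N used as padding
ENCODE = {symbol: index for index, symbol in enumerate(ALPHABET)}


def encode(sequence):
    """Map each nucleotide to a small integer usable as a table index."""
    return [ENCODE[symbol] for symbol in sequence.upper()]


def stripe(encoded, columns):
    """Lay out an encoded sequence column by column over `columns` lanes.

    Position i is stored at row i % rows, column i // rows, so one row
    holds `columns` positions that can be processed in parallel.
    """
    rows = -(-len(encoded) // columns)  # ceiling division
    pad = ENCODE["N"]
    matrix = [[pad] * columns for _ in range(rows)]
    for i, code in enumerate(encoded):
        matrix[i % rows][i // rows] = code
    return matrix


encoded = encode("ATGTCCCAACAA")
striped = stripe(encoded, columns=4)
for row in striped:
    print(row)
```

With 4 columns, the 12 encoded symbols fill 3 rows; reading down each column recovers a contiguous stretch of the original sequence.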
This is the Python version; a Rust crate is available as well.
🔧 Installing
lightmotif can be installed directly from PyPI, which hosts some pre-built wheels for most mainstream platforms, as well as the code required to compile from source with Rust:
$ pip install lightmotif
In the event you have to compile the package from source, all the required Rust libraries are vendored in the source distribution, and a Rust compiler will be set up automatically if there is none on the host machine.
💡 Example
The motif interface should be mostly compatible with the Bio.motifs module from Biopython. The notable difference is that the calculate method of PSSM objects expects a striped sequence instead.
import lightmotif
# Create a count matrix from an iterable of sequences
motif = lightmotif.create(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
# Create a PSSM with 0.1 pseudocounts and uniform background frequencies
pwm = motif.counts.normalize(0.1)
pssm = pwm.log_odds()
# Encode the target sequence into a striped matrix
seq = "ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG"
striped = lightmotif.stripe(seq)
# Compute scores using the fastest backend implementation for the host machine
scores = pssm.calculate(striped)
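For intuition, the kind of computation performed in the example can be sketched in plain Python without any acceleration. This is a reference sketch under stated assumptions, not the library's code: the helpers `build_pssm` and `scan` are hypothetical, and the sketch assumes log-odds in base 2, a uniform background of 0.25 per nucleotide, and a pseudocount added to every cell before normalization. Scores from lightmotif may differ if its conventions differ.

```python
import math


def build_pssm(sequences, pseudocount=0.1, alphabet="ACGT"):
    """Build per-position log2-odds scores (uniform background assumed)."""
    length = len(sequences[0])
    background = 1.0 / len(alphabet)
    pssm = []
    for position in range(length):
        column = [s[position] for s in sequences]
        counts = {a: column.count(a) + pseudocount for a in alphabet}
        total = sum(counts.values())
        pssm.append(
            {a: math.log2(counts[a] / total / background) for a in alphabet}
        )
    return pssm


def scan(pssm, sequence):
    """Score every window of the sequence, one position at a time."""
    width = len(pssm)
    return [
        sum(pssm[j][sequence[i + j]] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]


pssm = build_pssm(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
seq = "ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG"
scores = scan(pssm, seq)

# Index of the best-scoring window in the target sequence
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores[best])
```

Each window costs one table look-up per motif position, which is the inner loop that the striped, vectorized implementation is designed to parallelize.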
⏱️ Benchmarks
Benchmarks use the MX000001 motif from PRODORIC[4], and the complete genome of an Escherichia coli K12 strain. Benchmarks were run on an i7-10710U CPU running @1.10GHz, compiled with --target-cpu=native.
lightmotif (avx2): 8,065,653 ns/iter (+/- 4,068,613) = 548.8 MiB/s
Bio.motifs: 337,416,172 ns/iter (+/- 24,825,573) = 13.1 MiB/s
MOODS.scan: 179,858,685 ns/iter (+/- 8,296,251) = 24.6 MiB/s
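The relative timings above can be turned into ratios with a few lines of arithmetic. The snippet below only reuses the ns/iter figures reported in this section; the variable names are illustrative.

```python
# Mean time per iteration, in nanoseconds, copied from the results above
timings_ns = {
    "lightmotif (avx2)": 8_065_653,
    "Bio.motifs": 337_416_172,
    "MOODS.scan": 179_858_685,
}

baseline = timings_ns["lightmotif (avx2)"]
ratios = {name: t / baseline for name, t in timings_ns.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the lightmotif (avx2) time")
```

On these numbers, Bio.motifs takes roughly 42 times as long as the AVX2 backend and MOODS.scan roughly 22 times as long. The reported spread (+/-) is large for lightmotif, so the ratios should be read as approximate.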
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
📋 Changelog
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
⚖️ License
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
📚 References
- [1] Eddy, Sean R. ‘Accelerated Profile HMM Searches’. PLOS Computational Biology 7, no. 10 (20 October 2011): e1002195. doi:10.1371/journal.pcbi.1002195.
- [2] Grant, Charles E., Timothy L. Bailey, and William Stafford Noble. ‘FIMO: Scanning for Occurrences of a given Motif’. Bioinformatics 27, no. 7 (1 April 2011): 1017–18. doi:10.1093/bioinformatics/btr064.
- [3] Farrar, Michael. ‘Striped Smith–Waterman Speeds Database Searches Six Times over Other SIMD Implementations’. Bioinformatics 23, no. 2 (15 January 2007): 156–61. doi:10.1093/bioinformatics/btl582.
- [4] Dudek, Christian-Alexander, and Dieter Jahn. ‘PRODORIC: State-of-the-Art Database of Prokaryotic Gene Regulation’. Nucleic Acids Research 50, no. D1 (7 January 2022): D295–302. doi:10.1093/nar/gkab1110.