PyO3 bindings and Python interface to lightmotif, a library for platform-accelerated biological motif scanning using position weight matrices.
🎼🧬 lightmotif
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
🗺️ Overview
Motif scanning with position weight matrices (also known as position-specific scoring matrices) is a robust method for identifying motifs of fixed length inside a biological sequence. These matrices can be used to identify transcription factor binding sites in DNA, or protease cleavage sites in polypeptides. Position weight matrices are often visualized as sequence logos.
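The scanning idea described above can be sketched in plain Python: every window of the sequence is scored by summing one matrix entry per position. This is a minimal illustration, not the library's implementation; the `PSSM` values and the `scan` helper are made up for the example.

```python
# Toy log-odds matrix for a motif of length 3 over the DNA alphabet.
# The values are invented for illustration, not taken from a real motif.
PSSM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -1.1, "C": 1.5, "G": -0.9, "T": -0.3},
    {"A": -0.7, "C": -1.2, "G": 1.4, "T": -0.6},
]


def scan(sequence, pssm):
    """Score every window of len(pssm) symbols in `sequence` (hypothetical helper)."""
    width = len(pssm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # One table look-up per motif position, summed over the window.
        scores.append(sum(row[symbol] for row, symbol in zip(pssm, window)))
    return scores


scores = scan("TTACGAA", PSSM)
# Index of the highest-scoring window, i.e. the best candidate motif start.
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores[best])
```

A sequence of length n scanned with a motif of length m yields n - m + 1 scores, one per possible start position.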
The lightmotif library provides a Python module to run very efficient searches for a motif encoded in a position weight matrix. The position scanning combines several techniques to allow high-throughput processing of sequences:
- Compile-time definition of alphabets and matrix dimensions.
- Sequence symbol encoding for fast table look-ups, as implemented in HMMER[1] or MEME[2].
- Striped sequence matrices to process several positions in parallel, inspired by Michael Farrar[3].
- Vectorized matrix row look-up using permute instructions of AVX2.
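To make the encoding and striping steps listed above more concrete, here is a rough pure-Python sketch of the data layout only. It is not the library's code: the symbol order in `ALPHABET`, the column count of 4, and the helper names `encode` and `stripe` are all assumptions chosen for illustration, and the real number of columns depends on the vector width of the backend.

```python
import math

# Assumed symbol order for this sketch; "N" doubles as the padding symbol.
ALPHABET = "ACTGN"


def encode(sequence):
    """Map each symbol to a small integer usable as a table index."""
    index = {symbol: i for i, symbol in enumerate(ALPHABET)}
    return [index[symbol] for symbol in sequence]


def stripe(encoded, columns, pad=ALPHABET.index("N")):
    """Lay the sequence out column by column.

    Position p is stored at row p % rows, column p // rows, so each row
    holds `columns` positions spaced `rows` apart, and consecutive
    positions sit in consecutive rows of the same column.
    """
    rows = math.ceil(len(encoded) / columns)
    matrix = [[pad] * columns for _ in range(rows)]
    for position, code in enumerate(encoded):
        matrix[position % rows][position // rows] = code
    return matrix


encoded = encode("ATGTCCCAAC")
striped = stripe(encoded, columns=4)
for row in striped:
    print(row)
```

With this layout, one row can be loaded into a vector register and every column treated as an independent lane, which is the property the striped approach relies on to process several positions in parallel.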
🔧 Installing
lightmotif can be installed directly from PyPI, which hosts some pre-built wheels for most mainstream platforms, as well as the code required to compile from source with Rust:
$ pip install lightmotif
In the event you have to compile the package from source, all the required Rust libraries are vendored in the source distribution, and a Rust compiler will be set up automatically if there is none on the host machine.
💡 Example
The motif interface should be mostly compatible with the Bio.motifs module from Biopython. The notable difference is that the calculate method of PSSM objects expects a striped sequence instead.
import lightmotif
# Create a count matrix from an iterable of sequences
motif = lightmotif.create(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
# Create a PSSM with 0.1 pseudocounts and uniform background frequencies
pwm = motif.counts.normalize(0.1)
pssm = pwm.log_odds()
# Encode the target sequence into a striped matrix
seq = "ATGTCCCAACAACGATACCCCGAGCCCATCGCCGTCATCGGCTCGGCATGCAGATTCCCAGGCG"
encoded = lightmotif.EncodedSequence(seq)
striped = encoded.stripe()
# Compute scores using the fastest backend implementation for the host machine
scores = pssm.calculate(striped)
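For readers who want to see what the count, normalization and log-odds steps of the example compute, the following is a plain-Python sketch that does not use lightmotif at all. It assumes the conventions commonly used for these matrices (the pseudocount is added to every cell, the background is uniform, and the logarithm is base 2); whether lightmotif matches these exactly is an assumption, and the function names are hypothetical.

```python
import math


def count_matrix(sequences, alphabet="ACGT"):
    """Count symbol occurrences at each motif position."""
    width = len(sequences[0])
    counts = [{symbol: 0 for symbol in alphabet} for _ in range(width)]
    for sequence in sequences:
        for column, symbol in zip(counts, sequence):
            column[symbol] += 1
    return counts


def normalize(counts, pseudocount):
    """Turn counts into per-position frequencies, adding a pseudocount to every cell."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        pwm.append({s: (n + pseudocount) / total for s, n in column.items()})
    return pwm


def log_odds(pwm):
    """Convert frequencies to log2 odds against a uniform background (assumed)."""
    pssm = []
    for column in pwm:
        background = 1.0 / len(column)
        pssm.append({s: math.log2(p / background) for s, p in column.items()})
    return pssm


counts = count_matrix(["GTTGACCTTATCAAC", "GTTGATCCAGTCAAC"])
pwm = normalize(counts, 0.1)
pssm = log_odds(pwm)
print(pssm[0])
```

Symbols seen in every input sequence at a position get a positive score, while unseen symbols get a negative one whose magnitude is bounded by the pseudocount.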
⏱️ Benchmarks
Benchmarks use the MX000001 motif from PRODORIC[4], and the complete genome of an Escherichia coli K12 strain.
Benchmarks were run on an i7-10710U CPU running @1.10GHz, compiled with --target-cpu=native.
lightmotif (avx2): 26,528,740 ns/iter (+/- 14,817,953) = 166.9 MiB/s
lightmotif (generic): 654,599,309 ns/iter (+/- 81,292,868) = 6.8 MiB/s
Bio.motifs: 526,309,061 ns/iter (+/- 45,603,991) = 8.4 MiB/s
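The relative speed of the three implementations can be derived from the figures above with a few lines of arithmetic. The numbers in this sketch are copied from the benchmark lines; the variable names are illustrative, and the input size it derives is only what the reported timings and throughputs imply, not a value stated by the benchmark.

```python
# Mean time per iteration in nanoseconds, copied from the benchmark lines.
timings_ns = {
    "lightmotif (avx2)": 26_528_740,
    "lightmotif (generic)": 654_599_309,
    "Bio.motifs": 526_309_061,
}

# Reported throughput in MiB/s, copied from the same lines.
throughput_mib_s = {
    "lightmotif (avx2)": 166.9,
    "lightmotif (generic)": 6.8,
    "Bio.motifs": 8.4,
}

# Speed-up of each implementation relative to Bio.motifs.
baseline = timings_ns["Bio.motifs"]
speedups = {name: baseline / ns for name, ns in timings_ns.items()}

# Input size implied by throughput x time; the three values should roughly agree.
implied_size_mib = {
    name: throughput_mib_s[name] * timings_ns[name] / 1e9 for name in timings_ns
}

for name in timings_ns:
    print(f"{name}: {speedups[name]:.2f}x vs Bio.motifs, ~{implied_size_mib[name]:.2f} MiB scanned")
```

On these figures the AVX2 backend is close to 20 times faster than Bio.motifs, while the generic backend is somewhat slower than it.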
💭 Feedback
⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.
📋 Changelog
This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.
⚖️ License
This library is provided under the open-source MIT license.
This project was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
📚 References
- [1] Eddy, Sean R. ‘Accelerated Profile HMM Searches’. PLOS Computational Biology 7, no. 10 (20 October 2011): e1002195. doi:10.1371/journal.pcbi.1002195.
- [2] Grant, Charles E., Timothy L. Bailey, and William Stafford Noble. ‘FIMO: Scanning for Occurrences of a given Motif’. Bioinformatics 27, no. 7 (1 April 2011): 1017–18. doi:10.1093/bioinformatics/btr064.
- [3] Farrar, Michael. ‘Striped Smith–Waterman Speeds Database Searches Six Times over Other SIMD Implementations’. Bioinformatics 23, no. 2 (15 January 2007): 156–61. doi:10.1093/bioinformatics/btl582.
- [4] Dudek, Christian-Alexander, and Dieter Jahn. ‘PRODORIC: State-of-the-Art Database of Prokaryotic Gene Regulation’. Nucleic Acids Research 50, no. D1 (7 January 2022): D295–302. doi:10.1093/nar/gkab1110.