Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
Rustfst
This repo contains a Rust implementation of weighted finite-state transducers, along with a Python binding.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
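To make the definitions above concrete, here is a minimal std-only sketch of a transducer as a data structure. The names `Transition`, `ToyFst`, `add_transition` and `is_acceptor` are hypothetical and chosen for illustration; this is not rustfst's API, only a toy model of "each transition has an input label, an output label, and a weight" and of an acceptor as the special case where both labels agree.

```rust
/// One transition: input label, output label, weight (cost), destination state.
/// Hypothetical type for illustration, not part of rustfst.
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,
    olabel: u32,
    weight: f32,
    next: usize,
}

/// A toy transducer that only stores outgoing transitions per state.
#[derive(Debug, Default)]
struct ToyFst {
    transitions: Vec<Vec<Transition>>,
}

impl ToyFst {
    /// Adds a state and returns its identifier.
    fn add_state(&mut self) -> usize {
        self.transitions.push(Vec::new());
        self.transitions.len() - 1
    }

    /// Adds a transition leaving `from`.
    fn add_transition(&mut self, from: usize, ilabel: u32, olabel: u32, weight: f32, next: usize) {
        self.transitions[from].push(Transition { ilabel, olabel, weight, next });
    }

    /// An acceptor is a transducer in which every transition has equal
    /// input and output labels.
    fn is_acceptor(&self) -> bool {
        self.transitions
            .iter()
            .flatten()
            .all(|t| t.ilabel == t.olabel)
    }
}
```

With two states and a single transition labelled `3:3`, `is_acceptor()` returns `true`; adding a transition labelled `3:5` turns the machine into a proper transducer and the check returns `false`.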
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
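As an illustration of cascading models by composition, the following std-only sketch composes two toy transducers. It rests on several assumptions that are not taken from rustfst: the types `Transition` and `ToyFst` and the function `compose` are hypothetical, weights are combined by addition (as in the tropical semiring), and neither machine contains epsilon transitions, so no composition filter is needed. It is a simplified picture of the idea, not the library's algorithm.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical transition type for illustration, not part of rustfst.
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,
    olabel: u32,
    weight: f32,
    next: usize,
}

/// Toy transducer: start state, transitions per state, optional final weight per state.
#[derive(Debug, Clone, Default)]
struct ToyFst {
    start: usize,
    transitions: Vec<Vec<Transition>>,
    finals: Vec<Option<f32>>,
}

/// Epsilon-free composition: a state of the result is a pair (state of `a`, state of `b`).
/// A transition exists when an output label of `a` matches an input label of `b`;
/// weights are added, which is the "times" operation of the tropical semiring.
fn compose(a: &ToyFst, b: &ToyFst) -> ToyFst {
    let mut out = ToyFst::default();
    let mut ids: HashMap<(usize, usize), usize> = HashMap::new();
    let mut queue = VecDeque::new();

    // The pair of start states becomes state 0 of the result.
    ids.insert((a.start, b.start), 0);
    out.transitions.push(Vec::new());
    out.finals.push(None);
    queue.push_back((a.start, b.start));

    while let Some((sa, sb)) = queue.pop_front() {
        let src = ids[&(sa, sb)];
        // A pair is final only when both components are final.
        if let (Some(fa), Some(fb)) = (a.finals[sa], b.finals[sb]) {
            out.finals[src] = Some(fa + fb);
        }
        for ta in &a.transitions[sa] {
            for tb in &b.transitions[sb] {
                if ta.olabel != tb.ilabel {
                    continue;
                }
                let key = (ta.next, tb.next);
                let existing = ids.get(&key).copied();
                let dst = match existing {
                    Some(id) => id,
                    None => {
                        // First visit of this pair: allocate a new state.
                        let id = out.transitions.len();
                        ids.insert(key, id);
                        out.transitions.push(Vec::new());
                        out.finals.push(None);
                        queue.push_back(key);
                        id
                    }
                };
                out.transitions[src].push(Transition {
                    ilabel: ta.ilabel,
                    olabel: tb.olabel,
                    weight: ta.weight + tb.weight,
                    next: dst,
                });
            }
        }
    }
    out
}
```

Composing a machine that maps label 1 to 2 with cost 1.0 and a machine that maps label 2 to 3 with cost 2.0 yields a machine that maps 1 to 3 with cost 3.0. Only pairs reachable from the start pair are created, so the result contains no unreachable states.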
References
The implementation is heavily inspired by the work of Mehryar Mohri, Cyril Allauzen and Michael Riley:
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::{DeterminizeType, determinize};
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Create an empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iterate over all the paths in the wFST
    for p in fst.paths_iter() {
        println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples:

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels to the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent but deterministic FST.
    fst = determinize(&fst)?;

    Ok(())
}
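To show what the weights in the example above amount to, the following std-only sketch rebuilds the same three-state machine with plain vectors and enumerates its accepting paths. It assumes tropical-semiring arithmetic, where the weight of a path is the sum of its transition weights and the final weight of its last state. The types `Transition` and `Path` and the function `collect_paths` are hypothetical; the sketch does not reproduce how `paths_iter` is implemented or what it prints.

```rust
/// Hypothetical transition type for illustration, not part of rustfst.
struct Transition {
    ilabel: u32,
    olabel: u32,
    weight: f32,
    next: usize,
}

/// An accepting path: its input labels, output labels and total weight.
struct Path {
    ilabels: Vec<u32>,
    olabels: Vec<u32>,
    weight: f32,
}

/// Depth-first enumeration of accepting paths starting at `state`.
/// Terminates only on acyclic machines.
fn collect_paths(
    transitions: &[Vec<Transition>],
    finals: &[Option<f32>],
    state: usize,
    ilabels: &mut Vec<u32>,
    olabels: &mut Vec<u32>,
    weight: f32,
    out: &mut Vec<Path>,
) {
    // Reaching a final state completes a path; add the final weight.
    if let Some(final_weight) = finals[state] {
        out.push(Path {
            ilabels: ilabels.clone(),
            olabels: olabels.clone(),
            weight: weight + final_weight,
        });
    }
    for t in &transitions[state] {
        ilabels.push(t.ilabel);
        olabels.push(t.olabel);
        collect_paths(transitions, finals, t.next, ilabels, olabels, weight + t.weight, out);
        ilabels.pop();
        olabels.pop();
    }
}

/// Same topology and weights as the example: s0 -> s1 (3:5 / 10.0), s0 -> s2 (5:7 / 18.0),
/// with final weights 31.0 on s1 and 45.0 on s2.
fn example_paths() -> Vec<Path> {
    let transitions = vec![
        vec![
            Transition { ilabel: 3, olabel: 5, weight: 10.0, next: 1 },
            Transition { ilabel: 5, olabel: 7, weight: 18.0, next: 2 },
        ],
        vec![],
        vec![],
    ];
    let finals = vec![None, Some(31.0), Some(45.0)];
    let mut out = Vec::new();
    collect_paths(&transitions, &finals, 0, &mut Vec::new(), &mut Vec::new(), 0.0, &mut out);
    out
}
```

Calling `example_paths()` yields two paths: input `[3]` to output `[5]` with weight 10.0 + 31.0 = 41.0, and input `[5]` to output `[7]` with weight 18.0 + 45.0 = 63.0.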
Benchmark with OpenFST
I ran a benchmark some time ago on almost every linear FST algorithm and compared the results with OpenFst. You can find the results here:
Spoiler alert: Rustfst is faster on all those algorithms 😅
Documentation
The documentation of the latest released version is available here: https://docs.rs/rustfst
Release process
- Use the script update_version.sh to update the version of every package.
- Push
- Push a new tag with the prefix rustfst-v
Example:
./update_version.sh 0.9.1-alpha.6
git commit -am "Release 0.9.1-alpha.6"
git push
git tag -a rustfst-v0.9.1-alpha.6 -m "Release rustfst 0.9.1-alpha.6"
git push --tags
Optionally, if this is a major release, create a GitHub release in the UI.
Projects contained in this repository
This repository contains two main projects:
- rustfst is the Rust re-implementation.
- rustfst-python is the Python binding of rustfst.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.