Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
Rustfst
This repo contains a Rust implementation of weighted finite-state transducers, along with a Python binding.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
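As a rough illustration of the definitions above, the sketch below models a single weighted transition using only the standard library. The `Transition` type and its methods are hypothetical names chosen for this example; they are not the rustfst `Tr` type or any other part of the rustfst API.

```rust
/// Toy model of a weighted transition (hypothetical, not the rustfst `Tr` type).
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,       // input label
    olabel: u32,       // output label
    weight: f32,       // cost of taking this transition
    next_state: usize, // destination state
}

impl Transition {
    /// A transducer transition maps `ilabel` to `olabel`.
    fn transducer(ilabel: u32, olabel: u32, weight: f32, next_state: usize) -> Self {
        Transition { ilabel, olabel, weight, next_state }
    }

    /// An acceptor transition is a transducer transition whose
    /// input and output labels are equal.
    fn acceptor(label: u32, weight: f32, next_state: usize) -> Self {
        Self::transducer(label, label, weight, next_state)
    }

    /// True when the transition could belong to an acceptor.
    fn is_acceptor_like(&self) -> bool {
        self.ilabel == self.olabel
    }
}

fn main() {
    let t = Transition::transducer(3, 5, 10.0, 1);
    let a = Transition::acceptor(3, 0.5, 1);
    println!("{:?} acceptor-like: {}", t, t.is_acceptor_like());
    println!("{:?} acceptor-like: {}", a, a.is_acceptor_like());
}
```

An automaton in which every transition satisfies `is_acceptor_like` represents a set of strings rather than a relation between pairs of strings.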
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
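To make the shortest-path step more concrete, here is a minimal sketch that finds the cheapest accepting path in a tiny weighted automaton, treating weights as additive costs. `ToyFst`, `example_fst` and `shortest_distance` are hypothetical names for this illustration only; the code does not use rustfst and is not how the library implements its shortest-path algorithms.

```rust
/// Toy weighted automaton (hypothetical, not a rustfst type).
/// `arcs[s]` lists the `(weight, next_state)` pairs leaving state `s`.
/// Labels are omitted because only weights matter for the shortest distance.
struct ToyFst {
    start: usize,
    arcs: Vec<Vec<(f32, usize)>>,
    finals: Vec<Option<f32>>, // final weight of each state, if it is final
}

/// Cost of the cheapest accepting path, or `None` if no final state is reachable.
/// Uses Bellman-Ford style relaxation and assumes there is no negative-weight cycle.
fn shortest_distance(fst: &ToyFst) -> Option<f32> {
    let n = fst.arcs.len();
    if fst.start >= n {
        return None;
    }
    let mut dist = vec![f32::INFINITY; n];
    dist[fst.start] = 0.0;
    for _ in 0..n {
        let mut changed = false;
        for s in 0..n {
            if dist[s].is_infinite() {
                continue; // state not reached yet
            }
            for &(w, next) in &fst.arcs[s] {
                if dist[s] + w < dist[next] {
                    dist[next] = dist[s] + w;
                    changed = true;
                }
            }
        }
        if !changed {
            break;
        }
    }
    // Add the final weight and keep the smallest total.
    let best = (0..n)
        .filter_map(|s| fst.finals[s].map(|f| dist[s] + f))
        .fold(f32::INFINITY, f32::min);
    if best.is_finite() { Some(best) } else { None }
}

/// Three states: 0 -> 1 (1.0), 0 -> 2 (4.0), 1 -> 2 (2.0); state 2 is final (0.5).
fn example_fst() -> ToyFst {
    ToyFst {
        start: 0,
        arcs: vec![vec![(1.0, 1), (4.0, 2)], vec![(2.0, 2)], vec![]],
        finals: vec![None, None, Some(0.5)],
    }
}

fn main() {
    println!("{:?}", shortest_distance(&example_fst()));
}
```

In this toy automaton the path through state 1 costs 1.0 + 2.0 + 0.5 = 3.5, which beats the direct transition at 4.0 + 0.5 = 4.5.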
References
The implementation is heavily inspired by the work of Mehryar Mohri, Cyril Allauzen and Michael Riley:
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::determinize;
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Create an empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iterate over all the paths in the wFST
    for p in fst.paths_iter() {
        println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples:

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels to the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent but deterministic FST.
    fst = determinize(&fst)?;

    Ok(())
}
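The weights in the example above can be checked by hand. Assuming `TropicalWeight` follows the standard tropical semiring, where combining weights along a path is ordinary addition and choosing between alternative paths takes the minimum, the path through s1 should weigh 10 + 31 = 41 and the path through s2 should weigh 18 + 45 = 63. The helper functions below are hypothetical, standard-library-only stand-ins for illustration, not rustfst functions.

```rust
/// Tropical "times": weights along one path are added.
/// (Hypothetical helper, not part of rustfst.)
fn tropical_times(a: f32, b: f32) -> f32 {
    a + b
}

/// Tropical "plus": between alternative paths, the smaller weight wins.
/// (Hypothetical helper, not part of rustfst.)
fn tropical_plus(a: f32, b: f32) -> f32 {
    a.min(b)
}

fn main() {
    // Transition weight combined with the final weight of the state reached.
    let via_s1 = tropical_times(10.0, 31.0);
    let via_s2 = tropical_times(18.0, 45.0);
    let best = tropical_plus(via_s1, via_s2);
    println!("via s1: {}, via s2: {}, best: {}", via_s1, via_s2, best);
}
```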
Benchmark with OpenFst
Some time ago, I benchmarked almost every linear FST algorithm and compared the results with OpenFst. You can find the results here:
Spoiler alert: Rustfst is faster on all those algorithms 😅
Documentation
The documentation of the latest released version is available here: https://docs.rs/rustfst
Release process
- Use the script update_version.sh to update the version of every package.
- Push
- Push a new tag with the prefix rustfst-v
Example :
./update_version.sh 0.9.1-alpha.6
git commit -am "Release 0.9.1-alpha.6"
git push
git tag -a rustfst-v0.9.1-alpha.6 -m "Release rustfst 0.9.1-alpha.6"
git push --tags
Optionally, if this is a major release, create a GitHub release in the UI.
Projects contained in this repository
This repository contains two main projects:
- rustfst is the Rust re-implementation.
- rustfst-python is the Python binding of rustfst.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.