Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
Rustfst
This repository contains a Rust implementation of weighted finite-state transducers, along with a Python binding.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
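To make the transducer/acceptor distinction above concrete, here is a minimal sketch in plain Rust. The `Transition` struct and the two helper functions are hypothetical illustrations, not the rustfst API; they only show that an acceptor is a transducer whose input and output labels coincide on every transition.

```rust
// Illustrative types only; these names are assumptions and NOT the rustfst API.
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,       // input label
    olabel: u32,       // output label
    weight: f32,       // cost of taking this transition
    next_state: usize, // destination state
}

/// An acceptor is a transducer in which every transition has
/// the same input and output label.
fn is_acceptor(transitions: &[Transition]) -> bool {
    transitions.iter().all(|t| t.ilabel == t.olabel)
}

/// Copies each input label onto the output label, which turns
/// any set of transitions into an acceptor over the input labels.
fn project_input(transitions: &mut [Transition]) {
    for t in transitions.iter_mut() {
        t.olabel = t.ilabel;
    }
}

fn main() {
    let mut trs = vec![
        Transition { ilabel: 3, olabel: 5, weight: 10.0, next_state: 1 },
        Transition { ilabel: 5, olabel: 7, weight: 18.0, next_state: 2 },
    ];
    println!("acceptor before projection: {}", is_acceptor(&trs));
    project_input(&mut trs);
    println!("acceptor after projection: {}", is_acceptor(&trs));
    for t in &trs {
        println!(
            "{}:{} / {} -> state {}",
            t.ilabel, t.olabel, t.weight, t.next_state
        );
    }
}
```

The weights and destination states are untouched by the projection; only the labels change.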
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
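As a rough illustration of selecting the best result by a shortest-path computation, the sketch below finds the cheapest accepting path of a toy automaton when weights are costs that add along a path and the lowest total wins (the tropical semiring). The `Edge` and `ToyFst` types and the `toy_fst` helper are hypothetical and are not part of rustfst; the relaxation loop is a plain Bellman-Ford pass and assumes there are no negative-cost cycles.

```rust
// Hypothetical toy representation; not the rustfst API.
struct Edge {
    to: usize,
    weight: f32,
}

struct ToyFst {
    start: usize,
    edges: Vec<Vec<Edge>>,    // edges[s] = outgoing transitions of state s
    finals: Vec<Option<f32>>, // final weight of each state, if it is final
}

/// Cost of the cheapest accepting path: costs add along a path,
/// and the minimum over all accepting paths is kept.
/// Bellman-Ford style relaxation; assumes no negative-cost cycles.
fn shortest_path_cost(fst: &ToyFst) -> Option<f32> {
    let n = fst.edges.len();
    if n == 0 {
        return None;
    }
    let mut dist = vec![f32::INFINITY; n];
    dist[fst.start] = 0.0;
    for _ in 0..n {
        let mut changed = false;
        for s in 0..n {
            if dist[s].is_infinite() {
                continue; // state not reached yet
            }
            for e in &fst.edges[s] {
                let candidate = dist[s] + e.weight;
                if candidate < dist[e.to] {
                    dist[e.to] = candidate;
                    changed = true;
                }
            }
        }
        if !changed {
            break;
        }
    }
    // Add the final weight of each final state and keep the minimum.
    let best = (0..n)
        .filter_map(|s| fst.finals[s].map(|w| dist[s] + w))
        .fold(f32::INFINITY, f32::min);
    if best.is_finite() {
        Some(best)
    } else {
        None
    }
}

/// A four-state example with two accepting paths:
/// 0 -> 1 -> 2 -> 3 and 0 -> 2 -> 3.
fn toy_fst() -> ToyFst {
    ToyFst {
        start: 0,
        edges: vec![
            vec![Edge { to: 1, weight: 1.0 }, Edge { to: 2, weight: 4.0 }],
            vec![Edge { to: 2, weight: 1.5 }],
            vec![Edge { to: 3, weight: 2.0 }],
            vec![],
        ],
        finals: vec![None, None, None, Some(0.5)],
    }
}

fn main() {
    match shortest_path_cost(&toy_fst()) {
        Some(cost) => println!("best accepting path cost: {}", cost),
        None => println!("no accepting path"),
    }
}
```

In this toy automaton the path through state 1 costs 1.0 + 1.5 + 2.0 + 0.5 = 5.0, which beats the direct path at 4.0 + 2.0 + 0.5 = 6.5.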
References
The implementation is heavily inspired by the work of Mehryar Mohri, Cyril Allauzen, and Michael Riley:
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::{DeterminizeType, determinize};
use rustfst::algorithms::rm_epsilon::rm_epsilon;
fn main() -> Result<()> {
    // Create an empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();
    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();
    // Set s0 as the start state
    fst.set_start(s0)?;
    // Add a transition from s0 to s1
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;
    // Add a transition from s0 to s2
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;
    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;
    // Iterate over all the paths in the wFST
    for p in fst.paths_iter() {
        println!("{:?}", p);
    }
    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples:
    // - Remove useless states.
    connect(&mut fst)?;
    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;
    // - Copy all the input labels to the output.
    project(&mut fst, ProjectType::ProjectInput);
    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;
    // - Compute an equivalent but deterministic FST.
    fst = determinize(&fst)?;
    Ok(())
}
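For readers wondering what weights to expect on the two paths of the example, the following plain-Rust sketch works them out by hand. It does not use rustfst; the `path_weight` helper is a hypothetical illustration that only relies on the tropical semiring convention that weights add along a path, including the final weight of the state where the path ends.

```rust
/// In the tropical semiring, a path's weight is the sum of its
/// transition weights plus the final weight of its last state.
/// Hypothetical helper for illustration; not part of rustfst.
fn path_weight(transition_weights: &[f32], final_weight: f32) -> f32 {
    transition_weights.iter().sum::<f32>() + final_weight
}

fn main() {
    // Path s0 -> s1 (labels 3:5): transition weight 10.0, final weight 31.0.
    let via_s1 = path_weight(&[10.0], 31.0);
    // Path s0 -> s2 (labels 5:7): transition weight 18.0, final weight 45.0.
    let via_s2 = path_weight(&[18.0], 45.0);
    println!("s0 -> s1: {}", via_s1);
    println!("s0 -> s2: {}", via_s2);
    // Choosing between alternative paths keeps the minimum weight.
    println!("best: {}", via_s1.min(via_s2));
}
```

With the numbers from the example, the path through s1 weighs 10.0 + 31.0 = 41.0 and the path through s2 weighs 18.0 + 45.0 = 63.0, so a shortest-path search would prefer the former.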
Benchmark with OpenFst
I did a benchmark some time ago on almost every linear FST algorithm and compared the results with OpenFst. You can find the results here:
Spoiler alert: Rustfst is faster on all those algorithms 😅
Documentation
The documentation of the latest released version is available here: https://docs.rs/rustfst
Release process
- Use the script update_version.sh to update the version of every package.
- Push
- Push a new tag with the prefix rustfst-v
Example:
./update_version.sh 0.9.1-alpha.6
git commit -am "Release 0.9.1-alpha.6"
git push
git tag -a rustfst-v0.9.1-alpha.6 -m "Release rustfst 0.9.1-alpha.6"
git push --tags
Optionally, if this is a major release, create a GitHub release in the UI.
Projects contained in this repository
This repository contains two main projects:
- rustfst is the Rust re-implementation.
- rustfst-python is the Python binding of rustfst.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.