Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
Rustfst
This repo contains a Rust implementation of weighted finite-state transducers, along with a Python binding.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
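To make the definition concrete, here is a minimal, dependency-free sketch of a transition and of the acceptor condition described above. The names `Transition`, `ToyFst`, `add_transition`, and `is_acceptor` are hypothetical and chosen for illustration only; they are not the rustfst API.

```rust
/// One transition of a weighted transducer: input label, output label,
/// weight, and destination state. Illustrative type, not part of rustfst.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,
    olabel: u32,
    weight: f32,
    next_state: usize,
}

/// A toy transducer that stores outgoing transitions per source state.
struct ToyFst {
    transitions: Vec<Vec<Transition>>,
}

impl ToyFst {
    /// Creates a transducer with `num_states` states and no transitions.
    fn new(num_states: usize) -> Self {
        ToyFst { transitions: vec![Vec::new(); num_states] }
    }

    /// Adds a transition `from -> to` labelled `ilabel:olabel` with `weight`.
    fn add_transition(&mut self, from: usize, ilabel: u32, olabel: u32, weight: f32, to: usize) {
        self.transitions[from].push(Transition { ilabel, olabel, weight, next_state: to });
    }

    /// An acceptor is a transducer in which every transition has equal
    /// input and output labels.
    fn is_acceptor(&self) -> bool {
        self.transitions.iter().flatten().all(|t| t.ilabel == t.olabel)
    }
}

fn main() {
    let mut acceptor = ToyFst::new(2);
    acceptor.add_transition(0, 3, 3, 1.5, 1);

    let mut transducer = ToyFst::new(2);
    transducer.add_transition(0, 3, 5, 1.5, 1);

    println!("acceptor? {}", acceptor.is_acceptor());
    println!("transducer is acceptor? {}", transducer.is_acceptor());
}
```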
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
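As a rough illustration of selecting the best result by shortest path, the sketch below computes the cheapest start-to-final cost when weights are costs that add along a path and the lowest total wins (the tropical semiring, assumed here). It uses plain Bellman-Ford relaxation rather than the library's own algorithm, and the function `shortest_distance` and its tuple-based arc format are hypothetical.

```rust
/// Returns the lowest total cost of any path from `start` to a final state,
/// or `None` if no final state is reachable.
/// `arcs` holds (source, destination, weight); labels are omitted because
/// only weights matter for the cost. `finals` holds (state, final weight).
/// Assumes `start < num_states` and no negative-weight cycles.
fn shortest_distance(
    num_states: usize,
    start: usize,
    arcs: &[(usize, usize, f32)],
    finals: &[(usize, f32)],
) -> Option<f32> {
    let mut dist = vec![f32::INFINITY; num_states];
    dist[start] = 0.0;

    // Bellman-Ford: relax every arc until nothing improves,
    // for at most num_states - 1 rounds.
    for _ in 1..num_states {
        let mut changed = false;
        for &(from, to, weight) in arcs {
            let candidate = dist[from] + weight;
            if candidate < dist[to] {
                dist[to] = candidate;
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }

    // Add each final weight and keep the minimum over reachable final states.
    finals
        .iter()
        .map(|&(state, final_weight)| dist[state] + final_weight)
        .filter(|total| total.is_finite())
        .fold(None, |best: Option<f32>, total| {
            Some(best.map_or(total, |b| b.min(total)))
        })
}

fn main() {
    let arcs: [(usize, usize, f32); 4] =
        [(0, 1, 1.0), (0, 2, 4.0), (1, 2, 1.5), (2, 3, 2.0)];
    let finals: [(usize, f32); 1] = [(3, 0.5)];
    // The path 0 -> 1 -> 2 -> 3 costs 1.0 + 1.5 + 2.0 + 0.5 = 5.0,
    // which beats 0 -> 2 -> 3 at 4.0 + 2.0 + 0.5 = 6.5.
    println!("{:?}", shortest_distance(4, 0, &arcs, &finals));
}
```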
References
The implementation is heavily inspired by the work of Mehryar Mohri, Cyril Allauzen, and Michael Riley:
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::determinize;
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Create an empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1
    // (input label 3, output label 5, weight 10.0)
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2
    // (input label 5, output label 7, weight 18.0)
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iterate over all the paths in the wFST
    for p in fst.paths_iter() {
        println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples:

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels to the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent but deterministic FST.
    fst = determinize(&fst)?;

    Ok(())
}
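The weights in the example above can be checked by hand. In the tropical semiring, weights are multiplied along a path by ordinary addition and alternatives are combined by taking the minimum. The dependency-free sketch below reproduces that arithmetic for the two paths; `path_weight` is a hypothetical helper written for this illustration, not a rustfst function.

```rust
/// Tropical "times" is ordinary addition: a path's weight is the sum of
/// its transition weights plus the final weight of its last state.
fn path_weight(transition_weights: &[f32], final_weight: f32) -> f32 {
    transition_weights.iter().sum::<f32>() + final_weight
}

fn main() {
    // Path s0 -> s1: transition weight 10.0, final weight 31.0.
    let via_s1 = path_weight(&[10.0], 31.0);
    // Path s0 -> s2: transition weight 18.0, final weight 45.0.
    let via_s2 = path_weight(&[18.0], 45.0);
    println!("via s1: {}, via s2: {}", via_s1, via_s2);

    // Tropical "plus" is min, so the best path is the cheaper one.
    println!("best: {}", via_s1.min(via_s2));
}
```

The path through s1 therefore carries weight 10.0 + 31.0 = 41.0 and the path through s2 carries 18.0 + 45.0 = 63.0.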
Benchmark with OpenFST
I did a benchmark some time ago on almost every linear fst algorithm and compared the results with OpenFst. You can find the results here:
Spoiler alert: Rustfst is faster on all those algorithms 😅
Documentation
The documentation of the latest released version is available here: https://docs.rs/rustfst
Release process
- Use the script update_version.sh to update the version of every package.
- Push
- Push a new tag with the prefix rustfst-v
Example:
./update_version.sh 0.9.1-alpha.6
git commit -am "Release 0.9.1-alpha.6"
git push
git tag -a rustfst-v0.9.1-alpha.6 -m "Release rustfst 0.9.1-alpha.6"
git push --tags
Optionally, if this is a major release, create a GitHub release in the UI.
Projects contained in this repository
This repository contains two main projects:
- rustfst is the Rust re-implementation.
- rustfst-python is the Python binding of rustfst.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.