Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
Rustfst
This repo contains a Rust implementation of Weighted Finite-State Transducers, along with a Python binding.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
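To make the definition above concrete, here is a minimal std-only sketch of a weighted transducer and of the "acceptor" special case. The `Transition` and `Transducer` types and the `build_*` helpers are hypothetical names chosen for illustration; they are not rustfst's API.

```rust
/// One transition of a weighted transducer (illustrative type, not rustfst's).
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,
    olabel: u32,
    weight: f32,
    next_state: usize,
}

/// A transducer stored as an adjacency list:
/// `states[s]` holds the transitions leaving state `s`.
struct Transducer {
    states: Vec<Vec<Transition>>,
}

impl Transducer {
    /// True when every transition has identical input and output labels,
    /// i.e. the transducer is really an acceptor.
    fn is_acceptor(&self) -> bool {
        self.states.iter().flatten().all(|t| t.ilabel == t.olabel)
    }
}

/// Two states, one transition whose input and output labels are equal.
fn build_acceptor() -> Transducer {
    Transducer {
        states: vec![
            vec![Transition { ilabel: 3, olabel: 3, weight: 1.5, next_state: 1 }],
            vec![],
        ],
    }
}

/// Same shape, but the transition rewrites label 3 into label 5.
fn build_transducer() -> Transducer {
    Transducer {
        states: vec![
            vec![Transition { ilabel: 3, olabel: 5, weight: 1.5, next_state: 1 }],
            vec![],
        ],
    }
}

fn main() {
    let machines = vec![("acceptor", build_acceptor()), ("transducer", build_transducer())];
    for (name, fst) in machines {
        println!("{}: is_acceptor = {}", name, fst.is_acceptor());
        for (state, transitions) in fst.states.iter().enumerate() {
            for t in transitions {
                println!(
                    "  {} -> {} [{}:{}] / {}",
                    state, t.next_state, t.ilabel, t.olabel, t.weight
                );
            }
        }
    }
}
```

The adjacency-list layout is only one possible representation; it is used here because it keeps the acceptor check to a single pass over all transitions.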
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
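As an illustration of selecting the best result by a shortest-path computation, the sketch below finds the cost of the cheapest accepting path when weights are costs that add up along a path. It is a std-only toy using Bellman-Ford relaxation, not the algorithm rustfst implements; the `Wfst` type, `sample_fst` and `shortest_distance` are hypothetical names, and the machine is assumed to be non-empty and free of negative cycles.

```rust
/// (input label, output label, weight, destination state); illustrative layout.
type Edge = (u32, u32, f32, usize);

/// A tiny weighted transducer (hypothetical type, not rustfst's).
struct Wfst {
    start: usize,
    edges: Vec<Vec<Edge>>,    // edges[s] = transitions leaving state s
    finals: Vec<Option<f32>>, // finals[s] = final weight if s is final
}

/// Cost of the cheapest accepting path, or None if no final state is reachable.
/// Assumes at least one state and no negative-weight cycles.
fn shortest_distance(fst: &Wfst) -> Option<f32> {
    let n = fst.edges.len();
    let mut dist = vec![f32::INFINITY; n];
    dist[fst.start] = 0.0;
    // Bellman-Ford: n - 1 rounds of relaxation are enough.
    for _ in 1..n {
        for s in 0..n {
            for &(_, _, w, next) in &fst.edges[s] {
                if dist[s] + w < dist[next] {
                    dist[next] = dist[s] + w;
                }
            }
        }
    }
    // Add the final weight and keep the minimum over all final states.
    let mut best = f32::INFINITY;
    for s in 0..n {
        if let Some(final_weight) = fst.finals[s] {
            best = best.min(dist[s] + final_weight);
        }
    }
    if best.is_finite() {
        Some(best)
    } else {
        None
    }
}

/// Three states: a direct 0 -> 2 transition costing 5.0 and a cheaper
/// detour 0 -> 1 -> 2 costing 1.0 + 1.0. State 2 is final with weight 0.5.
fn sample_fst() -> Wfst {
    Wfst {
        start: 0,
        edges: vec![
            vec![(1, 1, 1.0, 1), (2, 2, 5.0, 2)],
            vec![(3, 3, 1.0, 2)],
            vec![],
        ],
        finals: vec![None, None, Some(0.5)],
    }
}

fn main() {
    println!("{:?}", shortest_distance(&sample_fst()));
}
```

In this sample the detour through state 1 wins: 1.0 + 1.0 + 0.5 = 2.5, against 5.0 + 0.5 = 5.5 for the direct transition.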
References
The implementation is heavily inspired by the work of Mehryar Mohri, Cyril Allauzen and Michael Riley:
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::determinize;
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Create an empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1 (input label 3, output label 5, weight 10.0)
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2 (input label 5, output label 7, weight 18.0)
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iterate over all the paths in the wFST
    for p in fst.paths_iter() {
        println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples:

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels to the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent FST but deterministic.
    fst = determinize(&fst)?;

    Ok(())
}
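The example above builds an FST with two accepting paths. With tropical weights, the weight of a path is the sum of its transition weights plus the final weight of the state it ends in, so the two paths weigh 10 + 31 = 41 and 18 + 45 = 63. The std-only sketch below reproduces that arithmetic without depending on rustfst; the `Fst` and `Path` types and the `collect_paths` helper are hypothetical stand-ins, and the traversal assumes an acyclic machine.

```rust
/// One accepting path: its label sequences and total weight (illustrative type).
#[derive(Debug, PartialEq)]
struct Path {
    ilabels: Vec<u32>,
    olabels: Vec<u32>,
    weight: f32,
}

/// Minimal stand-in for the FST built in the example (not rustfst's VectorFst).
struct Fst {
    start: usize,
    trs: Vec<Vec<(u32, u32, f32, usize)>>, // (ilabel, olabel, weight, next state)
    finals: Vec<Option<f32>>,
}

/// Depth-first enumeration of accepting paths; assumes the FST has no cycles.
fn collect_paths(
    fst: &Fst,
    state: usize,
    ilabels: &mut Vec<u32>,
    olabels: &mut Vec<u32>,
    weight: f32,
    out: &mut Vec<Path>,
) {
    // Reaching a final state completes a path: add the final weight.
    if let Some(final_weight) = fst.finals[state] {
        out.push(Path {
            ilabels: ilabels.clone(),
            olabels: olabels.clone(),
            weight: weight + final_weight,
        });
    }
    for &(i, o, w, next) in &fst.trs[state] {
        ilabels.push(i);
        olabels.push(o);
        collect_paths(fst, next, ilabels, olabels, weight + w, out);
        ilabels.pop();
        olabels.pop();
    }
}

/// Same states, transitions and final weights as the example above.
fn readme_example_paths() -> Vec<Path> {
    let fst = Fst {
        start: 0,
        trs: vec![vec![(3, 5, 10.0, 1), (5, 7, 18.0, 2)], vec![], vec![]],
        finals: vec![None, Some(31.0), Some(45.0)],
    };
    let mut out = Vec::new();
    collect_paths(&fst, fst.start, &mut Vec::new(), &mut Vec::new(), 0.0, &mut out);
    out
}

fn main() {
    for p in readme_example_paths() {
        println!("{:?}", p);
    }
}
```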
Benchmark with OpenFst
I did a benchmark some time ago on almost every linear FST algorithm and compared the results with OpenFst. You can find the results here :
Spoiler alert: Rustfst is faster on all those algorithms 😅
Documentation
The documentation of the latest released version is available here: https://docs.rs/rustfst
Release process
- Use the script update_version.sh to update the version of every package.
- Push
- Push a new tag with the prefix rustfst-v
Example:
./update_version.sh 0.9.1-alpha.6
git commit -am "Release 0.9.1-alpha.6"
git push
git tag -a rustfst-v0.9.1-alpha.6 -m "Release rustfst 0.9.1-alpha.6"
git push --tags
Optionally, if this is a major release, create a GitHub release in the UI.
Projects contained in this repository
This repository contains two main projects:
- rustfst is the Rust re-implementation.
- rustfst-python is the Python binding of rustfst.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.