Library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Re-implementation of OpenFst in Rust.
Rustfst
This repo contains a Rust implementation of weighted finite-state transducers, along with a Python binding.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
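To make the definition above concrete, here is a minimal, std-only Rust sketch of the data model. It is not rustfst's API: the `Transition` and `Wfst` types and their methods are hypothetical names chosen for illustration. It shows that an acceptor is simply a transducer in which every transition has equal input and output labels.

```rust
/// One weighted transition: input label, output label, weight, destination.
/// (Hypothetical type for illustration; not the rustfst `Tr` type.)
#[derive(Debug, Clone, PartialEq)]
struct Transition {
    ilabel: u32,
    olabel: u32,
    weight: f32,
    nextstate: usize,
}

/// A tiny weighted transducer stored as adjacency lists.
#[derive(Debug, Default)]
struct Wfst {
    start: Option<usize>,
    /// transitions[s] holds the outgoing transitions of state s.
    transitions: Vec<Vec<Transition>>,
    /// finals[s] is Some(w) when state s is final with weight w.
    finals: Vec<Option<f32>>,
}

impl Wfst {
    /// Adds a new non-final state and returns its id.
    fn add_state(&mut self) -> usize {
        self.transitions.push(Vec::new());
        self.finals.push(None);
        self.transitions.len() - 1
    }

    /// Adds an outgoing transition to an existing state.
    fn add_transition(&mut self, from: usize, tr: Transition) {
        self.transitions[from].push(tr);
    }

    /// True when every transition has the same input and output label,
    /// i.e. the transducer can be read as an acceptor.
    fn is_acceptor(&self) -> bool {
        self.transitions
            .iter()
            .flatten()
            .all(|t| t.ilabel == t.olabel)
    }
}
```

With this sketch, a machine whose only transition is labelled 1:1 is an acceptor; adding a transition labelled 2:3 turns it into a proper transducer and `is_acceptor()` returns `false`.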
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
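As an illustration of selecting the best result with a shortest-path computation, the following std-only Rust sketch finds the weight of the cheapest accepting path when weights are costs that add up along a path and the smallest total wins (the tropical semiring). It is a plain Bellman-Ford style relaxation, not the algorithm rustfst implements; the function name `best_path_weight` and the edge-list representation are assumptions made for the example. It assumes `start < num_states` and no negative-weight cycles.

```rust
/// Returns the weight of the cheapest path from `start` to any final state,
/// or None when no final state is reachable.
///
/// `edges` holds (from, to, weight) triples; labels are omitted because they
/// do not affect the cost. `finals` holds (state, final_weight) pairs.
fn best_path_weight(
    num_states: usize,
    start: usize,
    edges: &[(usize, usize, f32)],
    finals: &[(usize, f32)],
) -> Option<f32> {
    // dist[s] = cheapest known cost from `start` to s.
    let mut dist = vec![f32::INFINITY; num_states];
    dist[start] = 0.0;

    // Relax every edge until nothing improves (at most num_states - 1 rounds).
    for _ in 1..num_states {
        let mut changed = false;
        for &(from, to, w) in edges {
            if dist[from] + w < dist[to] {
                dist[to] = dist[from] + w;
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }

    // Add the final weight and keep the minimum over all final states.
    let best = finals
        .iter()
        .map(|&(s, w)| dist[s] + w)
        .fold(f32::INFINITY, f32::min);
    if best.is_finite() {
        Some(best)
    } else {
        None
    }
}
```

For example, with edges 0→1 (1.0), 0→2 (4.0), 1→2 (2.0), 2→3 (1.0) and state 3 final with weight 0.5, the path through state 1 costs 1.0 + 2.0 + 1.0 + 0.5 = 4.5, which beats the direct route at 5.5.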
References
Implementation heavily inspired by the work of Mehryar Mohri, Cyril Allauzen and Michael Riley:
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::determinize;
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Create an empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iterate over all the paths in the wFST
    for p in fst.paths_iter() {
        println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples:

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels to the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent but deterministic FST.
    fst = determinize(&fst)?;

    Ok(())
}
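To see which paths the small FST above contains before any optimization, the following std-only sketch enumerates the start-to-final paths of an acyclic machine by hand. It does not use rustfst: the `Path` struct, the `Transition` tuple alias and `enumerate_paths` are hypothetical names, and the weights are combined by addition, as in the tropical semiring. For the example FST this gives one path with weight 10 + 31 = 41 and one with weight 18 + 45 = 63.

```rust
/// A complete path: the labels read, the labels written, and the total weight.
#[derive(Debug, Clone, PartialEq)]
struct Path {
    ilabels: Vec<u32>,
    olabels: Vec<u32>,
    weight: f32,
}

/// (ilabel, olabel, weight, nextstate); a hypothetical stand-in for a transition.
type Transition = (u32, u32, f32, usize);

/// Enumerates every path from `start` to a final state.
/// Assumes the machine is acyclic; a cycle would make this loop forever.
fn enumerate_paths(
    start: usize,
    trs: &[Vec<Transition>],
    finals: &[Option<f32>],
) -> Vec<Path> {
    let mut out = Vec::new();
    // Each stack entry is a partial path: (state, ilabels, olabels, weight so far).
    let mut stack = vec![(start, Vec::new(), Vec::new(), 0.0f32)];

    while let Some((state, ilabels, olabels, weight)) = stack.pop() {
        // A final state closes a path; its final weight is added to the total.
        if let Some(final_weight) = finals[state] {
            out.push(Path {
                ilabels: ilabels.clone(),
                olabels: olabels.clone(),
                weight: weight + final_weight,
            });
        }
        // Extend the partial path along every outgoing transition.
        for &(il, ol, w, next) in &trs[state] {
            let mut i = ilabels.clone();
            i.push(il);
            let mut o = olabels.clone();
            o.push(ol);
            stack.push((next, i, o, weight + w));
        }
    }
    out
}
```

The order in which paths come out depends on the traversal, so sort the result (for instance by weight) before comparing it with anything.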
Benchmark with OpenFst
I ran a benchmark some time ago on almost every linear FST algorithm and compared the results with OpenFst. You can find the results here:
Spoiler alert: Rustfst is faster on all those algorithms 😅
Documentation
The documentation of the latest released version is available here: https://docs.rs/rustfst
Release process
- Use the script update_version.sh to update the version of every package.
- Push
- Push a new tag with the prefix rustfst-v

Example:
./update_version.sh 0.9.1-alpha.6
git commit -am "Release 0.9.1-alpha.6"
git push
git tag -a rustfst-v0.9.1-alpha.6 -m "Release rustfst 0.9.1-alpha.6"
git push --tags
Optionally, if this is a major release, create a GitHub release in the UI.
Projects contained in this repository
This repository contains two main projects:
- rustfst is the Rust re-implementation.
- rustfst-python is the Python binding of rustfst.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.