
A list of similar sounding words to help disambiguate voice coding

Project description

Similar sounding words

This is a list of similar sounding words that I have collected from various sources on the web and added to as I find new pairs.

Unlike most homophone, homograph, and homonym resources, this list does not target ESL or educational use. Instead, it is designed for finding common errors in speech recognition output. Specifically, I use it with Caster for voice programming.
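As a rough illustration of how a list like this can help spot recognition errors, here is a minimal Python sketch. It does not use this package's API or data format, which are not described here. The sample groups and the `build_index` and `alternatives` helpers are hypothetical names invented for this example.

```python
from collections import defaultdict

# Hypothetical sample data: each tuple is a group of words that sound alike.
# The real list's contents and file format may differ.
SAMPLE_GROUPS = [
    ("for", "four", "fore"),
    ("write", "right", "rite"),
    ("two", "to", "too"),
]


def build_index(groups):
    """Map each word to the set of other words that share a group with it."""
    index = defaultdict(set)
    for group in groups:
        for word in group:
            index[word].update(w for w in group if w != word)
    return dict(index)


def alternatives(word, index):
    """Return the sorted similar-sounding alternatives for a word, or []."""
    return sorted(index.get(word.lower(), ()))


index = build_index(SAMPLE_GROUPS)
print(alternatives("four", index))  # ['for', 'fore']
```

When a recognizer emits a word that makes no sense in context, a lookup like this gives a short list of candidates that it may have misheard.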

I currently have five different sources. I've downloaded their contents into the data directory as text files (or, in one case, as HTML) and parsed them appropriately. I have also linked to the original location of each file, both inside the files themselves and in the headings between Jupyter cells in the notebook.

Unfortunately, I wasn't thinking about reproducibility when I started this project, so most of the text files have had a bit of light preprocessing in a text editor.


Download files


Source Distribution

similar-sounding-words-0.0.1.tar.gz (11.3 kB)

Uploaded Source

Built Distribution

similar_sounding_words-0.0.1-py3-none-any.whl (12.1 kB)

Uploaded Python 3
