A new way to encode words and calculate their similarity.
WordFP
WordFP is a Python package that encodes words and compares them by similarity. Each word is encoded as a matrix of 0s and 1s called a "WordFP": the first column marks which letters are present in the word, and the remaining columns (second to last) mark the position at which each letter occurs. Similar words are found using four metrics: geometric, arithmetic, Tanimoto, and Tversky. A Jupyter notebook showing how to use the package is available at examples/how_to_use.ipynb.
The package can also be used through the WordFP web app. The app can be run locally as well by following the steps below.
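To make the encoding idea concrete, here is a minimal sketch in plain Python. It is not the package's API: the names `encode_word`, `on_bits`, `tversky`, and `tanimoto`, the 26-row lowercase alphabet, and the fixed `MAX_LEN` are all assumptions made for illustration. It covers only the Tanimoto and Tversky metrics, using their standard set-based definitions; the geometric and arithmetic metrics are left out because their exact definitions are not given here.

```python
import string

# Assumption: one row per lowercase ASCII letter; the real package may differ.
ALPHABET = string.ascii_lowercase
# Assumption: words are truncated to a fixed maximum length.
MAX_LEN = 10


def encode_word(word, max_len=MAX_LEN):
    """Return a 26 x (1 + max_len) matrix of 0s and 1s (hypothetical layout).

    Column 0 marks that a letter is present in the word.
    Column 1 + i marks that the letter occurs at position i.
    """
    matrix = [[0] * (1 + max_len) for _ in ALPHABET]
    for pos, ch in enumerate(word.lower()[:max_len]):
        if ch in ALPHABET:
            row = ALPHABET.index(ch)
            matrix[row][0] = 1        # letter is present
            matrix[row][1 + pos] = 1  # letter is at this position
    return matrix


def on_bits(matrix):
    """Return the set of (row, column) cells that are set to 1."""
    return {
        (r, c)
        for r, row in enumerate(matrix)
        for c, bit in enumerate(row)
        if bit
    }


def tversky(a, b, alpha=1.0, beta=1.0):
    """Tversky index: |A & B| / (|A & B| + alpha*|A - B| + beta*|B - A|)."""
    sa, sb = on_bits(a), on_bits(b)
    common = len(sa & sb)
    denom = common + alpha * len(sa - sb) + beta * len(sb - sa)
    # Two empty encodings are treated as dissimilar in this sketch.
    return common / denom if denom else 0.0


def tanimoto(a, b):
    """Tanimoto coefficient, i.e. Tversky with alpha = beta = 1."""
    return tversky(a, b, alpha=1.0, beta=1.0)


cat, car = encode_word("cat"), encode_word("car")
print(tanimoto(cat, car))  # 0.5: 4 shared bits out of 8 distinct bits
```

In this layout, "cat" and "car" share the presence and position bits for "c" and "a" (4 bits) and differ in the bits for "t" and "r" (2 bits each), which gives 4 / 8 = 0.5.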
Install
Via pip
$ pip install wordfp
or
Via github
$ git clone https://github.com/jeffrichardchemistry/WordFP
$ cd WordFP
$ python3 setup.py install
Install and Run WebAPP Locally
The web application is in the file "app/app.py". Install the dependencies:
$ pip install streamlit wordfp
To run:
$ cd .../app/
$ streamlit run app.py
Considerations
This project grew out of an idea that came to me at a random moment while I was working on my PhD. I hope it can someday help someone in areas such as natural language processing (NLP).
Hashes for WordFP-1.0.2.linux-x86_64.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 53de5d2475b20190c9cd0ad16ba8a34cd8cdd3d16f6c741ccf5773e0cfab8fbd
MD5 | 850fc1c334cf62e113f1e9d19dd04a82
BLAKE2b-256 | 1a39c5e633e3c5ee3591264896830161f7cf33a520c842298df48e5f9cd62913