A replaCy component to fix issue boundaries: fixes a single or double space at the start and extends to the next word when facing a casing issue.
replaCy Issue Boundary
A replaCy component to fix issue boundaries:
- fix a single or double space at the start.
- fix a double comma.
- fix a comma at the start.
- fix parenthesis spacing.
- fix a lower-case first letter.
- extend to the next word when facing a casing issue.
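To make the list above concrete, here is a minimal string-level sketch of the first five fixes. It is not the component's implementation: `tidy_boundary` is a hypothetical helper that works on plain strings, whereas the real component operates on replaCy spans, and it assumes the text starts a sentence when it capitalizes the first letter. The "extend to next word" behaviour is not covered.

```python
import re


def tidy_boundary(text: str) -> str:
    """Hypothetical helper: string-level versions of the listed fixes."""
    # Drop a single or double (or longer) run of spaces at the start.
    text = text.lstrip(" ")
    # Remove a comma left at the very start, plus any whitespace after it.
    text = re.sub(r"^,\s*", "", text)
    # Collapse doubled commas (optionally separated by spaces) into one.
    text = re.sub(r",(\s*,)+", ",", text)
    # Remove spaces just inside parentheses: "( word )" -> "(word)".
    text = re.sub(r"\(\s+", "(", text)
    text = re.sub(r"\s+\)", ")", text)
    # Capitalize the first letter if it is lower case
    # (assumption: the text begins a sentence).
    if text and text[0].islower():
        text = text[0].upper() + text[1:]
    return text


print(tidy_boundary("  , however,, this ( is ) fine"))
# However, this (is) fine
```

The fixes are applied in a fixed order so that removing a leading comma cannot leave a new leading space behind.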
Warning
Add this component after the joiner pipe; it will not work otherwise.
Install
poetry add replacy_issue_boundary
or
pip install replacy_issue_boundary
Usage
import en_core_web_sm
from replacy import ReplaceMatcher
from replacy.db import load_json
from replacy_issue_boundary import IssueBoundary
from spacy.util import filter_spans
nlp = en_core_web_sm.load()
replaCy = ReplaceMatcher(nlp, load_json('path to match dict(s)'))
issue_boundary = IssueBoundary()
replaCy.add_pipe(name="span_filter", component=filter_spans, before="joiner")
# The component must run after the joiner (see Warning).
replaCy.add_pipe(issue_boundary, name="issue_boundary", after="joiner")
Developing
The CI/CD in this project is great. GitHub Actions run linting and tests on any PR. If you merge into master, release-drafter drafts a new release based on PR commits and tags (e.g. if the PR is tagged "feature" and "minor", it will create a minor version bump with the changes labeled as Features).
I can't figure out the automatic versioning bit... leaving it in a broken state for now :/
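The label-to-bump rule described above can be sketched in a few lines. This is only an illustration of semantic-version bumping, not release-drafter's configuration or code: `bump_version` is a hypothetical helper, the "minor" label comes from the text, and the "major" label and the patch-bump default are assumptions.

```python
from typing import Iterable


def bump_version(version: str, labels: Iterable[str]) -> str:
    """Hypothetical helper: pick a semver bump from a PR's labels."""
    major, minor, patch = (int(part) for part in version.split("."))
    labels = set(labels)
    if "major" in labels:  # assumed label name
        return f"{major + 1}.0.0"
    if "minor" in labels:  # label named in the text above
        return f"{major}.{minor + 1}.0"
    # Assumption: anything else is treated as a patch release.
    return f"{major}.{minor}.{patch + 1}"


print(bump_version("0.2.5", ["feature", "minor"]))
# 0.3.0
```

Labels such as "feature" only affect how the change is grouped in the release notes, so the sketch ignores them when choosing the bump.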
Hashes for replacy-issue-boundary-0.2.5.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 26e44c7786b0670c28cf864979efd0e82d54634db0a4e26d6e34288a4ee9b3f4 |
MD5 | c5e8b0a833afad2c8aa88132dcec288b |
BLAKE2b-256 | 1e58bc7c03f77938978aac2be1b6a55d56cd8db1e19e088ec7d80be70996bf1d |
Hashes for replacy_issue_boundary-0.2.5-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 9c8bcad627ab36409fbff38c6396e38b224691db1394f2f9e4cf1130951e8d64 |
MD5 | 3c4351e3ec7ef17b6746d148e9e7bd72 |
BLAKE2b-256 | 4f648a51307d9acf0434f6f0d7e1b01dd365e10203897b9def933702f0bfb3d1 |