Project description
Span Extructure
You might think the name is misspelled, but it ain't. It is a play on words combining spaCy's Span, extract and structure.
span_extructure is a spaCy component that builds upon SpanRuler and regex to extract structured information, e.g. dates or amounts with currencies and multipliers.
Installation
pip install span_extructure
Usage
import spacy

nlp = spacy.blank("en")

# Optionally add config if varying from default values
config = {
    "overwrite": False,  # default: False
    "rules": [
        {
            # Token pattern that selects the span, here by token shape
            "patterns": [[{"SHAPE": "dd.dd.dddd"}]],
            # Regex with named groups; the dots are escaped so they only match a literal "."
            "extruct": r"(?P<day>[0-3]\d)\.(?P<month>0[1-9]|1[0-2])\.(?P<year>20[0-5]\d|19\d\d)",
            "label": "DATE",
        }
    ],
}

nlp.add_pipe("span_extructure", config=config)

doc = nlp("This date 21.04.1986 will be a DATE entity while the structured information will be extracted to `Span._.extructure`")

for e in doc.ents:
    print(f"{e.text}\t{e.label_}\t{e._.extructure}")

# Output (tab-separated): 21.04.1986 DATE {'day': '21', 'month': '04', 'year': '1986'}
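The output above suggests that the named groups of the "extruct" expression become the keys of the extracted dictionary. The sketch below reproduces that idea with only the standard library `re` module. It is an assumption about the mechanism, not the package's actual implementation: `extract_structure` is a hypothetical helper, and `AMOUNT_RE` is an invented expression for an amount with currency and multiplier that does not come from the package.

```python
import re

# The date expression from the usage example, with the literal dots escaped.
DATE_RE = re.compile(
    r"(?P<day>[0-3]\d)\.(?P<month>0[1-9]|1[0-2])\.(?P<year>20[0-5]\d|19\d\d)"
)

# Hypothetical expression for an amount with currency and multiplier
# (illustration only, not taken from span_extructure).
AMOUNT_RE = re.compile(
    r"(?P<currency>USD|EUR|SEK)\s?(?P<value>\d+(?:\.\d+)?)\s?(?P<multiplier>k|m|bn)?"
)


def extract_structure(pattern, text):
    """Return the named groups of a full match as a dict, or None if no match."""
    match = pattern.fullmatch(text)
    return match.groupdict() if match else None


date_fields = extract_structure(DATE_RE, "21.04.1986")
amount_fields = extract_structure(AMOUNT_RE, "EUR 2.5m")
print(date_fields)
print(amount_fields)
```

Named groups keep the rule self-describing: the group name is the field name, so no separate mapping from match positions to keys is needed.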
Download files
Source Distribution
span-extructure-0.1.0.tar.gz (3.6 kB)
Built Distribution
Hashes for span_extructure-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7bea14b8c7d93c50d953ef7671d6d94aaeae0321f35269c4d2e576e2a0720ddc
MD5 | 78dd75e57d4f10af5796a4124af64861
BLAKE2b-256 | 1d077534d1cc9ba26932f8c649e54814ab3667571225f9f7d33a9707e3748784