spaCy pipeline for crfsuite entity extraction

Project description

spacy_crfsuite: crfsuite entity extraction for spaCy.

spacy_crfsuite is an entity extraction pipeline for spaCy based on crfsuite.

Install

Python

pip install spacy_crfsuite

Usage

spaCy usage

import os
import spacy

from spacy_crfsuite import CRFEntityExtractorFactory

# load spacy language model
nlp = spacy.blank('en')

# Will look for ``crf.pkl`` in current working dir
pipe = CRFEntityExtractorFactory(nlp, model_dir=os.getcwd())
nlp.add_pipe(pipe)

# Use CRF to extract entities
doc = nlp("given we launched L&M a couple of years ago")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)
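
Once the pipe has run, the extracted entities are ordinary spaCy spans, so they can be turned into plain records for logging or serialization. The following is a minimal sketch: `ents_to_dicts` is a hypothetical helper, not part of spacy_crfsuite, and it relies only on the standard span attributes `text`, `label_`, `start_char` and `end_char`.

```python
def ents_to_dicts(doc):
    """Convert the entities on a processed doc into plain dictionaries.

    Hypothetical helper (not part of spacy_crfsuite). It works with any
    object exposing ``ents`` whose items have ``text``, ``label_``,
    ``start_char`` and ``end_char``, as spaCy spans do.
    """
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,  # character offset where the entity begins
            "end": ent.end_char,      # character offset just past the entity
        }
        for ent in doc.ents
    ]

# Usage, assuming `nlp` was set up as in the snippet above:
# records = ents_to_dicts(nlp("given we launched L&M a couple of years ago"))
```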

Train a model

python -m spacy_crfsuite.trainer train <TRAIN> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>

Evaluate a model

python -m spacy_crfsuite.trainer eval <DEV> --model-dir <MODEL_DIR> --model-name <MODEL_NAME>

Gold annotations example (markdown)

## Header
- what is my balance <!-- no entity -->
- how much do I have on my [savings](source_account) <!-- entity "source_account" has value "savings" -->
- how much do I have on my [savings account](source_account:savings) <!-- synonyms, method 1 -->
- Could I pay in [yen](currency)?  <!-- entity matched by lookup table -->
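
To make the annotation syntax above concrete, here is a rough sketch of how one list item could be turned into plain text plus character-offset entity spans. It is not the reader that spacy_crfsuite ships. The names `parse_example`, `ENTITY_RE` and `COMMENT_RE` are hypothetical, and the rule that the value defaults to the annotated text when no `:value` is given is an assumption inferred from the comments in the example.

```python
import re

# [entity text](entity_type) or [entity text](entity_type:value)
ENTITY_RE = re.compile(
    r"\[(?P<text>[^\]]+)\]\((?P<entity>[^:)]+)(?::(?P<value>[^)]+))?\)"
)
# HTML comments such as <!-- no entity -->
COMMENT_RE = re.compile(r"<!--.*?-->")


def parse_example(line):
    """Return (plain_text, entities) for one annotated list item."""
    # Drop trailing HTML comments and the leading list marker.
    line = COMMENT_RE.sub("", line).strip()
    if line.startswith("- "):
        line = line[2:]

    parts = []      # pieces of the plain text, in order
    entities = []   # one dict per annotated span
    pos = 0         # read position in the annotated line
    offset = 0      # length of the plain text built so far
    for match in ENTITY_RE.finditer(line):
        # Copy the unannotated text before this entity.
        before = line[pos:match.start()]
        parts.append(before)
        offset += len(before)

        text = match.group("text")
        entities.append({
            "start": offset,
            "end": offset + len(text),
            "entity": match.group("entity"),
            # Assumption: without an explicit value, use the surface text.
            "value": match.group("value") or text,
        })
        parts.append(text)
        offset += len(text)
        pos = match.end()

    parts.append(line[pos:])
    return "".join(parts), entities
```

For example, `parse_example("- Could I pay in [yen](currency)?")` yields the text `Could I pay in yen?` and a single `currency` entity whose offsets slice out `yen`.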

Download files

Source Distribution

spacy_crfsuite-0.1.0.tar.gz (12.8 kB)

Uploaded Source

Built Distribution

spacy_crfsuite-0.1.0-py3-none-any.whl (14.1 kB)

Uploaded Python 3
