
spaCy pipeline component for CRF entity extraction

spacy_crfsuite: CRF entity tagger for spaCy.

✨ Features

  • spaCy NER component for Conditional Random Field entity extraction (via sklearn-crfsuite).
  • Command-line tools for training and evaluation, plus an example notebook.
  • Supports JSON, CoNLL and Markdown annotation formats.
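
To make the Markdown annotation idea concrete, here is a minimal sketch of how inline entity annotations could be turned into character spans. The exact syntax spacy_crfsuite expects is not shown on this page; the `[text](label)` form below is an assumption (modeled on Rasa-style training data), and `parse_annotated` is a hypothetical helper, not part of the library.

```python
import re
from typing import List, Tuple

# ASSUMPTION: entities are annotated inline as [entity text](label).
ENTITY_RE = re.compile(r"\[(?P<text>[^\]]+)\]\((?P<label>[^)]+)\)")


def parse_annotated(line: str) -> Tuple[str, List[Tuple[int, int, str]]]:
    """Return the plain text and (start, end, label) character spans."""
    plain_parts = []
    entities = []
    cursor = 0  # position in the annotated line
    offset = 0  # length of the plain text built so far
    for match in ENTITY_RE.finditer(line):
        before = line[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)
        text = match.group("text")
        entities.append((offset, offset + len(text), match.group("label")))
        plain_parts.append(text)
        offset += len(text)
        cursor = match.end()
    plain_parts.append(line[cursor:])
    return "".join(plain_parts), entities


text, ents = parse_annotated(
    "show [mexican](cuisine) restaurants up [north](location)"
)
```

Here `text` is the sentence with the markup removed, and each tuple in `ents` indexes into that plain text.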

Installation

Python

pip install spacy_crfsuite

🚀 Quickstart

Usage as a spaCy pipeline component

import spacy

from spacy_crfsuite import CRFEntityExtractor

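# Start from a blank English pipeline (tokenizer only)
nlp = spacy.blank('en')
# Load the trained CRF model from disk and add it as a pipeline component
pipe = CRFEntityExtractor(nlp).from_disk("model.pkl")
nlp.add_pipe(pipe)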

doc = nlp("show mexican restaurants up north")
for ent in doc.ents:
    print(ent.text, "--", ent.label_)

# Output:
# mexican -- cuisine
# north -- location

Follow the example notebook to train the CRF entity tagger from a few restaurant search examples.

Train & evaluate CRF entity tagger

Set up configuration file

$ cat << EOF > config.json
{"c1": 0.03, "c2": 0.06}
EOF
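
The same file can be produced from Python. The sketch below assumes that `c1` and `c2` are passed through to the sklearn-crfsuite CRF estimator, where they are the L1 and L2 regularization coefficients; `write_config` is a hypothetical helper used only for illustration.

```python
import json
import os
import tempfile


def write_config(path: str, c1: float, c2: float) -> None:
    """Write CRF hyperparameters to a JSON config file."""
    # ASSUMPTION: c1 = L1 regularization, c2 = L2 regularization,
    # as in sklearn-crfsuite's CRF estimator.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"c1": c1, "c2": c2}, fh, indent=2)


# Write to a temporary directory here; use "config.json" in a real project.
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
write_config(config_path, c1=0.03, c2=0.06)

with open(config_path, encoding="utf-8") as fh:
    loaded = json.load(fh)
```

Larger `c1` values tend to push more feature weights to zero, while larger `c2` values shrink weights more smoothly, so the two are usually tuned together.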

Run training

$ python -m spacy_crfsuite.train examples/example.md -o model/ -c config.json
ℹ Loading config: config.json
ℹ Training CRF entity tagger with 15 examples.
ℹ Saving model to disk
✔ Successfully saved model to file.
/Users/talmago/git/spacy_crfsuite/model/model.pkl

Evaluate on a dataset

$ python -m spacy_crfsuite.eval examples/example.md -m model/model.pkl
ℹ Loading model from file
model/model.pkl
✔ Successfully loaded CRF tagger
<spacy_crfsuite.crf_extractor.CRFExtractor object at 0x126e5f438>
ℹ Loading dev dataset from file
examples/example.md
✔ Successfully loaded 15 dev examples.
⚠ f1 score: 1.0
              precision    recall  f1-score   support

           -      1.000     1.000     1.000         2
   B-cuisine      1.000     1.000     1.000         1
   L-cuisine      1.000     1.000     1.000         1
   U-cuisine      1.000     1.000     1.000         5
  U-location      1.000     1.000     1.000         2

   micro avg      1.000     1.000     1.000        11
   macro avg      1.000     1.000     1.000        11
weighted avg      1.000     1.000     1.000        11
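
The labels in the report (`B-`, `L-`, `U-` prefixes, with `-` for tokens outside any entity) appear to follow the BILOU tagging scheme. As a rough illustration of how such per-token tags map back to entity spans, here is a minimal sketch; `bilou_to_spans` is a hypothetical helper and not the library's own decoding code.

```python
from typing import List, Optional, Tuple


def bilou_to_spans(
    tags: List[str], outside: str = "-"
) -> List[Tuple[int, int, str]]:
    """Convert BILOU tags into (start, end_exclusive, label) token spans."""
    spans = []
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags):
        if tag == outside:
            start, label = None, None
            continue
        prefix, _, name = tag.partition("-")
        if prefix == "U":  # single-token entity
            spans.append((i, i + 1, name))
            start, label = None, None
        elif prefix == "B":  # first token of a multi-token entity
            start, label = i, name
        elif prefix == "I":  # inside token; drop the span if labels disagree
            if label != name:
                start, label = None, None
        elif prefix == "L":  # last token closes the open span
            if start is not None and label == name:
                spans.append((start, i + 1, name))
            start, label = None, None
    return spans


single = bilou_to_spans(["-", "U-cuisine", "-", "-", "U-location"])
multi = bilou_to_spans(["-", "B-cuisine", "L-cuisine", "-"])
```

Malformed sequences (for example an `L-` tag with no matching `B-`) are silently skipped in this sketch; a stricter decoder might raise an error instead.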
