Evaluation toolkit for neural language generation.

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Jury

Jury is a simple toolkit for evaluating natural language generation (NLG) with a range of automated metrics. It offers a smooth, easy-to-use interface. It uses the datasets library for the underlying metric computation, so adding a custom metric is as easy as adopting datasets.Metric.

The main advantages of Jury are:

Easy to use for any NLG system.
Calculate many metrics at once.
Metric calculations are handled concurrently to save processing time.
It supports evaluating multiple predictions.
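
As a rough illustration of the second and third points, the sketch below computes several metrics at once with a thread pool. It is not Jury's internal implementation: the two toy metrics (exact_match and unigram_precision) and the evaluate_all helper are hypothetical names introduced here, and only the standard library is used.

```python
from concurrent.futures import ThreadPoolExecutor


def exact_match(predictions, references):
    """Toy metric: fraction of predictions identical to their reference."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(predictions)


def unigram_precision(predictions, references):
    """Toy metric: mean share of predicted tokens that occur in the reference."""
    scores = []
    for pred, ref in zip(predictions, references):
        pred_tokens = pred.lower().split()
        ref_tokens = set(ref.lower().split())
        matched = sum(token in ref_tokens for token in pred_tokens)
        scores.append(matched / len(pred_tokens) if pred_tokens else 0.0)
    return sum(scores) / len(scores)


def evaluate_all(metrics, predictions, references):
    """Submit every metric to a worker pool and collect the scores by name."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(fn, predictions, references)
            for name, fn in metrics.items()
        }
        return {name: future.result() for name, future in futures.items()}


predictions = ["Peace in the dormitory, peace in the world."]
references = ["Peace at home, peace in the world."]
metrics = {"exact_match": exact_match, "unigram_precision": unigram_precision}
scores = evaluate_all(metrics, predictions, references)
print(scores)
```

Each metric runs as an independent task, so adding another one only means adding an entry to the metrics dictionary.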

To see more, check the official Jury blog post.

Installation

Through pip,

pip install jury

or build from source,

git clone https://github.com/obss/jury.git
cd jury
python setup.py install

Usage

API Usage

Evaluating generated outputs takes only two lines of code.

from jury import Jury

jury = Jury()

# Microsoft translator translation for "Yurtta sulh, cihanda sulh." (16.07.2021)
predictions = ["Peace in the dormitory, peace in the world."]
references = ["Peace at home, peace in the world."]
scores = jury.evaluate(predictions, references)

Specify the metrics you want to use at instantiation.

jury = Jury(metrics=["bleu", "meteor"])
scores = jury.evaluate(predictions, references)

CLI Usage

You can pass the paths of a predictions file and a references file to get the resulting scores. The two files must be line-aligned: line N of the predictions file is evaluated against line N of the references file.

jury eval --predictions /path/to/predictions.txt --references /path/to/references.txt --reduce_fn max
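
Before running the command, it can help to confirm that the two files really are line-aligned. The sketch below is one way to do that with the standard library; load_pairs is a hypothetical helper, not part of Jury, and the example files are written to a temporary directory purely for illustration.

```python
import tempfile
from pathlib import Path


def load_pairs(predictions_path, references_path):
    """Read both files and pair line i of one with line i of the other."""
    predictions = Path(predictions_path).read_text(encoding="utf-8").splitlines()
    references = Path(references_path).read_text(encoding="utf-8").splitlines()
    if len(predictions) != len(references):
        raise ValueError(
            f"{len(predictions)} predictions but {len(references)} references"
        )
    return list(zip(predictions, references))


# Write two small example files into a temporary directory.
workdir = Path(tempfile.mkdtemp())
predictions_file = workdir / "predictions.txt"
references_file = workdir / "references.txt"
predictions_file.write_text(
    "Peace in the dormitory, peace in the world.\n", encoding="utf-8"
)
references_file.write_text("Peace at home, peace in the world.\n", encoding="utf-8")

pairs = load_pairs(predictions_file, references_file)
print(pairs)
```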

To use metrics other than the defaults, list them under the metrics key of a JSON config file.

{
  "predictions": "/path/to/predictions.txt",
  "references": "/path/to/references.txt",
  "reduce_fn": "max",
  "metrics": [
    "bleu",
    "meteor"
  ]
}

Then call jury eval with the config argument.

jury eval --config path/to/config.json
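
If the metric list is chosen programmatically, the config file can be generated rather than written by hand. A minimal sketch using only the standard library: the keys mirror the example config above, the paths are the same placeholders, and the temporary output location is an assumption made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Same keys as the example config; the paths are placeholders.
config = {
    "predictions": "/path/to/predictions.txt",
    "references": "/path/to/references.txt",
    "reduce_fn": "max",
    "metrics": ["bleu", "meteor"],
}

config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")

# Read the file back to confirm it is valid JSON with the expected keys.
loaded = json.loads(config_path.read_text(encoding="utf-8"))
print(sorted(loaded))
```

The resulting file path is what would be passed to the config argument.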

Custom Metrics

You can add custom metrics by inheriting from jury.metrics.Metric. The currently available metrics can be seen in datasets/metrics. The snippet below gives a brief outline.

from jury.metrics import Metric


class CustomMetric(Metric):
    def compute(self, predictions, references):
        # Compute and return the score for the given predictions and references.
        pass
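
To make the pattern concrete without requiring Jury to be installed, the sketch below uses a stand-in Metric base class (an assumption, not Jury's actual class) and a hypothetical ExactMatch metric. With Jury installed, the subclass would inherit from jury.metrics.Metric instead. The return format Jury expects is not specified here, so the dictionary returned below is only illustrative.

```python
class Metric:
    """Stand-in for jury.metrics.Metric so this sketch runs on its own."""

    def compute(self, predictions, references):
        raise NotImplementedError


class ExactMatch(Metric):
    """Hypothetical custom metric: share of predictions equal to their reference."""

    def compute(self, predictions, references):
        if len(predictions) != len(references):
            raise ValueError("predictions and references must have the same length")
        if not predictions:
            return {"exact_match": 0.0}
        hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
        return {"exact_match": hits / len(predictions)}


metric = ExactMatch()
result = metric.compute(
    predictions=["Peace at home, peace in the world.", "Hello there."],
    references=["Peace at home, peace in the world.", "Hello world."],
)
print(result)
```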

Contributing

PRs are always welcome :)

Installation

git clone https://github.com/obss/jury.git
cd jury
pip install -e .[develop]

Tests

To run the tests, simply run:

python tests/run_tests.py

Code Style

To check code style,

python tests/run_code_style.py check

To format the codebase,

python tests/run_code_style.py format

License

Licensed under the MIT License.

Release history

Version	Release date
2.3	Oct 8, 2023
2.2.4	Jun 15, 2023
2.2.3	Dec 26, 2022
2.2.2	Sep 30, 2022
2.2.1	Sep 21, 2022
2.2	Mar 29, 2022
2.1.5	Dec 23, 2021
2.1.4	Dec 6, 2021
2.1.3	Dec 1, 2021
2.1.2	Nov 14, 2021
2.1.1	Nov 10, 2021
2.1.0	Oct 25, 2021
2.0.0	Oct 11, 2021
1.1.2	Sep 15, 2021
1.1.1	Aug 15, 2021 (this version)
1.0.1	Aug 13, 2021
1.0.0	Aug 9, 2021
0.0.6	Jul 28, 2021
0.0.5	Jul 27, 2021
0.0.4	Jul 26, 2021
0.0.3	Jul 26, 2021
0.0.2	Jul 14, 2021

Download files

Source Distribution

jury-1.1.1.tar.gz (14.8 kB)

Uploaded Aug 15, 2021 Source

Built Distribution

jury-1.1.1-py3-none-any.whl (19.9 kB)

Uploaded Aug 15, 2021 Python 3

Hashes for jury-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`9d3e0a2cec8523d29b1b5667f5076ddacafa6b918a6e963d53b698b15e0edebe`
MD5	`b30dc300bff420c8ad1c60ce1a454f4c`
BLAKE2b-256	`8c3c8b9cd8c43eb2a39859d190324f474c6fc95a317421efb29032b7c18951e7`

Hashes for jury-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1c9d0b2f856de02837900cb21847f8a12abe0faaf4f621841ed44ba7056edc9b`
MD5	`85a95e02e4fe844621472f7321f9c6aa`
BLAKE2b-256	`a17544bbbd00b209efca8dc382f9798bf331e1ab08ad77ee92fc8fa5094ce133`
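
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below is one way to do so; sha256_of and matches_published are hypothetical helper names, and the demonstration hashes a throwaway file rather than the real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file. For a real check, pass the path of the
# downloaded jury-1.1.1 archive and the SHA256 value from the table above.
sample = Path(tempfile.mkdtemp()) / "sample.bin"
sample.write_bytes(b"abc")
print(sha256_of(sample))
```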