A Python package for benchmarking interpretability techniques.
Free software: MIT license
Documentation: https://ferret.readthedocs.io.
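ferret is distributed on PyPI as ferret-xai (the wheel listed at the bottom of this page carries that distribution name), so installation should be a standard pip call:

pip install -U ferret-xai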
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

# Load any sequence classification model and its tokenizer from the Hugging Face Hub
name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

bench = Benchmark(model, tokenizer)

# Explain a single text for a target class (here, class index 1) ...
explanations = bench.explain("You look stunning!", target=1)

# ... and evaluate the explanations with all supported metrics
evaluations = bench.evaluate_explanations(explanations, target=1)
bench.show_evaluation_table(evaluations)
Features
ferret integrates painlessly with Hugging Face models and naming conventions: if you are already using the transformers library, you get immediate access to our Explanation and Evaluation API.
Supported Post-hoc Explainers
Gradient (plain gradients or multiplied by input token embeddings) (Simonyan et al., 2014)
Integrated Gradient (plain gradients or multiplied by input token embeddings) (Sundararajan et al., 2017)
SHAP (via Partition SHAP approximation of Shapley values) (Lundberg and Lee, 2017)
LIME (Ribeiro et al., 2016)
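If you only need a subset of explainers, you can restrict the benchmark to them. A minimal sketch follows; note that the class names (SHAPExplainer, LIMEExplainer) and the explainers argument of Benchmark are our assumptions about ferret's API and may differ across versions:

from ferret import Benchmark, SHAPExplainer, LIMEExplainer  # assumed exports

# Assumption: Benchmark accepts an explicit list of explainer instances
bench = Benchmark(
    model,
    tokenizer,
    explainers=[SHAPExplainer(model, tokenizer), LIMEExplainer(model, tokenizer)],
)
explanations = bench.explain("You look stunning!", target=1)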
Supported Evaluation Metrics
Faithfulness measures:
AOPC Comprehensiveness (DeYoung et al., 2020)
AOPC Sufficiency (DeYoung et al., 2020)
Kendall's Tau correlation with Leave-One-Out token removal (Jain and Wallace, 2019)
Plausibility measures:
Area Under the Precision-Recall Curve (AUPRC, soft score) (DeYoung et al., 2020)
Token F1 (hard score) (DeYoung et al., 2020)
Token Intersection Over Union (hard score) (DeYoung et al., 2020)
See our paper for details.
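To make the faithfulness metrics concrete, here is a minimal, self-contained sketch of AOPC comprehensiveness: remove the k tokens with the highest attribution scores, re-run the model, and average the drop in target-class probability over several removal budgets k. This illustrates the metric from DeYoung et al. (2020), not ferret's own implementation; in particular, it assumes scores is aligned with tokenizer.tokenize(text):

import torch

def aopc_comprehensiveness(model, tokenizer, text, scores, target, ks=(1, 3, 5)):
    # Probability assigned to the target class for a given token list
    def prob(tokens):
        enc = tokenizer(tokenizer.convert_tokens_to_string(tokens), return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits
        return torch.softmax(logits, dim=-1)[0, target].item()

    tokens = tokenizer.tokenize(text)
    full_prob = prob(tokens)
    # Token indices ranked by attribution score, most important first
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    drops = []
    for k in ks:
        removed = set(ranked[:k])
        kept = [tok for i, tok in enumerate(tokens) if i not in removed]
        drops.append(full_prob - prob(kept))
    # AOPC: mean probability drop across the removal budgets
    return sum(drops) / len(drops)

A faithful explanation yields a large drop (high comprehensiveness); sufficiency is the mirror image, keeping only the top-k tokens instead of removing them.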
Visualization
The Benchmark class exposes easy-to-use table visualization methods (e.g., within Jupyter notebooks).
bench = Benchmark(model, tokenizer)
# Pretty-print feature attribution scores by all supported explainers
explanations = bench.explain("You look stunning!")
bench.show_table(explanations)
# Pretty-print all the supported evaluation metrics
evaluations = bench.evaluate_explanations(explanations)
bench.show_evaluation_table(evaluations)
Dataset Evaluations
The Benchmark class has a handy method to compute and average our evaluation metrics across multiple samples from a dataset.
import numpy as np
bench = Benchmark(model, tokenizer)
# Compute and average evaluation scores on one of the supported datasets
samples = np.arange(20)  # indices of the first 20 samples
hatexdata = bench.load_dataset("hatexplain")
sample_evaluations = bench.evaluate_samples(hatexdata, samples)
# Pretty-print the results
bench.show_samples_evaluation_table(sample_evaluations)
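If your texts are not part of a bundled dataset, you can fall back on the per-text API from the quickstart. The sketch below uses only calls shown above and simply collects the per-text evaluation objects; how best to aggregate them depends on their internal structure, which we leave aside here:

texts = ["You look stunning!", "You look dreadful!"]
evaluations_per_text = []
for text in texts:
    explanations = bench.explain(text, target=1)
    evaluations_per_text.append(bench.evaluate_explanations(explanations, target=1))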
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
Cookiecutter: https://github.com/audreyr/cookiecutter
audreyr/cookiecutter-pypackage: https://github.com/audreyr/cookiecutter-pypackage
Logo and graphical assets made by Luca Attanasio.
Hashes for ferret_xai-0.3.5-py2.py3-none-any.whl

Algorithm   | Hash digest
------------|------------
SHA256      | 3e7b32d2c1385ea42888202cc96d7d79ba0d2500480494b886ae61a3f627d453
MD5         | 02ee0e63ff6617e7f9a4291e4f4656ac
BLAKE2b-256 | c50018a29ac119cc6e0cf258bfd06f29a9ad69a9beb013c8a8da1b3cb5baceb8