Inter-rater agreement measure Phi, an alternative to Krippendorff's alpha, as described in https://github.com/AlessandroChecco/agreement-phi

Development Status
- 3 - Alpha
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Programming Language
- Python :: 3
Topic
- Scientific/Engineering :: Information Analysis

Project description

Agreement measure Phi

Source code for inter-rater agreement measure Phi. Live demo here: http://agreement-measure.sheffield.ac.uk

Requirements

Python 3+, pymc3 3.3+. See the requirements files for the versions tested on Linux and macOS.

Installation - with pip

Simply run pip install agreement_phi. This will provide a module and a command line executable called run_phi.

Installation - without pip

Download the repository folder from https://github.com/AlessandroChecco/agreement-phi.

Example - from command line

Prepare a CSV file (no header; each row is a document, each column a rater), leaving missing values empty. For example, input.csv:

1,2,,3
1,1,2,
4,3,2,1

Then run run_phi --file input.csv --limits 1 4 from the console. More details are available by running run_phi --help:

usage: agreement_phi.py [-h] -f FILE [-v] [-l val val]

Phi Agreement Measure

optional arguments:
  -h, --help                     show this help message and exit
  -f FILE, --file FILE           input FILE <REQUIRED>
  -v, --verbose                  print verbose messages
  -l val val, --limits val val   Set limits <RECOMMENDED> (two values separated by a space)
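
The same file layout can also be read into an array for use from Python. The following is a minimal sketch and not part of the package: load_ratings is a hypothetical helper name, and the only assumption is the CSV layout described above (no header, one document per row, empty cells for missing ratings).

```python
import csv
import io
import math

import numpy as np


def load_ratings(csv_file):
    """Read a header-less ratings CSV into a 2-D float array (NaN = missing)."""
    rows = []
    for record in csv.reader(csv_file):
        if not record:  # skip completely blank lines
            continue
        # An empty cell means this rater did not rate this document.
        rows.append([float(cell) if cell.strip() else math.nan for cell in record])
    return np.array(rows, dtype=float)


# The example input.csv from above, inlined so the sketch is self-contained.
example = io.StringIO("1,2,,3\n1,1,2,\n4,3,2,1\n")
ratings = load_ratings(example)
print(ratings.shape)                 # (3, 4)
print(int(np.isnan(ratings).sum()))  # 2
```

To read a real file, pass an open file object instead of the StringIO, for example open("input.csv", newline="").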

Example - from python

Input is a 2-dimensional numpy array with NaN for missing values, or equivalently a Python list of lists (where each list is the set of ratings for one document, all of the same length, with NaN padding as needed). Every row represents a different document, every column a different rating. Note that Phi does not take rater bias into account, so the order in which ratings appear for each document does not matter. For this reason, missing values and a sparse representation are needed only when documents have different numbers of ratings.

Input example

import numpy as np
m_random = np.random.randint(5, size=(5, 10)).tolist()
m_random[0][1]=np.nan

or equivalently

m_random = np.random.randint(5, size=(5, 10)).astype(float)
m_random[0][1]=np.nan
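
When documents have different numbers of ratings, the rows have to be padded with NaN to a common length. A minimal sketch of that padding step follows; pad_ratings is a hypothetical helper introduced only for illustration and is not part of agreement_phi.

```python
import numpy as np


def pad_ratings(ratings_per_document):
    """Pad per-document rating lists with NaN so all rows have equal length."""
    width = max(len(r) for r in ratings_per_document)
    padded = np.full((len(ratings_per_document), width), np.nan)
    for i, r in enumerate(ratings_per_document):
        # Order within a row does not matter, so ratings are simply left-aligned.
        padded[i, :len(r)] = r
    return padded


ragged = [[1, 2, 3], [1, 1], [4, 3, 2, 1]]
m = pad_ratings(ragged)
print(m.shape)  # (3, 4)
```

The resulting array has the shape described above: one row per document, with NaN wherever a document has fewer ratings than the longest row.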

Running the measure inference

from agreement_phi import run_phi
run_phi(data=m_random,limits=[0,4],keep_missing=True,fast=True,njobs=4,verbose=False,table=False,N=500)

data [required] is the input matrix or list of lists (all lists of the same length, with NaN padding if needed).

OPTIONAL PARAMETERS:

limits [automatically inferred by default] list. Defines the scale: the minimum and maximum (both included) of the scale.
keep_missing [automatically inferred by default from the number of NaNs] boolean. If you have many NaNs you might want to set it to False.
fast [default True] boolean. Whether to use the fast inferential technique.
N [default 1000] integer. Number of iterations. Increase it if convergence_test is False in the output.
verbose [default False] boolean. If True, shows more information.
table [default False] boolean. If True, gives more verbose output in the form of a table.
njobs [default 1] integer. Number of parallel jobs. Set it equal to the number of CPUs available.
binning [default True] boolean. If False, values on the boundary of the scale are not binned: on a discrete scale they are treated as lying exactly on the limits rather than at the center of the corresponding bin. This is useful when the boundary values carry a strong meaning (for example [absolutely not, a bit, medium, totally]), where an answer on the boundary of the scale should not be placed in a bin as close to the second step of the scale.

Note that the code will try to infer the limits of the scale, but it is strongly recommended to provide them (in case some values on the boundary never occur in the data). For this example the parameter would be limits=[0,4].

Note that keep_missing will be inferred automatically, but for highly imbalanced datasets (in terms of the distribution of the number of ratings per document) it can be overridden by setting this option manually.
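
Both notes can be checked against the data before running the inference. The sketch below is an illustration only: summarize_ratings is a hypothetical helper, and it does not reproduce how the package infers limits or keep_missing internally. It reports the extremes that actually occur in the data and the number of ratings per document.

```python
import numpy as np


def summarize_ratings(data):
    """Return the observed [min, max] and the number of ratings per document."""
    m = np.asarray(data, dtype=float)
    observed = [float(np.nanmin(m)), float(np.nanmax(m))]
    per_document = (~np.isnan(m)).sum(axis=1)
    return observed, per_document


# Ratings collected on a 0-4 scale where nobody happened to answer 0 or 4.
data = [[1, 2, np.nan, 3], [1, 1, 2, np.nan], [3, 3, 2, 1]]
observed, per_document = summarize_ratings(data)
print(observed)               # [1.0, 3.0]
print(per_document.tolist())  # [3, 3, 4]
```

Here the observed range [1, 3] is narrower than the real scale, which is the case where passing limits=[0,4] explicitly matters. A very uneven per_document count is the situation in which setting keep_missing by hand may be worth considering.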

Output example

{'agreement': 0.023088447111559884, 'computation_time': 58.108173847198486, 'convergence_test': True, 'interval': array([-0.03132854,  0.06889001])}

Here 'interval' is the 95% Highest Posterior Density interval. If convergence_test is False, we recommend increasing N.
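
The advice on N can be automated with a small retry loop. This is a sketch under stated assumptions, not part of the package: run_until_converged, its n_start and n_max parameters, and the doubling schedule are all hypothetical choices. It relies only on the data, limits and N keyword arguments and the convergence_test key documented above. A stand-in function replaces run_phi so the example runs without the package installed.

```python
def run_until_converged(run, data, limits, n_start=500, n_max=4000):
    """Call `run` with a doubling N until convergence_test is True or n_max is hit."""
    n = n_start
    while True:
        result = run(data=data, limits=limits, N=n)
        if result["convergence_test"] or n * 2 > n_max:
            return n, result
        n *= 2


# Stand-in for run_phi: it pretends that inference converges once N reaches 2000.
def fake_run_phi(data, limits, N):
    return {"agreement": 0.0, "convergence_test": N >= 2000}


n_used, result = run_until_converged(fake_run_phi, data=[[1, 2], [2, 2]], limits=[1, 4])
print(n_used, result["convergence_test"])  # 2000 True
```

With the package installed, the run argument would be the run_phi function imported from agreement_phi. Since the example output above reports a computation time of about 58 seconds, capping N with an upper bound keeps the total running time predictable.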

References

If you use it for academic publications, please cite our paper:

Checco, A., Roitero, A., Maddalena, E., Mizzaro, S., & Demartini, G. (2017). Let’s Agree to Disagree: Fixing Agreement Measures for Crowdsourcing. In Proceedings of the Fifth AAAI Conference on Human Computation and Crowdsourcing (HCOMP-17) (pp. 11-20). AAAI Press.

@inproceedings{checco2017let,
  title={Let’s Agree to Disagree: Fixing Agreement Measures for Crowdsourcing},
  author={Checco, A and Roitero, A and Maddalena, E and Mizzaro, S and Demartini, G},
  booktitle={Proceedings of the Fifth AAAI Conference on Human Computation and Crowdsourcing (HCOMP-17)},
  pages={11--20},
  year={2017},
  organization={AAAI Press}
}

Release history

- 0.3.0 (this version): Aug 7, 2018
- 0.2.9: Aug 7, 2018
- 0.2.8: Aug 7, 2018
- 0.2.7: Aug 7, 2018
- 0.2.6: Aug 7, 2018
- 0.2.5: Aug 7, 2018
- 0.2.4: Aug 7, 2018
- 0.2.2: Aug 7, 2018
- 0.2.1: Aug 7, 2018
- 0.2.0: Aug 7, 2018
- 0.1.9: Aug 7, 2018
- 0.1.8: Aug 7, 2018
- 0.1.7: Aug 7, 2018
- 0.1.6: Aug 7, 2018

Download files

Source Distribution

agreement_phi-0.3.0.tar.gz (9.0 kB)

Uploaded Aug 7, 2018 (source)

Built Distribution

agreement_phi-0.3.0-py3-none-any.whl (7.2 kB)

Uploaded Aug 7, 2018 (Python 3)

Hashes for agreement_phi-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`1c3ad841382d64cfcc80e5c6b95871753a9dbc9d7b73ea7abc8eea4e549a72c4`
MD5	`a65f712919b65e5aef94734b1f6660d8`
BLAKE2b-256	`618b2ccd21ce8c42527f19e50e4adfcfb187a805871b618461b014d8886daf33`

Hashes for agreement_phi-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c8113d1ea7da6d94d4b954efbbe34a6201e8a8f1527770e06084ebc4abd180f3`
MD5	`ec2993ebb090be5c9528b027da86cde8`
BLAKE2b-256	`342a1903fe1f060a161eb9b35b849cb54495b100d12e8e42962e8fb5a2810826`