Transition-based UCCA Parser

TUPA is a transition-based parser for Universal Conceptual Cognitive Annotation (UCCA).

Requirements

Build

Create a virtual environment:

virtualenv --python=/usr/bin/python3 venv
. venv/bin/activate              # on bash
source venv/bin/activate.csh     # on csh

Install:

python setup.py install

Train the parser

Given a directory of UCCA passage files (for example, the Wiki corpus), run:

python tupa/parse.py -t <train_dir> -d <dev_dir> -c <model_type> -m <model_filename>

The possible model types are sparse, mlp and bilstm.

Parse a text file

Run the parser on a text file (here named example.txt) using a trained model:

python tupa/parse.py example.txt -c <model_type> -m <model_filename>

One XML file will be created per passage (passages are separated by blank lines in the text file).
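
To make the passage convention concrete, here is a minimal Python sketch of how a text file breaks into passages at blank lines. It is an illustration only, not TUPA's own reader: `split_passages` is a hypothetical helper, and the parser itself may treat whitespace differently.

```python
import re


def split_passages(text):
    """Split raw text into passages separated by one or more blank lines.

    Hypothetical helper for illustration only; it is not part of TUPA.
    """
    # A blank line is a line that is empty or contains only whitespace.
    chunks = re.split(r"\n\s*\n", text.strip())
    return [chunk.strip() for chunk in chunks if chunk.strip()]


if __name__ == "__main__":
    sample = "First passage, line one.\nFirst passage, line two.\n\nSecond passage.\n"
    for index, passage in enumerate(split_passages(sample), start=1):
        print(index, repr(passage))
```

Under this reading, the sample text holds two passages, so parsing it would be expected to produce two output files.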

Pre-trained models

To download and extract the pre-trained models, run:

curl --remote-name-all http://www.cs.huji.ac.il/~danielh/ucca/{sparse,mlp,bilstm}.tar.gz
tar xvzf sparse.tar.gz
tar xvzf mlp.tar.gz
tar xvzf bilstm.tar.gz
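
Where curl or tar is not available, the same download step could be sketched in Python with the standard library. The URLs are the ones from the curl command above; whether they are still reachable is not verified here, and `model_archive_urls` and `download_and_extract` are hypothetical helper names, not part of TUPA.

```python
import tarfile
import urllib.request

# Base URL and model names are taken from the curl command above.
BASE_URL = "http://www.cs.huji.ac.il/~danielh/ucca/"
MODEL_TYPES = ("sparse", "mlp", "bilstm")


def model_archive_urls(base_url=BASE_URL, model_types=MODEL_TYPES):
    """Return one archive URL per model type."""
    return [f"{base_url}{name}.tar.gz" for name in model_types]


def download_and_extract(url, dest_dir="."):
    """Download one archive into the current directory and unpack it into dest_dir."""
    filename = url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, filename)
    with tarfile.open(filename, "r:gz") as archive:
        archive.extractall(dest_dir)
```

Calling `download_and_extract(url)` for each entry of `model_archive_urls()` mirrors the curl and tar commands; the functions are defined without being invoked so that importing the sketch causes no network access.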

Run the parser using any of them:

python tupa/parse.py example.txt -c sparse -m models/sparse
python tupa/parse.py example.txt -c mlp -m models/mlp
python tupa/parse.py example.txt -c bilstm -m models/bilstm
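
To compare all three models on the same input, the commands above can be generated rather than typed by hand. The following is a small sketch, assuming the `models/<model_type>` layout shown above; `build_parse_command` is a hypothetical helper that only assembles the argument list and does not run the parser.

```python
import shlex

# Model types listed in this README.
MODEL_TYPES = ("sparse", "mlp", "bilstm")


def build_parse_command(text_file, model_type, models_dir="models"):
    """Build the argument list for one parser run, mirroring the commands above."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    return ["python", "tupa/parse.py", text_file,
            "-c", model_type, "-m", f"{models_dir}/{model_type}"]


if __name__ == "__main__":
    for model_type in MODEL_TYPES:
        # Only print each command; to execute one, pass the list to
        # subprocess.run(..., check=True).
        print(shlex.join(build_parse_command("example.txt", model_type)))
```

Keeping the command as a list avoids shell quoting problems if the input file name contains spaces.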

Author

Citation

If you make use of this software, please cite the following paper:

@inproceedings{hershcovich2017a,
  title={A Transition-Based Directed Acyclic Graph Parser for {UCCA}},
  author={Hershcovich, Daniel and Abend, Omri and Rappoport, Ari},
  booktitle={Proc. of ACL},
  year={2017}
}

The version of the parser used in the paper is v1.0. To reproduce the experiments from the paper, run the following in an empty directory (with a new virtualenv):

pip install "tupa>=1.0,<1.1"
mkdir pickle models
curl -L http://www.cs.huji.ac.il/~danielh/ucca/ucca_corpus_pickle.tgz | tar xz -C pickle
curl --remote-name-all http://www.cs.huji.ac.il/~danielh/ucca/{sparse,mlp,bilstm}.tgz
tar xvzf sparse.tgz
tar xvzf mlp.tgz
tar xvzf bilstm.tgz
python -m spacy download en
python -m scripts.split_corpus pickle -t 4282 -d 454 -l
python -m tupa.parse -c sparse -m models/ucca-sparse -Web pickle/test
python -m tupa.parse -c mlp -m models/ucca-mlp -Web pickle/test
python -m tupa.parse -c bilstm -m models/ucca-bilstm -Web pickle/test

License

This package is licensed under the GPLv3 or later license (see LICENSE.txt).
