Wrangl

Parallel data preprocessing for NLP and ML. See the documentation for details. If you find this work helpful, please consider citing:

@misc{zhong2021wrangl,
  author = {Zhong, Victor},
  title = {Wrangl: Parallel data preprocessing for NLP and ML},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/vzhong/wrangl}}
}

The supervised learning dataset parallelization component of this library uses Ray. The reinforcement learning environment parallelization component of this library uses Torchbeast.

Installation

pip install -e .  # add [dev] if you want to run tests and build docs.

# for latest
pip install git+https://github.com/vzhong/wrangl

# pypi release
pip install wrangl

Usage

See the examples for usage. Here are some common use cases:

- Process data in parallel
  - repeat string
  - parse text using Stanza
- Train models
  - train XOR classifier
  - train CartPole using Monobeast
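
The "repeat string" use case can be illustrated without relying on Wrangl's own API, which is not shown on this page. What follows is a minimal sketch using only Python's standard library: `repeat_string` and `preprocess_parallel` are hypothetical names chosen for illustration, and a thread pool stands in for the Ray-backed workers that the library itself uses.

```python
from concurrent.futures import ThreadPoolExecutor


def repeat_string(text, times=3):
    """Toy preprocessing step (hypothetical): repeat the input string."""
    return text * times


def preprocess_parallel(examples, fn, num_workers=4):
    """Apply fn to every example using a worker pool.

    This is an illustrative stand-in, not Wrangl's API.
    Executor.map yields results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(fn, examples))


if __name__ == "__main__":
    print(preprocess_parallel(["a", "b", "c"], repeat_string))
    # prints ['aaa', 'bbb', 'ccc']
```

Threads keep the sketch portable; for CPU-bound steps such as parsing text, a process pool or Ray workers would likely be the better fit.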

Additional utilities

Annotate data from the command line:

wannotate -h

Run tests

python -m unittest discover tests

Release history

- 0.0.8: May 9, 2022
- 0.0.6: Dec 13, 2021
- 0.0.5: Sep 29, 2021 (this version)
- 0.0.4: Sep 26, 2021
- 0.0.1: Sep 1, 2021

Download files

Source Distribution

wrangl-0.0.5.tar.gz (25.1 kB)

Uploaded Sep 29, 2021 Source

Built Distribution

wrangl-0.0.5-py3-none-any.whl (30.5 kB)

Uploaded Sep 29, 2021 Python 3

Hashes for wrangl-0.0.5.tar.gz
Algorithm	Hash digest
SHA256	`63c4813aa870a2fa6d58311beb53cebf5fe785c486df7769772cf9021360a7c9`
MD5	`015a9be603a89e3b05a8046a14f08ffd`
BLAKE2b-256	`27b035a2b2ca562ba4e86d481f1f4cd55be61b1ad7e70b640e3e4aa41f315c48`

Hashes for wrangl-0.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`88f28d44ed0d84d2a3145cf04d21a6461ddc808d2ebe373e689bd00f7ca4f23e`
MD5	`8eb2c36ea8c120100ab2f63561996105`
BLAKE2b-256	`965c20db5bf8d34ac4db249716e5967b318bf1c560548ab1e1cd63c18e22ec66`
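
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_digest` are hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_digest("wrangl-0.0.5.tar.gz", "63c4813aa870a2fa6d58311beb53cebf5fe785c486df7769772cf9021360a7c9")` should return `True` for an intact copy of the source distribution.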