A graph-based transliteration tool

Graph Transliterator

A graph-based transliteration tool that lets you convert the symbols of one language or script to those of another using rules that you define.

Features

  • Provides a transliteration tool that can be configured to convert the tokens of an input string into an output string using:

    • user-defined types of input tokens and token classes

    • transliteration rules based on:

      • a sequence of input tokens

      • specific input tokens that precede or follow the token sequence

      • classes of input tokens preceding or following specified tokens

    • “on match” rules that insert output between matched transliteration rules when particular token classes are involved

    • defined rules for whitespace, including its optional consolidation

  • Can be set up using:

    • an “easy reading” YAML format that lets you quickly craft settings for the transliteration tool

    • “direct” settings, perhaps passed programmatically, using a dictionary

  • Automatically orders rules by the number of tokens in a transliteration rule

  • Checks for ambiguity in transliteration rules

  • Can provide details about each transliteration rule match

  • Allows optional matching of all possible rules in a particular location

  • Permits pruning of rules with certain productions

  • Can be serialized as a dictionary for export to JSON, etc.

  • Provides full support for Unicode, including Unicode character names in the “easy reading” YAML format

  • Constructs and uses a directed tree and performs a best-first search to find the most specific transliteration rule in a given context
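
The rule-matching behavior described above can be illustrated with a small, self-contained sketch. It does not use Graph Transliterator itself and is not the library's implementation: the names `Rule`, `TOKEN_CLASSES`, `RULES`, `tokenize`, `matches`, and `transliterate` are hypothetical, tokens are assumed to be single characters, and a plain linear scan over rules sorted by size stands in for the directed tree and best-first search. It only shows the idea of preferring the most specific rule whose tokens and surrounding token classes fit the current position.

```python
from dataclasses import dataclass

# Hypothetical token table: each input token maps to its token classes.
TOKEN_CLASSES = {
    "a": {"vowel"},
    "n": {"consonant"},
    "g": {"consonant"},
    " ": {"whitespace"},
}


@dataclass(frozen=True)
class Rule:
    tokens: tuple             # sequence of input tokens the rule consumes
    production: str           # output emitted when the rule matches
    prev_classes: tuple = ()  # classes required of the tokens just before
    next_classes: tuple = ()  # classes required of the tokens just after

    @property
    def specificity(self):
        # Assumption for this sketch: more constrained positions = more specific.
        return len(self.tokens) + len(self.prev_classes) + len(self.next_classes)


# Hypothetical rule set, listed in no particular order.
RULES = [
    Rule(("a",), "A"),
    Rule(("n",), "N"),
    Rule(("g",), "G"),
    Rule(("n", "g"), "NG"),
    Rule(("n",), "M", next_classes=("whitespace",)),
    Rule((" ",), " "),
]


def tokenize(text):
    """Split text into single-character tokens, rejecting unknown ones."""
    tokens = []
    for ch in text:
        if ch not in TOKEN_CLASSES:
            raise ValueError(f"unrecognizable input token: {ch!r}")
        tokens.append(ch)
    return tokens


def matches(rule, tokens, i):
    """Return True if rule fits at position i, including its class context."""
    end = i + len(rule.tokens)
    if tuple(tokens[i:end]) != rule.tokens:
        return False
    start = i - len(rule.prev_classes)
    if start < 0 or end + len(rule.next_classes) > len(tokens):
        return False
    before = zip(rule.prev_classes, tokens[start:i])
    after = zip(rule.next_classes, tokens[end:end + len(rule.next_classes)])
    return all(cls in TOKEN_CLASSES[tok] for pairs in (before, after)
               for cls, tok in pairs)


def transliterate(text):
    """Apply the most specific matching rule at each position, left to right."""
    tokens = tokenize(text)
    # Order rules so that larger (more specific) rules are tried first.
    ordered = sorted(RULES, key=lambda r: r.specificity, reverse=True)
    output, i = [], 0
    while i < len(tokens):
        for rule in ordered:
            if matches(rule, tokens, i):
                output.append(rule.production)
                i += len(rule.tokens)
                break
        else:
            raise ValueError(f"no matching rule at position {i}")
    return "".join(output)


print(transliterate("nga"))   # the two-token rule wins over "n" then "g"
print(transliterate("an a"))  # "n" before whitespace uses the contextual rule
```

In this sketch the two-token rule for `n g` and the context-dependent rule for `n` before whitespace are both tried ahead of the bare `n` rule, which is the effect that ordering rules by token count is meant to achieve. A tree or graph of rules reaches the same choice without scanning every rule at every position.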

History

[Unreleased - Maybe]

  • Add CLI

  • Add metadata guidelines

  • Save match location in tokenize

  • Reconsider serialization keys

  • Add tests directly to YAML files

  • Allow insertion of transliteration error messages into output

  • Fix Devanagari output in doc PDF

[Unreleased-TODO]

0.2.13 (2019-08-03)

  • changed setup.cfg to use double quotes in bumpversion, matching Black's formatting of setup.py

  • added version test

0.2.12 (2019-08-03)

  • fixed version error in setup.py

0.2.11 (2019-08-03)

  • travis issue

0.2.10 (2019-08-03)

  • fixed the version test, which was not working on travis

0.2.9 (2019-08-03)

  • Used Black code formatter

  • Adjusted tox.ini, contributing.rst

  • Set development status to Beta in setup.py

  • Added black badge to README.rst

  • Fixed comments and minor changes in initialize.py

0.2.8 (2019-07-30)

  • Fixed ambiguity check if no rules present

  • Updates to README.rst

0.2.7 (2019-07-28)

  • Modified docs/conf.py

  • Modified equation in docs/usage.rst and paper/paper.md to fix doc build

0.2.6 (2019-07-28)

  • Fixes to README.rst, usage.rst, paper.md, and tutorial.rst

  • Modifications to core.py documentation

0.2.5 (2019-07-24)

  • Fixes to HISTORY.rst and README.rst

  • 100% test coverage.

  • Added draft of paper.

  • Added graphtransliterator_version to serialize().

0.2.4 (2019-07-23)

  • minor changes to readme

0.2.3 (2019-07-23)

  • added xenial to travis.yml

0.2.2 (2019-07-23)

  • added CI

0.2.1 (2019-07-23)

  • fixed HISTORY.rst for PyPI

0.2.0 (2019-07-23)

  • Fixed module naming in docs using __module__.

  • Converted DirectedGraph nodes to a list.

  • Added Code of Conduct.

  • Added GraphTransliterator class.

  • Updated module dependencies.

  • Added requirements.txt

  • Added check_settings parameter to skip validating settings.

  • Added tests for ambiguity and check_ambiguity parameter.

  • Changed name to Graph Transliterator in docs.

  • Created core.py, validate.py, process.py, rules.py, initialize.py, exceptions.py, graphs.py

  • Added ignore_errors property and setter for transliteration exceptions (UnrecognizableInputToken, NoMatchingTransliterationRule)

  • Added logging to graphtransliterator

  • Added positive cost function based on number of matched tokens in rule

  • added metadata field

  • added documentation

0.1.1 (2019-05-30)

  • Adjusted copyright in docs.

  • Removed Python 2 support.

0.1.0 (2019-05-30)

  • First release on PyPI.
