tatsu 4.2.3

TatSu takes a grammar in a variation of EBNF as input, and outputs a memoizing PEG/Packrat parser in Python.

At least for the people who send me mail about a new language that they’re designing, the general advice is: do it to learn about how to write a compiler. Don’t have any expectations that anyone will use it, unless you hook up with some sort of organization in a position to push it hard. It’s a lottery, and some can buy a lot of the tickets. There are plenty of beautiful languages (more beautiful than C) that didn’t catch on. But someone does win the lottery, and doing a language at least teaches you something.

Dennis Ritchie (1941-2011), creator of the C programming language and of Unix

TatSu

TatSu (the successor to Grako) is a tool that takes grammars in a variation of EBNF as input, and outputs memoizing (Packrat) PEG parsers in Python.

TatSu can compile a grammar stored in a string into a tatsu.grammars.Grammar object that can be used to parse any given input, much like the re module does with regular expressions, or it can generate a Python module that implements the parser.

TatSu fully supports left-recursive rules in PEG grammars using the algorithm by Laurent and Mens. The generated AST has the expected left associativity.
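As a minimal sketch of what left associativity means here, the following compiles a small left-recursive grammar and parses a chain of subtractions. The grammar and the to_list helper are illustrative assumptions, not part of TatSu; to_list only normalizes nested sequence results to plain lists so they print uniformly.

```python
import tatsu

# Hypothetical grammar for illustration: 'expr' refers to itself
# in the leftmost position, which makes the rule left-recursive.
GRAMMAR = r'''
    start = expr $ ;

    expr
        =
        | expr '-' number
        | number
        ;

    number = /\d+/ ;
'''


def to_list(node):
    """Convert nested list/tuple parse results into plain lists."""
    if isinstance(node, (list, tuple)):
        return [to_list(child) for child in node]
    return node


model = tatsu.compile(GRAMMAR)
ast = to_list(model.parse('1 - 2 - 3'))
print(ast)
```

With left associativity, the first element of the result should itself be the nested 1 - 2 subtraction rather than the bare '1', so the input reads as (1 - 2) - 3.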

Installation

$ pip install TatSu

Using the Tool

TatSu can be used as a library, much like Python’s re, by embedding grammars as strings and generating grammar models instead of generating Python code.

  • tatsu.compile(grammar, name=None, **kwargs)

    Compiles the grammar and generates a model that can subsequently be used to parse input.

  • tatsu.parse(grammar, input, **kwargs)

    Compiles the grammar and parses the given input, producing an AST as the result. The result is equivalent to calling:

    model = compile(grammar)
    ast = model.parse(input)
    

    Compiled grammars are cached for efficiency.

  • tatsu.to_python_sourcecode(grammar, name=None, filename=None, **kwargs)

    Compiles the grammar to the Python source code that implements the parser.
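The code-generation entry point can be sketched as follows. The tiny grammar and the calc_parser.py file name are assumptions made for illustration; the only TatSu call used is tatsu.to_python_sourcecode as listed above, and the built-in compile() serves purely as a syntax check on the generated text.

```python
import tatsu

# Hypothetical one-rule grammar, just large enough to generate a parser.
GRAMMAR = r'''
    @@grammar::CALC

    start = number $ ;

    number = /\d+/ ;
'''

# Generate the parser module as a string of Python source code.
source = tatsu.to_python_sourcecode(GRAMMAR)

# Sanity check: the built-in compile() raises SyntaxError if the
# generated text is not valid Python. Nothing is executed here.
code = compile(source, 'calc_parser.py', 'exec')

print(len(source.splitlines()), 'lines of generated source')
```

The returned string could then be written to a file such as calc_parser.py and imported like any other module.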

This is an example of how to use 竜 TatSu as a library:

GRAMMAR = r'''
    @@grammar::CALC


    start = expression $ ;


    expression
        =
        | expression '+' term
        | expression '-' term
        | term
        ;


    term
        =
        | term '*' factor
        | term '/' factor
        | factor
        ;


    factor
        =
        | '(' expression ')'
        | number
        ;


    number = /\d+/ ;
'''


if __name__ == '__main__':
    import pprint
    import json
    from tatsu import parse

    ast = parse(GRAMMAR, '3 + 5 * ( 10 - 20 )')
    print('# PPRINT')
    pprint.pprint(ast, indent=2, width=20)
    print()

    print('# JSON')
    print(json.dumps(ast.asjson(), indent=2))
    print()

And this is the output:

# PPRINT
[ '3',
  '+',
  [ '5',
    '*',
    [ '10',
      '-',
      '20']]]

# JSON
[
  "3",
  "+",
  [
    "5",
    "*",
    [
      "10",
      "-",
      "20"
    ]
  ]
]
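When many inputs share one grammar, the re-like workflow is to compile once and reuse the model. The sketch below assumes a deliberately small grammar and a hypothetical try_parse helper; it also assumes that a failed match raises tatsu.exceptions.FailedParse, which the helper turns into None.

```python
import tatsu
from tatsu.exceptions import FailedParse

# Hypothetical grammar for illustration: exactly one addition.
GRAMMAR = r'''
    start = sum $ ;

    sum = number '+' number ;

    number = /\d+/ ;
'''

# Compile once, then reuse the model for every input.
model = tatsu.compile(GRAMMAR)


def try_parse(text):
    """Return the AST for text, or None if it does not match the grammar."""
    try:
        return model.parse(text)
    except FailedParse:
        return None


for text in ['1 + 2', '30 + 4', '1 +']:
    print(repr(text), '->', try_parse(text))
```

The last input is incomplete on purpose, so it shows the failure path next to the two successful parses.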

License

You may use 竜 TatSu under the terms of the BSD-style license described in the enclosed LICENSE.txt file. If your project requires different licensing please email.

Documentation

For a detailed explanation of what 竜 TatSu is capable of, please see the documentation.

Questions?

For general Q&A, please use the [tatsu] tag on StackOverflow.

Changes

See the CHANGELOG for details.

 
File                                    Type          Py Version  Uploaded on  Size
TatSu-4.2.3-py2.py3-none-any.whl (md5)  Python Wheel  py2.py3     2017-07-10   81KB
TatSu-4.2.3.zip (md5)                   Source                    2017-07-10   119KB