An interpreter for grammar files as defined by TextMate and used in VSCode, implemented in Python. TextMate grammars use the oniguruma dialect (https://github.com/kkos/oniguruma). Supports loading grammar files from JSON, PLIST, or YAML format.
textmate-grammar-python
Usage
Install the module using pip install textmate-grammar-python.
Before tokenization is possible, a LanguageParser needs to be initialized using a loaded grammar.
from textmate_grammar.language import LanguageParser
from textmate_grammar.grammars import matlab
parser = LanguageParser(matlab.GRAMMAR)
After this, one can either call parser.parse_string to parse an input string directly, or call parser.parse_file with the path to the appropriate source file as the first argument, as is done in the example example.py.
The parsed element object can be displayed directly by calling the print method. By default the element is printed as an element tree in a dictionary format.
>>> element = parser.parse_string("value = num2str(10);")
>>> element.print()
{'token': 'source.matlab',
'children': [{'token': 'meta.assignment.variable.single.matlab',
'children': [{'token': 'variable.other.readwrite.matlab', 'content': 'value'}]},
{'token': 'keyword.operator.assignment.matlab', 'content': '='},
{'token': 'meta.function-call.parens.matlab',
'begin': [{'token': 'entity.name.function.matlab', 'content': 'num2str'},
{'token': 'punctuation.section.parens.begin.matlab', 'content': '('}],
'end': [{'token': 'punctuation.section.parens.end.matlab', 'content': ')'}],
'children': [{'token': 'constant.numeric.decimal.matlab', 'content': '10'}]},
{'token': 'punctuation.terminator.semicolon.matlab', 'content': ';'}]}
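The dictionary tree above can be walked with a few lines of plain Python. The following is a minimal sketch that works on a hand-copied literal of the printed output, not on the library's element object; the helper name iter_leaves is hypothetical and not part of textmate-grammar-python. It assumes that the 'begin', 'children' and 'end' entries of a node appear in that order in the source text, as they do in this example.

```python
from typing import Iterator

# The dictionary tree printed above, copied by hand as a plain Python literal.
tree = {
    "token": "source.matlab",
    "children": [
        {
            "token": "meta.assignment.variable.single.matlab",
            "children": [
                {"token": "variable.other.readwrite.matlab", "content": "value"}
            ],
        },
        {"token": "keyword.operator.assignment.matlab", "content": "="},
        {
            "token": "meta.function-call.parens.matlab",
            "begin": [
                {"token": "entity.name.function.matlab", "content": "num2str"},
                {"token": "punctuation.section.parens.begin.matlab", "content": "("},
            ],
            "end": [
                {"token": "punctuation.section.parens.end.matlab", "content": ")"}
            ],
            "children": [
                {"token": "constant.numeric.decimal.matlab", "content": "10"}
            ],
        },
        {"token": "punctuation.terminator.semicolon.matlab", "content": ";"},
    ],
}


def iter_leaves(node: dict, scopes: tuple = ()) -> Iterator[tuple]:
    """Yield (content, scope_stack) for every leaf node (hypothetical helper)."""
    stack = scopes + (node["token"],)
    # Assumption: begin, children and end follow each other in source order.
    nested = [
        child
        for key in ("begin", "children", "end")
        for child in node.get(key, [])
    ]
    if not nested:
        yield node.get("content", ""), stack
        return
    for child in nested:
        yield from iter_leaves(child, stack)


leaves = list(iter_leaves(tree))
for content, stack in leaves:
    print(repr(content), stack[-1])
```

Each leaf carries the full stack of enclosing tokens, so the innermost token is the last item of the stack. Whitespace between tokens is not part of the tree and therefore does not show up in this walk.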
Alternatively, with the keyword argument flatten set to True, the element is displayed as a flat list with one entry per token. The first item of each entry is the starting position (line, column) of that token, followed by its content and its stack of token names.
>>> element.print(flatten=True)
[[(0, 0), 'value', ['source.matlab', 'meta.assignment.variable.single.matlab', 'variable.other.readwrite.matlab']],
[(0, 5), ' ', ['source.matlab']],
[(0, 6), '=', ['source.matlab', 'keyword.operator.assignment.matlab']],
[(0, 7), ' ', ['source.matlab']],
[(0, 8), 'num2str', ['source.matlab', 'meta.function-call.parens.matlab', 'entity.name.function.matlab']],
[(0, 15), '(', ['source.matlab', 'meta.function-call.parens.matlab', 'punctuation.section.parens.begin.matlab']],
[(0, 16), '10', ['source.matlab', 'meta.function-call.parens.matlab', 'constant.numeric.decimal.matlab']],
[(0, 18), ')', ['source.matlab', 'meta.function-call.parens.matlab', 'punctuation.section.parens.end.matlab']],
[(0, 19), ';', ['source.matlab', 'punctuation.terminator.semicolon.matlab']]]
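The flattened form is convenient for position lookups. The sketch below operates on a hand-copied literal of the output above rather than on the library itself; the helpers reconstruct and scopes_at are hypothetical names, and the lookup assumes that no token spans more than one line.

```python
# The flattened output shown above, copied by hand as a plain Python literal.
flat = [
    [(0, 0), "value", ["source.matlab", "meta.assignment.variable.single.matlab", "variable.other.readwrite.matlab"]],
    [(0, 5), " ", ["source.matlab"]],
    [(0, 6), "=", ["source.matlab", "keyword.operator.assignment.matlab"]],
    [(0, 7), " ", ["source.matlab"]],
    [(0, 8), "num2str", ["source.matlab", "meta.function-call.parens.matlab", "entity.name.function.matlab"]],
    [(0, 15), "(", ["source.matlab", "meta.function-call.parens.matlab", "punctuation.section.parens.begin.matlab"]],
    [(0, 16), "10", ["source.matlab", "meta.function-call.parens.matlab", "constant.numeric.decimal.matlab"]],
    [(0, 18), ")", ["source.matlab", "meta.function-call.parens.matlab", "punctuation.section.parens.end.matlab"]],
    [(0, 19), ";", ["source.matlab", "punctuation.terminator.semicolon.matlab"]],
]


def reconstruct(tokens):
    """Join the token contents in order (hypothetical helper).

    This recovers the input here because whitespace is listed as well.
    """
    return "".join(content for _, content, _ in tokens)


def scopes_at(tokens, line, column):
    """Return the token stack covering (line, column), or None (hypothetical helper).

    Assumption: every token lies on a single line.
    """
    for (tok_line, tok_col), content, scopes in tokens:
        if tok_line == line and tok_col <= column < tok_col + len(content):
            return scopes
    return None


print(reconstruct(flat))
print(scopes_at(flat, 0, 17))
```

Because every character of the input, including whitespace, belongs to exactly one entry, the flattened list can be used both to rebuild the source line and to answer which tokens apply at a given cursor position.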
Supported Languages
TODO
- Implement Begin/While pattern, required for other grammars.
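For context, a begin/while rule in a TextMate grammar starts where its begin expression matches and stays active on each following line for as long as its while expression matches at the start of that line. The following is a simplified illustration of that idea only. It uses Python's built-in re module instead of Oniguruma, works on whole lines, and is not how this package implements or plans to implement the feature; the function name begin_while_spans is hypothetical.

```python
import re


def begin_while_spans(lines, begin, while_):
    """Return (first_line, last_line) index pairs covered by a begin/while rule.

    Simplified sketch: 'begin' is searched anywhere in a line, 'while_' is
    tested at the start of every following line.
    """
    begin_re = re.compile(begin)
    while_re = re.compile(while_)
    spans = []
    i = 0
    while i < len(lines):
        if begin_re.search(lines[i]):
            start = i
            i += 1
            # The rule continues for as long as 'while_' matches the next line.
            while i < len(lines) and while_re.match(lines[i]):
                i += 1
            spans.append((start, i - 1))
        else:
            i += 1
    return spans


# A blockquote-like example: lines starting with '>' belong to one block.
lines = ["> quoted", "> still quoted", "plain text", "> again"]
spans = begin_while_spans(lines, r"^>", r">")
print(spans)
```

Unlike a begin/end rule, there is no explicit closing match: the block simply ends at the first line on which the while expression fails.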
Sources