TatSu: PEG Parser Generator
TatSu is a Python library that takes a grammar written in a variant of EBNF (Extended Backus–Naur Form) and generates a memoizing PEG (Parsing Expression Grammar) / Packrat parser. It is actively maintained, with frequent minor releases, and is used to build custom parsers and Abstract Syntax Trees (ASTs). The current version is 5.18.0.
Warnings
- breaking Python 3.10 and 3.11 are no longer officially supported as of v5.16.0. TatSu now requires Python >= 3.12. The GitHub README indicates a preference for Python >= 3.13.
- breaking The `comments_re` and `eol_comments_re` attributes were removed from `ParserConfig` in v5.13.0. Use `comments` and `eol_comments` instead. Additionally, `re.MULTILINE` is no longer enabled by default for comment regexes; users must explicitly add `(?m)` if multi-line matching is needed.
- breaking The `FailedCut` exception and its associated logic were removed in v5.15.0. Code that explicitly catches or relies on this exception will break.
- breaking Parsers generated by older TatSu versions (prior to v5.16.0, and especially prior to v5.0) may not be compatible with newer TatSu runtime libraries. Significant internal refactoring, particularly around `ParserConfig` and AST generation, occurred in recent releases (e.g., v5.17.0).
- gotcha Rules and closures now return `list` objects instead of `tuple` objects in the generated AST. This changes the shape of the AST for rules that produce sequences or repetitions, so code that checks `isinstance(node, tuple)` or relies on tuple hashability will behave differently.
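The `(?m)` point in the warnings above is easy to check with the standard `re` module alone. The sketch below does not call TatSu; the sample text and the comment pattern are illustrative assumptions, not TatSu defaults. It only shows how the same end-of-line comment regex behaves with and without the inline multi-line flag.

```python
import re

# Hypothetical sample input; the pattern is illustrative, not TatSu's default.
SOURCE = "a = 1  # first\nb = 2  # second\n"

# Without (?m), `$` matches only at the end of the string
# (or just before a final trailing newline).
single_line = re.compile(r"#[^\n]*$")

# With the inline (?m) flag, `$` matches at the end of every line.
multi_line = re.compile(r"(?m)#[^\n]*$")

found_single = single_line.findall(SOURCE)
found_multi = multi_line.findall(SOURCE)

print(found_single)  # only the comment on the last line
print(found_multi)   # the comment on every line
```

A pattern that relied on the old default would therefore silently stop matching comments on all but the last line unless `(?m)` is added.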
Install
pip install TatSu
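Since the warnings above state that TatSu requires Python >= 3.12, a project may want to fail early with a clear message instead of hitting an install or import error. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` and the constant `MIN_PYTHON` are hypothetical and not part of TatSu.

```python
import sys

# Minimum taken from the warning above; adjust if the requirement changes.
MIN_PYTHON = (3, 12)


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


if not python_is_supported():
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} is older than the required {MIN_PYTHON}")
```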
Imports
- compile
from tatsu import compile
- parse
from tatsu import parse
- ParserConfig
from tatsu.parserconfig import ParserConfig
Quickstart
from tatsu import parse
grammar = r'''
@@grammar::Calc
start: expression $ ;
expression: term (('+' | '-') term)* ;
term: factor (('*' | '/') factor)* ;
factor: NUMBER | '(' expression ')' ;
NUMBER: /\d+/ ;
'''
input_text = '1 + 2 * (3 - 4)'
try:
    # Parse the input using the grammar
    ast = parse(grammar, input_text)
    print(f'Input: {input_text}')
    print(f'AST: {ast}')
    # This quickstart only shows parsing to an AST. Evaluating the
    # expression would typically require a semantics class.
except Exception as e:
    print(f'Error parsing: {e}')
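Because the exact nesting of the returned AST can differ between TatSu versions (see the list versus tuple gotcha in the warnings), post-processing code is more robust if it accepts both sequence types. The helper below is a sketch under that assumption: `flatten` is a hypothetical name, and `sample_ast` is a hand-written stand-in, not actual TatSu output. It does not handle dict-like AST nodes, which grammars with named elements may produce.

```python
def flatten(node):
    """Yield leaf values from a nested AST built from lists and/or tuples."""
    if isinstance(node, (list, tuple)):
        for child in node:
            yield from flatten(child)
    elif node is not None:
        # Strings and other scalars are treated as leaves.
        yield node


# Hand-written stand-in for a parse result; real TatSu output may nest differently.
sample_ast = ["1", [["+", ("2", [["*", "3"]])]]]

tokens = list(flatten(sample_ast))
print(tokens)  # ['1', '+', '2', '*', '3']
```

Accepting both `list` and `tuple` keeps the helper usable with parsers generated before and after the change in return type.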