Lark Parsing Library
Lark is a parsing toolkit for Python that can handle any context-free grammar. It offers multiple parsing algorithms (Earley, LALR(1), CYK), an EBNF-inspired grammar syntax, and automatic construction of parse trees (ASTs). Earley accepts every context-free grammar, including ambiguous ones, while LALR(1) covers a narrower class of grammars but is considerably faster. The project emphasizes ergonomics, performance, and modularity, and receives frequent updates and bug fixes.
Warnings
- breaking: Lark v1.2.1 dropped support for Python versions below 3.8. Make sure your environment runs Python 3.8 or newer.
- breaking: Lark v1.0 introduced several backward-incompatible changes. Notable ones: the package is installed with `pip install lark` (instead of `lark-parser`), `maybe_placeholders` defaults to `True`, `TraditionalLexer` was renamed to `BasicLexer`, and methods decorated with `v_args(meta=True)` now take `(meta, children)`.
- gotcha: With the Earley parser, an edge case in `ambiguity='resolve'` was broken in v1.2.1 and fixed in v1.2.2. Users still on v1.2.1 might see incorrect ambiguity resolution.
- gotcha: Lark versions prior to 1.1.9 might fail on Python 3.11.7 because of a breaking change in Python's internal `re` module. This was patched in Lark v1.1.9.
- gotcha: `Lark.save()` is only supported for parsers created with `parser='lalr'`. Attempting to save an Earley parser results in an error or unexpected behavior.
- gotcha: For performance-critical parsing of part of a larger text, `TextSlice` (introduced in v1.3.0) can be faster than string slicing (`s[i:j]`). However, `TextSlice` is only supported when the lexer is set to `'basic'` or `'contextual'`.
Install
pip install lark
Imports
- Lark
from lark import Lark
- Transformer
from lark import Transformer
- Tree
from lark import Tree
Quickstart
from lark import Lark, Transformer, v_args

# Define your grammar using Lark's EBNF-inspired syntax.
# Operators are named terminals so they stay in the tree; anonymous string
# tokens such as "+" would be filtered out before reaching the transformer.
grammar = '''
?start: expr
?expr: term (ADD_OP term)*
?term: factor (MUL_OP factor)*
?factor: NUMBER -> number
       | "(" expr ")"
ADD_OP: "+" | "-"
MUL_OP: "*"
%import common.NUMBER
%import common.WS
%ignore WS
'''

@v_args(inline=True)  # Pass children to each method as positional arguments
class CalculateTree(Transformer):
    number = int  # Convert the NUMBER token of each "number" node to an int

    def expr(self, *items):
        # items alternates operands and operators: value, op, value, ...
        res = items[0]
        for op, val in zip(items[1::2], items[2::2]):
            if op == '+':
                res += val
            elif op == '-':
                res -= val
        return res

    def term(self, *items):
        res = items[0]
        for op, val in zip(items[1::2], items[2::2]):
            if op == '*':
                res *= val
        return res

# Create a parser instance; the transformer runs while parsing (LALR only)
math_parser = Lark(grammar, start='expr', parser='lalr', transformer=CalculateTree())

# Parse and evaluate an expression
expression = "(1 + 2) * 3 - 4"
result = math_parser.parse(expression)
print(f"Expression: {expression}")
print(f"Result: {result}")

# Example of getting the parse tree without a transformer
# tree_parser = Lark(grammar, start='expr', parser='lalr')
# tree = tree_parser.parse("1 + 2 * 3")
# print(tree.pretty())