Python grammar for tree-sitter
tree-sitter-python (version 0.25.0) packages a pre-compiled Python grammar for the tree-sitter parsing library. Used together with the `tree-sitter` Python bindings, it lets Python applications parse Python source code into concrete syntax trees for tasks such as static analysis, code transformation, and IDE features. The package is actively maintained and tracks the upstream tree-sitter project.
Warnings
- breaking The `tree-sitter` core library and its language grammars (like `tree-sitter-python`) are each built against an internal ABI version. If the installed `tree-sitter` Python package and the `tree-sitter-python` grammar package do not support a common ABI version, loading the grammar fails with a runtime error.
- breaking The `tree-sitter` core Python bindings (the `py-tree-sitter` project) introduced significant breaking changes in version `0.24.0`: the method signatures of `Parser.parse`, `Query.captures`, and `Query.matches` changed, and `Language.query` was deprecated.
- deprecated Version 0.24.0 of `tree-sitter-python` dropped compatibility with Python 3.9.
- gotcha Parsing very large or complex codebases through the `tree-sitter` Python bindings has been reported to cause high memory usage or even memory leaks, particularly with certain grammar versions (e.g., specific `tree-sitter-bash` versions).
- gotcha While `tree-sitter-python` provides a pre-compiled grammar, building custom or third-party `tree-sitter` grammars for other languages requires a C compiler and Node.js for compilation, which can be a common setup hurdle for new users.
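When one of the version-related warnings above is suspected, a reasonable first step is to record which versions are actually installed. The following is a minimal sketch using only the standard library. The helper names `installed_versions` and `python_is_supported` are hypothetical, and the 3.10 minimum is an assumption inferred from the note that Python 3.9 support was dropped. It only reports versions; it does not verify ABI compatibility.

```python
import sys
from importlib import metadata


def installed_versions(packages=("tree-sitter", "tree-sitter-python")):
    """Return {distribution name: installed version, or None if missing}."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = None
    return report


def python_is_supported(minimum=(3, 10)):
    """Check the running interpreter against an assumed minimum version."""
    return sys.version_info >= minimum


for package, version in installed_versions().items():
    print(f"{package}: {version or 'not installed'}")
print(f"Python >= 3.10: {python_is_supported()}")
```

Including this output in a bug report makes it easier to tell whether the bindings and the grammar were released close enough together to share an ABI version.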
Install
-
pip install tree-sitter tree-sitter-python
Imports
- Language
from tree_sitter import Language
- Parser
from tree_sitter import Parser
- tree_sitter_python
import tree_sitter_python
Quickstart
from tree_sitter import Language, Parser
import tree_sitter_python
# Initialize the Python language grammar
PY_LANGUAGE = Language(tree_sitter_python.language())
# Create a parser and configure it to use the Python language
parser = Parser(PY_LANGUAGE)
# Source code to parse
code = b"""
def greet(name):
    print(f"Hello, {name}!")
greet("World")
"""
# Parse the code
tree = parser.parse(code)
# Get the root node of the syntax tree
root_node = tree.root_node
# Print the type of the root node and its first child (for demonstration)
print(f"Root node type: {root_node.type}")
if root_node.children:
    print(f"First child type: {root_node.children[0].type}")
# Expected output might vary slightly but should show 'module' and 'function_definition'
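Beyond the root node and its first child, most uses need to visit the whole tree. The sketch below shows one way to do that with a recursive pre-order walk that relies only on the `type` and `children` attributes used in the quickstart. The names `walk` and `dump_python_source` are hypothetical helpers, not part of either package.

```python
def walk(node, depth=0):
    """Yield (depth, node_type) pairs in pre-order for a node and its descendants."""
    yield depth, node.type
    for child in node.children:
        yield from walk(child, depth + 1)


def dump_python_source(source: bytes) -> str:
    """Parse `source` as Python and return an indented outline of node types."""
    # Imported lazily so that `walk` stays usable without the packages installed.
    from tree_sitter import Language, Parser
    import tree_sitter_python

    parser = Parser(Language(tree_sitter_python.language()))
    tree = parser.parse(source)
    return "\n".join("  " * depth + node_type for depth, node_type in walk(tree.root_node))
```

Calling `print(dump_python_source(b"x = 1\n"))` prints one node type per line, indented by depth. Because `walk` is recursive, very deeply nested source could hit Python's recursion limit; an explicit stack would avoid that.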