Tree-sitter Grammar for SQL
The `tree-sitter-sql` Python package (version 0.3.11) provides a SQL grammar for Tree-sitter, an incremental parsing library. The grammar aims to be permissive and general, with an initial focus on the PostgreSQL dialect. The package is meant to be used with `py-tree-sitter`, the official Python bindings for Tree-sitter. It is actively maintained, and releases typically follow grammar enhancements and changes in the core `tree-sitter` library.
Warnings
- breaking: Grammar definitions (node names, structure) in `tree-sitter-sql` can evolve. Updates to the grammar may introduce breaking changes to existing Tree-sitter queries that rely on specific node types or structures, requiring query adjustments.
- breaking: The underlying `tree-sitter` Python library (on which `tree-sitter-sql` depends) may introduce breaking changes. For example, the behavior of `iter_matches` was updated to fix incorrect behavior, requiring changes for code relying on the old behavior.
- gotcha: The `tree-sitter-sql` grammar aims for permissiveness and initially focuses on the PostgreSQL dialect. While general, it may not strictly conform to all nuances of other specific SQL dialects (e.g., SQLite, MySQL, BigQuery).
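One way to see whether the grammar coped with a statement from another dialect is to look for error nodes in the resulting tree. The sketch below is an illustration, not part of `tree-sitter-sql`: `find_error_nodes` is a hypothetical helper, and `FakeNode` is a stand-in so the snippet runs without the grammar installed. It relies only on the `type`, `is_missing`, `start_point`, and `children` attributes that `tree_sitter.Node` objects expose; the node type names in the sample tree are made up.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Stand-in for tree_sitter.Node (hypothetical, for illustration only)."""
    type: str
    start_point: tuple = (0, 0)
    is_missing: bool = False
    children: list = field(default_factory=list)


def find_error_nodes(node):
    """Return (type, start_point) for every ERROR or missing node under `node`."""
    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current.type == "ERROR" or current.is_missing:
            found.append((current.type, current.start_point))
        # Push children in reverse so they are visited in document order.
        stack.extend(reversed(current.children))
    return found


# Hand-built tree mimicking a parse with one unrecognized fragment.
# The node type names here are assumptions, not taken from the real grammar.
tree_root = FakeNode("program", children=[
    FakeNode("statement", children=[FakeNode("select")]),
    FakeNode("ERROR", start_point=(1, 0)),
])
errors = find_error_nodes(tree_root)
print(errors)
```

With a real parse, `tree.root_node` would be passed instead of `tree_root`, and checking `root_node.has_error` first gives a quick yes/no answer before walking the tree.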
Install
pip install tree-sitter tree-sitter-sql
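Because both the grammar and the `tree-sitter` bindings can change between releases, it can help to record which versions are actually installed. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the distribution names are the ones used in the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report what is installed in the current environment.
for name in ("tree-sitter", "tree-sitter-sql"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Pinning these versions in a requirements file is one way to avoid surprises from later grammar or binding updates.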
Imports
- Language
from tree_sitter import Language
- Parser
from tree_sitter import Parser
- tree_sitter_sql
import tree_sitter_sql
Quickstart
import tree_sitter_sql
from tree_sitter import Language, Parser
# Initialize the SQL language from the installed grammar package
SQL_LANGUAGE = Language(tree_sitter_sql.language())
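# Create a parser for the SQL language.
# Recent py-tree-sitter releases take the language in the Parser constructor;
# the older parser.set_language() call has been removed.
parser = Parser(SQL_LANGUAGE)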
# SQL code to parse (must be bytes)
sql_code = b"""
SELECT id, name FROM users WHERE age > 30 ORDER BY name ASC;
"""
# Parse the SQL code
tree = parser.parse(sql_code)
# Get the root node of the syntax tree
root_node = tree.root_node
# Print a basic representation of the tree (for demonstration)
# In a real application, you would traverse the tree or use queries.
print(f"Root Node Type: {root_node.type}")
print(f"Root Node Text: {root_node.text.decode('utf8')}")
# Example of finding a specific node type (e.g., 'select_statement').
# Node type names depend on the installed grammar version, so treat this name as an example.
# This is a basic traversal of the root's direct children; for complex patterns, Tree-sitter queries are used.
select_statement_node = None
for child in root_node.children:
    if child.type == 'select_statement':
        select_statement_node = child
        break
if select_statement_node is not None:
    print(f"Found 'select_statement' node. Text: {select_statement_node.text.decode('utf8')}")
else:
    print("No 'select_statement' node found.")
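The loop above inspects only the direct children of the root, so it misses nodes nested deeper in the tree. A recursive walk is one simple alternative, sketched here under stated assumptions: `iter_nodes` and `find_by_type` are hypothetical helpers, and `FakeNode` is a stand-in with the same `type` and `children` attributes as `tree_sitter.Node`, so the snippet runs without the grammar installed. The node type names in the sample tree are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Stand-in for tree_sitter.Node (hypothetical, for illustration only)."""
    type: str
    children: list = field(default_factory=list)


def iter_nodes(node):
    """Yield `node` and all of its descendants, depth-first in document order."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def find_by_type(node, node_type):
    """Return every node under `node` (inclusive) whose type equals `node_type`."""
    return [n for n in iter_nodes(node) if n.type == node_type]


# Hand-built tree; the type names are assumptions, not the grammar's real names.
root = FakeNode("program", children=[
    FakeNode("statement", children=[
        FakeNode("select_statement", children=[
            FakeNode("identifier"),
            FakeNode("identifier"),
        ]),
    ]),
])

# Listing the distinct types is a quick way to learn which names a grammar version uses.
all_types = sorted({n.type for n in iter_nodes(root)})
print(all_types)
print(len(find_by_type(root, "identifier")))
```

Printing the distinct node types of a real `tree.root_node` in this way shows which names the installed grammar produces before any query or type check is written against them. Very deep trees may need an iterative walk instead, since this version recurses once per tree level.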