SQLGlot: SQL Parser, Transpiler, and Optimizer
SQLGlot is a powerful Python library for parsing, transpiling, optimizing, and even executing SQL. It supports translating SQL across over 30 different database dialects, enabling cross-dialect compatibility, query parsing into Abstract Syntax Trees (ASTs), programmatic query rewriting, and optimization. It's known for being a fast, pure-Python solution with no external dependencies and is regularly among the top Python package downloads. The library is currently at version 30.1.0 and follows a versioning strategy where MINOR version increments can introduce backwards-incompatible changes.
Warnings
- gotcha Failing to specify the `read` (source) or `write` (target) dialect during `parse_one` or `transpile` calls can lead to `ParseError` or incorrect output. SQLGlot defaults to its 'SQLGlot dialect', which is a superset, if not specified.
- gotcha SQLGlot strictly parses valid SQL. It will raise a `ParseError` if the SQL string contains non-SQL constructs like templating variables (e.g., `{{parameter}}`) or invalid syntax (e.g., unquoted timestamps in some contexts).
- breaking SQLGlot's versioning strategy indicates that MINOR version increments (e.g., 29.x to 30.x) can introduce backwards-incompatible fixes or feature additions. Be mindful of minor version bumps during upgrades.
- gotcha While SQLGlot is comprehensive, transpilation across all possible dialect pairs and inputs is an 'incremental' problem. Some specific dialect conversions may have limitations or ongoing improvements.
- gotcha Parsing SQL queries with column names or identifiers using non-standard quoting (e.g., backticks as often used by LLMs) can lead to `ParseError`.
Install
-
pip install sqlglot -
pip install "sqlglot[c]" -
pip install "sqlglot[rs]"
Imports
- parse_one
from sqlglot import parse_one
- transpile
from sqlglot import transpile
- exp
from sqlglot import exp
Quickstart
from sqlglot import parse_one, transpile
# Parse a SQL query (defaults to SQLGlot dialect if 'read' is not specified)
sql_query = "SELECT id, name FROM users WHERE age > 18"
expression = parse_one(sql_query)
print(f"Parsed Expression: {expression}")
# Transpile from MySQL to BigQuery
mysql_query = "SELECT IFNULL(employee_name, 'Unknown') AS employee_status FROM employees;"
bigquery_query = transpile(mysql_query, read="mysql", write="bigquery")[0]
print(f"Transpiled to BigQuery: {bigquery_query}")
# Further manipulation of the AST is possible with `expression` object
# E.g., print(expression.find_all(exp.Column))