More SQL Parsing!
mo-sql-parsing is a Python library designed to parse SQL queries into a JSON-izable parse tree. It aims to convert various SQL dialects, initially targeting MySQL, into a structured JSON format, making it easier to translate or analyze SQL for different datastores. As of version 11.697.25301, the library is actively maintained, with ongoing issue resolution and regular updates, as indicated by its high version number and recent activity on GitHub (October 2024 Project Status).
Common errors
-
ModuleNotFoundError: No module named 'moz_sql_parser'
cause The old `moz-sql-parser` library is not installed or the import path is incorrect. `mo-sql-parsing` is its successor.fixInstall `mo-sql-parsing` (`pip install mo-sql-parsing`) and update import statements to `from mo_sql_parsing import parse`. -
Unexpected JSON output for SQL 'NULL' as {'null': {}} instead of None.cause The default behavior of `mo-sql-parsing` is to represent SQL `NULL` as a dictionary `{'null':{}}` in the parse tree.fixTo get Python's `None` for SQL `NULL`, call the `parse` function with the `null` argument: `result = parse(sql, null=None)`. -
SQL queries with square brackets like `SELECT [col] FROM [tbl]` are not parsed correctly; they might be interpreted as array literals.
cause The default parser often assumes a BigQuery-like dialect where `[]` denotes array constructors. This conflicts with SQLServer's use of `[]` for identifiers.fixImport and use the `parse_sqlserver` function explicitly for SQLServer queries: `from mo_sql_parsing import parse_sqlserver as parse`.
Warnings
- breaking The `mo-sql-parsing` library is a fork of the now-archived `moz-sql-parser`. Users migrating from `moz-sql-parser` must update their package installations and import paths (`moz_sql_parser` to `mo_sql_parsing`).
- gotcha Square brackets `[]` have different meanings in SQL dialects (e.g., identifiers in SQLServer, array constructors in BigQuery). Using the default `parse` function for SQLServer queries might lead to incorrect interpretations.
- gotcha By default, SQL `NULL` values are parsed into `{'null':{}}` in the output JSON tree, not Python's `None`.
- gotcha This library relies on `mo-parsing` (a fork of `pyparsing`) internally. There are behavioral differences between `mo-parsing` and upstream `pyparsing`, notably how `add_parse_action()` creates new `ParserElement` objects that must be assigned, and `ParserElements` being static. Direct application of `pyparsing` idioms may not work.
Install
-
pip install mo-sql-parsing
Imports
- parse
from moz_sql_parser import parse
from mo_sql_parsing import parse
- parse_sqlserver
from mo_sql_parsing import parse_sqlserver
- parse_mysql
from mo_sql_parsing import parse_mysql
Quickstart
from mo_sql_parsing import parse
# Basic SELECT query
sql_query = "select count(1) from jobs"
parsed_json = parse(sql_query)
print(f"Parsed basic query: {parsed_json}")
# Expected: {'select': {'value': {'count': 1}}, 'from': 'jobs'}
# SELECT with aliases
sql_query_aliases = "select a as hello, b as world from jobs"
parsed_json_aliases = parse(sql_query_aliases)
print(f"Parsed query with aliases: {parsed_json_aliases}")
# Expected: {'select': [{'value': 'a', 'name': 'hello'}, {'value': 'b', 'name': 'world'}], 'from': 'jobs'}
# Example with SQLServer-specific parsing
from mo_sql_parsing import parse_sqlserver
sqlserver_query = "SELECT [Timestamp] FROM [table]"
parsed_sqlserver = parse_sqlserver(sqlserver_query)
print(f"Parsed SQLServer query: {parsed_sqlserver}")
# Expected: {'select': 'Timestamp', 'from': 'table'}