SQLMesh: Next-generation data transformation framework
SQLMesh is a next-generation data transformation framework designed to ship data quickly, efficiently, and without error. It enables data teams to run and deploy data transformations written in SQL or Python with visibility and control, supporting concepts like virtual data environments, automated testing, and CI/CD. It is backwards compatible with dbt and focuses on semantic understanding of SQL. The project maintains an active development cycle with frequent releases, with 0.234.0 being the latest stable version.
Warnings
- breaking Frequent updates to the underlying `sqlglot` library (as seen in chore commits like 'bump sqlglot to v30.4.2') can introduce subtle breaking changes or altered SQL parsing/transpilation behavior. While SQLMesh aims to abstract this, it's crucial to review release notes for `sqlglot` when upgrading SQLMesh.
- gotcha SQLMesh previously had compatibility issues with `pandas` version 3, requiring users to explicitly exclude it. While this might be resolved in future versions, `pandas` compatibility can be a recurring issue with rapidly evolving libraries.
- gotcha SQLMesh is inherently stateful. While it can use local databases like DuckDB for development state, production deployments should use robust OLTP databases (e.g., PostgreSQL, MySQL) to store state for ACID transactions, performance, and reliability. Using a local file-based database for shared production state can lead to data integrity issues.
- gotcha The `sqlmesh plan` command defaults to targeting the `prod` environment. Accidentally running `sqlmesh plan` without specifying a development environment can inadvertently target or interfere with your production data if not careful.
- gotcha SQLMesh infers schema information from your models. When selecting from external data sources (tables not defined within your SQLMesh project), it may issue warnings about missing schema knowledge, impacting lineage and optimization capabilities.
- gotcha While SQLMesh automatically infers column names and types, explicit type casting and column definition in SQL models (e.g., `SELECT o.customer_id::TEXT`) is a best practice. Relying solely on `SELECT *` can lead to unexpected schema changes or hinder SQLMesh's ability to precisely detect breaking changes, potentially causing unnecessary recomputations or data issues.
Install
-
pip install sqlmesh -
pip install "sqlmesh[web]" -
pip install "sqlmesh[duckdb]"
Imports
- model
from sqlmesh import model
- ExecutionContext
from sqlmesh import ExecutionContext
- Config
from sqlmesh.core.config import Config
- exp
from sqlglot import exp
Quickstart
import os
# Ensure SQLMesh is installed along with the DuckDB extra
# pip install "sqlmesh[duckdb]"
# Initialize a new SQLMesh project with DuckDB as the engine
# This creates a 'sqlmesh_example' directory with project files
# and sets up a DuckDB connection.
# os.system("sqlmesh init duckdb --path sqlmesh_example")
# Navigate into the project directory (conceptually, in a real script you'd use os.chdir or similar)
# In a terminal, you would run:
# cd sqlmesh_example
# sqlmesh plan
# Simulate a SQL model file: models/full_model.sql
# -- models/full_model.sql
# -- MODEL (
# -- name sqlmesh_example.full_model,
# -- kind FULL
# -- );
# -- SELECT
# -- 1 AS id,
# -- 'A' AS value;
# To run the plan and apply changes (requires a project initialized by 'sqlmesh init'):
# print("\n--- Running sqlmesh plan ---")
# if os.path.exists('sqlmesh_example'):
# os.chdir('sqlmesh_example')
# os.system('sqlmesh plan --no-gaps --auto-apply')
# print("\n--- Querying the created model (requires DuckDB CLI or connector) ---")
# # You can query the model using duckdb CLI if installed:
# # os.system('duckdb sqlmesh.db "SELECT * FROM sqlmesh_example.full_model;"')
# else:
# print("Please run 'sqlmesh init duckdb' in your terminal first to create the example project.")
print("To get started, run 'sqlmesh init duckdb' in your terminal, then 'cd sqlmesh_example' and 'sqlmesh plan'.")
print("This will create an example project and show you the execution plan.")