{"id":6461,"library":"sqlmesh","title":"SQLMesh: Next-generation data transformation framework","description":"SQLMesh is a next-generation data transformation framework designed to ship data quickly, efficiently, and without error. It enables data teams to run and deploy data transformations written in SQL or Python with visibility and control, supporting concepts like virtual data environments, automated testing, and CI/CD. It is backwards compatible with dbt and focuses on semantic understanding of SQL. The project maintains an active development cycle with frequent releases, with 0.234.0 being the latest stable version.","status":"active","version":"0.234.0","language":"en","source_language":"en","source_url":"https://github.com/SQLMesh/sqlmesh","tags":["data transformation","ETL","analytics engineering","SQL","Python","data orchestration","data quality","CI/CD","virtual environments","SQL transpilation"],"install":[{"cmd":"pip install sqlmesh","lang":"bash","label":"Base Installation"},{"cmd":"pip install \"sqlmesh[web]\"","lang":"bash","label":"With UI (Web)"},{"cmd":"pip install \"sqlmesh[duckdb]\"","lang":"bash","label":"With DuckDB connector"}],"dependencies":[{"reason":"Core SQL parsing and transpilation engine.","package":"sqlglot","optional":false},{"reason":"Used for Python models returning DataFrames.","package":"pandas","optional":true},{"reason":"Common local database engine for quickstart and testing.","package":"duckdb","optional":true},{"reason":"PostgreSQL database connector (example, other DBs require their own connectors).","package":"psycopg2-binary","optional":true}],"imports":[{"symbol":"model","correct":"from sqlmesh import model"},{"symbol":"ExecutionContext","correct":"from sqlmesh import ExecutionContext"},{"symbol":"Config","correct":"from sqlmesh.core.config import Config"},{"symbol":"exp","correct":"from sqlglot import exp"}],"quickstart":{"code":"import os\n# Ensure SQLMesh is installed along with the DuckDB extra\n# pip install \"sqlmesh[duckdb]\"\n\n# Initialize a new SQLMesh project with DuckDB as the engine\n# This creates a 'sqlmesh_example' directory with project files\n# and sets up a DuckDB connection.\n# os.system(\"sqlmesh init duckdb --path sqlmesh_example\")\n\n# Navigate into the project directory (conceptually, in a real script you'd use os.chdir or similar)\n# In a terminal, you would run:\n# cd sqlmesh_example\n# sqlmesh plan\n\n# Simulate a SQL model file: models/full_model.sql\n# -- models/full_model.sql\n# -- MODEL (\n# --   name sqlmesh_example.full_model,\n# --   kind FULL\n# -- );\n# -- SELECT\n# --   1 AS id,\n# --   'A' AS value;\n\n# To run the plan and apply changes (requires a project initialized by 'sqlmesh init'):\n# print(\"\\n--- Running sqlmesh plan ---\")\n# if os.path.exists('sqlmesh_example'):\n#     os.chdir('sqlmesh_example')\n#     os.system('sqlmesh plan --no-gaps --auto-apply')\n#     print(\"\\n--- Querying the created model (requires DuckDB CLI or connector) ---\")\n#     # You can query the model using duckdb CLI if installed:\n#     # os.system('duckdb sqlmesh.db \"SELECT * FROM sqlmesh_example.full_model;\"')\n# else:\n#     print(\"Please run 'sqlmesh init duckdb' in your terminal first to create the example project.\")\n\nprint(\"To get started, run 'sqlmesh init duckdb' in your terminal, then 'cd sqlmesh_example' and 'sqlmesh plan'.\")\nprint(\"This will create an example project and show you the execution plan.\")","lang":"python","description":"The most common way to get started with SQLMesh is via its CLI. This quickstart demonstrates how to initialize a new project using DuckDB as the local engine and then run your first plan. The `sqlmesh init duckdb` command scaffolds a project, and `sqlmesh plan` shows the proposed changes. For a fully runnable example demonstrating project setup and execution, refer to the official documentation."},"warnings":[{"fix":"Always check the SQLMesh and SQLGlot changelogs before upgrading. Test your SQLMesh project thoroughly after an upgrade, especially if it involves `sqlglot` bumps, to catch any parsing or transpilation regressions.","message":"Frequent updates to the underlying `sqlglot` library (as seen in chore commits like 'bump sqlglot to v30.4.2') can introduce subtle breaking changes or altered SQL parsing/transpilation behavior. While SQLMesh aims to abstract this, it's crucial to review release notes for `sqlglot` when upgrading SQLMesh.","severity":"breaking","affected_versions":"All versions, due to frequent `sqlglot` updates. Specifically, `v0.234.0` included a `Fix!: Include custom audit args in the metadata hash` that affects metadata hashing."},{"fix":"Ensure your `pandas` version is compatible with your SQLMesh installation. Check the official SQLMesh documentation or GitHub issues for known `pandas` compatibility concerns and required version ranges.","message":"SQLMesh previously had compatibility issues with `pandas` version 3, requiring users to explicitly exclude it. While this might be resolved in future versions, `pandas` compatibility can be a recurring issue with rapidly evolving libraries.","severity":"gotcha","affected_versions":"Reported in `v0.228.5`, which 'Exclude pandas 3'. Check current compatibility notes for `pandas` >= 3.0.0."},{"fix":"Configure a persistent and reliable OLTP database (e.g., PostgreSQL) for SQLMesh state in production environments by specifying `state_connection` in `config.yaml`.","message":"SQLMesh is inherently stateful. While it can use local databases like DuckDB for development state, production deployments should use robust OLTP databases (e.g., PostgreSQL, MySQL) to store state for ACID transactions, performance, and reliability. Using a local file-based database for shared production state can lead to data integrity issues.","severity":"gotcha","affected_versions":"All versions where state management is used."},{"fix":"Always explicitly specify your development environment when running `sqlmesh plan` (e.g., `sqlmesh plan dev_myuser`). Consider changing the default environment in your project's `config.yaml` if all users work in isolated dev environments.","message":"The `sqlmesh plan` command defaults to targeting the `prod` environment. Accidentally running `sqlmesh plan` without specifying a development environment can inadvertently target or interfere with your production data if not careful.","severity":"gotcha","affected_versions":"All versions. Default behavior."},{"fix":"Use `sqlmesh create_external_models` or manually define external models (e.g., in `schema.yml`) to provide SQLMesh with schema information for external data sources. This enables full column-level lineage and better optimization.","message":"SQLMesh infers schema information from your models. When selecting from external data sources (tables not defined within your SQLMesh project), it may issue warnings about missing schema knowledge, impacting lineage and optimization capabilities.","severity":"gotcha","affected_versions":"All versions. Default behavior when external sources are used."},{"fix":"Explicitly define column names and types in your SQL models, particularly in the final `SELECT` statement. Avoid `SELECT *` in production models where schema stability is critical. Manually specify `columns` in the model definition if schema inference is problematic.","message":"While SQLMesh automatically infers column names and types, explicit type casting and column definition in SQL models (e.g., `SELECT o.customer_id::TEXT`) is a best practice. Relying solely on `SELECT *` can lead to unexpected schema changes or hinder SQLMesh's ability to precisely detect breaking changes, potentially causing unnecessary recomputations or data issues.","severity":"gotcha","affected_versions":"All versions."}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z"}