{"id":3329,"library":"databricks-dlt","title":"Databricks Delta Live Tables (DLT) Python Stubs","description":"The `databricks-dlt` library provides Python stubs to facilitate local development of Databricks Delta Live Tables (DLT) pipelines. It offers API specifications, docstring references for IDE autocompletion, and Python data type hints for static type checking. This library is purely for development-time assistance and *does not contain functional implementations*; DLT pipelines must be executed on a Databricks workspace. Databricks has since rebranded the underlying DLT product as Lakeflow Spark Declarative Pipelines (SDP).","status":"active","version":"0.3.0","language":"en","source_language":"en","source_url":"https://pypi.org/project/databricks-dlt/","tags":["databricks","dlt","delta live tables","etl","data engineering","stub","ide","pyspark"],"install":[{"cmd":"pip install databricks-dlt","lang":"bash","label":"Install the DLT stub library"}],"dependencies":[],"imports":[{"note":"While 'import dlt' is correct for existing DLT code, Databricks now recommends 'from pyspark import pipelines as dp' for new Lakeflow Spark Declarative Pipelines (SDP) development.","wrong":"from dlt import table, read_json","symbol":"dlt","correct":"import dlt"},{"note":"This is the recommended import path for new Lakeflow Spark Declarative Pipelines (SDP) development, replacing the 'dlt' module.","symbol":"pipelines (as dp)","correct":"from pyspark import pipelines as dp"}],"quickstart":{"code":"# This code snippet is designed to run within a Databricks DLT Notebook environment.\n# The 'databricks-dlt' library provides stubs for local development,\n# but actual execution requires a Databricks workspace.\n\nimport dlt\nfrom pyspark.sql.functions import col, count\n\n# Define the raw ingestion table (Bronze layer)\n@dlt.table\ndef raw_data():\n    # In a real scenario, this would read from a source like Auto Loader\n    # e.g., 
spark.readStream.format('cloudFiles').option('cloudFiles.format', 'json').load('/databricks-datasets/retail-org/sales_orders/')\n    # For demonstration, this example uses a static batch read instead\n    return spark.read.format('json').load('/databricks-datasets/retail-org/sales_orders/')\n\n# Define a cleansed table (Silver layer) with expectations\n@dlt.table(comment='Cleansed sales orders with valid order numbers')\n@dlt.expect_or_drop('valid_order_number', 'order_number IS NOT NULL')\ndef cleansed_data():\n    return dlt.read('raw_data').select(col('customer_id'), col('order_number'), col('order_date'))\n\n# Define an aggregated table (Gold layer)\n@dlt.table(name='daily_sales_summary')\ndef daily_sales():\n    return (\n        dlt.read('cleansed_data')\n        .groupBy('order_date')\n        .agg(count('order_number').alias('total_orders'))\n    )\n","lang":"python","description":"This quickstart demonstrates a typical Delta Live Tables pipeline structure using Python decorators within a Databricks notebook. It outlines a medallion architecture (bronze, silver, gold) for data processing. Note that while the `databricks-dlt` library provides IDE support for such code, actual execution and data processing occur only when deployed and run as a DLT pipeline on a Databricks workspace."},"warnings":[{"fix":"Understand that `databricks-dlt` is a development-time aid. To run your pipelines, you need a Databricks workspace and the DLT runtime. Refer to Databricks documentation for deploying and running DLT pipelines.","message":"The `databricks-dlt` PyPI library provides Python *stubs* for local development tools (IDE autocompletion, type checking) and *does not contain functional implementations*. 
DLT pipelines defined using this stub must be deployed and executed on a Databricks workspace; they cannot be run locally.","severity":"gotcha","affected_versions":"All versions of `databricks-dlt`."},{"fix":"For new DLT/SDP development, use `from pyspark import pipelines as dp` at the top of your Python pipeline files. Replace `@dlt.table` with `@dp.table` for streaming tables or `@dp.materialized_view` for materialized views, and `dlt.create_streaming_table()` with `dp.create_streaming_table()`; consult the Databricks migration documentation for the full mapping.","message":"The underlying product \"Delta Live Tables (DLT)\" has been rebranded to \"Lakeflow Spark Declarative Pipelines (SDP)\". While existing Python code using `import dlt` will continue to function, Databricks officially recommends migrating new development to use `from pyspark import pipelines as dp` and the corresponding `@dp` decorators and functions for future compatibility and to leverage new features.","severity":"deprecated","affected_versions":"From Databricks Runtime 15.4+ (January 2026 onwards)."},{"fix":"Ensure your Python code is part of a DLT/SDP pipeline definition within a Databricks workspace. Local development with the `databricks-dlt` stub is for syntax validation only, not execution.","message":"The `dlt` (or `pyspark.pipelines`) module and its decorators are only available when your Python code is executed within the context of a Databricks DLT/SDP pipeline. Attempting to import or use these modules in a standalone Python script or a regular Databricks notebook (not configured as a DLT pipeline) will result in an `ImportError` or `NameError`.","severity":"gotcha","affected_versions":"All versions related to Databricks DLT/SDP Python API."},{"fix":"Follow Databricks best practices for dependency management: package Python dependencies as a wheel or egg and specify them in the pipeline libraries settings, or use a `requirements.txt` file attached to the pipeline configuration. 
Avoid dynamic installation methods within the pipeline code.","message":"Managing external Python dependencies within DLT/SDP pipelines directly with `%pip install` or init scripts on the cluster can be problematic due to potential conflicts and maintenance issues. This can lead to unexpected pipeline failures or inconsistent environments.","severity":"gotcha","affected_versions":"All versions."}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}