{"id":8961,"library":"dlt-meta","title":"DLT-META Framework","description":"DLT-META is a metadata-driven framework for Databricks Lakeflow Declarative Pipelines, designed to automate the creation and management of bronze and silver data pipelines. It leverages metadata defined in JSON or YAML files to dynamically generate pipeline code, streamlining data engineering workflows. The library is currently at version 0.0.10 and has active, though irregular, release cycles with consistent updates.","status":"active","version":"0.0.10","language":"en","source_language":"en","source_url":"https://github.com/databrickslabs/dlt-meta","tags":["databricks","delta live tables","dlt","metadata-driven","etl","data pipelines","automation","lakeflow"],"install":[{"cmd":"pip install dlt-meta","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Required runtime environment.","package":"python","optional":false},{"reason":"Required for CLI interactions and deployment to Databricks workspace (v0.213 or later).","package":"databricks-cli","optional":false},{"reason":"Used for parsing YAML metadata files.","package":"PyYAML","optional":true},{"reason":"Core dependency for Python package management.","package":"setuptools","optional":true},{"reason":"Used for interacting with Databricks APIs.","package":"databricks-sdk","optional":true}],"imports":[{"note":"The `DataflowPipeline` class is exposed directly under the `dlt_meta` package, not a `src` submodule in typical installations.","wrong":"from dlt_meta.src.dataflow_pipeline import DataflowPipeline","symbol":"DataflowPipeline","correct":"from dlt_meta import DataflowPipeline"}],"quickstart":{"code":"# This code typically runs within a Databricks Notebook or job after metadata onboarding.\n# Ensure 'dlt-meta' is installed via %pip install dlt-meta in the notebook or as a cluster library.\n\nimport dlt\nfrom dlt_meta import DataflowPipeline\nimport os\n\n# These parameters would typically be passed as job parameters in Databricks\n# For local testing, you might set environment variables or hardcode.\nlayer = os.environ.get('DLT_META_LAYER', 'bronze').lower() # e.g., 'bronze' or 'silver'\nenv = os.environ.get('DLT_META_ENV', 'dev').lower() # e.g., 'dev', 'qa', 'prod'\n\n# In a Databricks environment, 'spark' session is implicitly available.\n# For local testing outside Databricks, you would need to initialize a SparkSession.\n# Example placeholder for local SparkSession (not typically done in DLT-META's primary use-case):\n# from pyspark.sql import SparkSession\n# spark = SparkSession.builder.appName(\"dlt-meta-local\").getOrCreate()\n\ntry:\n    print(f\"Attempting to invoke DLT-META for layer: {layer} (env: {env}).\")\n    # The 'spark' object is expected to be the Databricks SparkSession\n    DataflowPipeline.invoke_dlt_pipeline(spark=spark, layer=layer, env=env)\n    print(f\"DLT-META successfully invoked for layer: {layer} (env: {env}).\")\nexcept ImportError:\n    print(\"ERROR: Could not import DataflowPipeline from dlt_meta. Ensure the 'dlt-meta' library is installed and available.\")\n    raise\nexcept Exception as e:\n    print(f\"ERROR: An exception occurred during DLT-META pipeline invocation for layer '{layer}' in env '{env}': {e}\")\n    raise\n","lang":"python","description":"This quickstart demonstrates how to programmatically invoke the `dlt-meta` framework within a Databricks environment (typically a notebook or job). It assumes `dlt-meta` is installed and metadata has been onboarded. 
The `DataflowPipeline.invoke_dlt_pipeline` method orchestrates the creation and execution of DLT pipelines based on the provided layer and environment, reading from pre-configured metadata."},"warnings":[{"fix":"Migrate existing pipelines from DPM mode to the default publishing mode as per Databricks' migration guide.","message":"The DPM (Direct Publishing Mode) flag was removed in v0.0.10. Pipelines using DPM mode in v0.0.9 must be migrated to the default publishing mode before upgrading. This change is metadata-only but irreversible.","severity":"breaking","affected_versions":">=0.0.10"},{"fix":"Remove database qualifiers from table names in your metadata (e.g., change `database.schema.table` to `schema.table` or `table`).","message":"Multi-Level Namespace Changes in v0.0.10. Custom schema qualification in table names is no longer supported; tables must be created without database qualifiers.","severity":"breaking","affected_versions":">=0.0.10"},{"fix":"Update existing pipeline configurations to use the new layer-specific argument prefixes (e.g., `bronze_arg` or `silver_arg`).","message":"Argument changes for `invoke_dlt_pipeline` in v0.0.10. Method arguments now require `bronze_` or `silver_` prefixes to support `apply_changes_from_snapshot` in both layers.","severity":"breaking","affected_versions":">=0.0.10"},{"fix":"For issues, report them directly on the dlt-meta GitHub repository.","message":"DLT-META is a Databricks Labs project and is provided for exploration only. Databricks does not formally support it or provide SLAs. Do not submit Databricks support tickets for issues; instead, file a GitHub issue.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Validate onboarding JSON/YAML against a predefined schema before ingestion. Ensure all required fields are present and data types are correct.","message":"Malformed JSON/YAML metadata can lead to job failures. The framework relies heavily on correct metadata structure and content.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure `dlt-meta` is installed using `pip install dlt-meta`. In Databricks notebooks, use `%pip install dlt-meta` at the start of the notebook. If running a Databricks job using a Python wheel, ensure `dlt_meta` is specified as a dependent library.","cause":"The `dlt-meta` package is not installed or not accessible in the Python environment where the code is being run. This often happens in Databricks notebooks if `%pip install dlt-meta` wasn't executed, or if running locally without installing the package.","error":"ImportError: Could not import DataflowPipeline from dlt_meta. Ensure the 'dlt-meta' library is installed and available."},{"fix":"Update your metadata files (JSON/YAML) to remove the database qualifier from table names. Tables should be defined without the database prefix, e.g., `schema.table` or just `table` if the schema is implicitly handled.","cause":"This error occurs in dlt-meta v0.0.10 and later due to the 'Multi-Level Namespace Changes' breaking change. Custom schema qualification (e.g., `database.schema.table`) in table names within your metadata is no longer supported.","error":"com.databricks.pipelines.common.errors.DLTAnalysisException: Materializing tables in custom schemas is not supported. 
Please remove the database qualifier from table 'table_name'."},{"fix":"To address this, if you need these columns, try reading the CDF-enabled table *before* importing any `dlt` or `dlt_meta` modules. Alternatively, if within a DLT pipeline, use the `except_column_list` parameter to explicitly exclude these columns, or ensure reserved column names are not conflicting in your source table's schema.","cause":"When the `dlt` module (which `dlt-meta` internally leverages) is imported, it can alter the default behavior of reading CDF-enabled tables, sometimes preventing the exposure of these reserved metadata columns due to conflicts or internal handling.","error":"CDF metadata columns (_change_type, _commit_version, _commit_timestamp) are lost after importing dlt (or dlt-meta which often uses dlt)."},{"fix":"Enable Delta's `mergeSchema` on writes where appropriate. Implement schema-drift detection jobs to monitor source schema changes and update `dlt-meta` metadata (Dataflowspec) promptly. Validate onboarding JSON/YAML against a predefined schema.","cause":"The pipeline's schema expectations are not aligned with changes in the source data (e.g., columns added, removed, or renamed without corresponding metadata updates).","error":"Pipeline fails due to schema evolution surprises or unexpected column changes in upstream sources."}]}
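
A minimal sketch of the metadata pre-validation recommended in the warnings above, assuming PyYAML plus the third-party jsonschema package. The schema fields shown (data_flow_id, data_flow_group, source_format) and the file name are illustrative assumptions, not dlt-meta's official Dataflowspec schema; substitute the fields your onboarding file actually requires.

# Sketch: validate onboarding JSON/YAML before ingestion, per the
# "Malformed JSON/YAML metadata" gotcha above. The schema is an assumption.
import json
import yaml  # PyYAML, listed above as an optional dependency
from jsonschema import validate, ValidationError  # third-party: jsonschema

ONBOARDING_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "required": ["data_flow_id", "data_flow_group", "source_format"],  # assumed field names
        "properties": {
            "data_flow_id": {"type": "string"},
            "data_flow_group": {"type": "string"},
            "source_format": {"type": "string"},
        },
    },
}

def load_metadata(path: str):
    """Load a JSON or YAML onboarding file into Python objects."""
    with open(path, "r", encoding="utf-8") as fh:
        if path.endswith((".yaml", ".yml")):
            return yaml.safe_load(fh)
        return json.load(fh)

def validate_onboarding(path: str) -> None:
    try:
        validate(instance=load_metadata(path), schema=ONBOARDING_SCHEMA)
    except ValidationError as exc:
        raise SystemExit(f"Onboarding metadata failed validation: {exc.message}")

if __name__ == "__main__":
    validate_onboarding("onboarding.json")  # hypothetical file name

Running this as a pre-ingestion job step fails fast on missing fields or wrong types instead of surfacing the error as a pipeline failure later.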
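
For the v0.0.10 multi-level-namespace breaking change, a hedged migration sketch that strips database qualifiers from table names in an onboarding JSON file. The bronze_table/silver_table keys and the file names are assumptions about a typical metadata layout, not a guaranteed dlt-meta contract; adapt them to your own Dataflowspec.

# Sketch: rewrite onboarding metadata so table names carry no database
# qualifier (database.schema.table -> schema.table). Field names are assumed.
import json

TABLE_FIELDS = ("bronze_table", "silver_table")  # assumed metadata keys

def strip_database_qualifier(name: str) -> str:
    """Keep at most schema.table; drop any leading database qualifier."""
    parts = name.split(".")
    return ".".join(parts[-2:]) if len(parts) > 2 else name

def migrate(path_in: str, path_out: str) -> None:
    with open(path_in, encoding="utf-8") as fh:
        flows = json.load(fh)
    for flow in flows:
        for field in TABLE_FIELDS:
            if field in flow:
                flow[field] = strip_database_qualifier(flow[field])
    with open(path_out, "w", encoding="utf-8") as fh:
        json.dump(flows, fh, indent=2)

migrate("onboarding.json", "onboarding_migrated.json")  # hypothetical file names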
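
And a sketch of the CDF workaround from the problems list: capture the change-feed columns before any dlt/dlt_meta import executes in the process. The readChangeFeed and startingVersion options are standard Delta Lake CDF reader options; the table name is a placeholder, and spark is assumed to be an existing Databricks SparkSession.

# Sketch: read the CDF-enabled table BEFORE importing dlt / dlt_meta, so the
# reserved metadata columns are still exposed. Table name is a placeholder.
cdf_df = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 0)
    .table("my_schema.cdf_enabled_table")  # placeholder table name
)
cdf_df.select("_change_type", "_commit_version", "_commit_timestamp").show()

# Only after consuming what you need, import dlt-meta:
import dlt  # noqa: E402
from dlt_meta import DataflowPipeline  # noqa: E402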