Dagster dbt Integration

0.29.0 · active · verified Sat Apr 11

dagster-dbt integrates dbt with Dagster. It represents dbt models, seeds, and snapshots as Dagster assets and surfaces dbt tests as asset checks, so dbt nodes carry metadata, appear in the asset lineage graph, and can be orchestrated alongside other data tools. The library is actively developed and is typically released in step with Dagster core; the current version is 0.29.0, which corresponds to Dagster 1.13.0.

Warnings

Install

Imports

Quickstart

This quickstart defines Dagster assets from an existing dbt project. It uses `DbtProject` to describe the dbt project and locate its manifest, the `@dbt_assets` decorator to turn the manifest into assets, and `DbtCliResource` to run dbt commands and stream their events back to Dagster. It assumes a valid dbt project exists in a `my_dbt_project` directory next to this Python file.

from pathlib import Path
from dagster import AssetExecutionContext, Definitions
from dagster_dbt import DbtCliResource, DbtProject, dbt_assets

# Assuming your dbt project is in a subdirectory named 'my_dbt_project'
dbt_project_dir = Path(__file__).parent / "my_dbt_project"

# DbtProject describes the project layout, including where the manifest lives
# When running under `dagster dev`, prepare_if_dev() regenerates the manifest at load time
dbt_project = DbtProject(project_dir=dbt_project_dir)
dbt_project.prepare_if_dev()

# Define dbt assets using the @dbt_assets decorator
# The manifest path is required to infer assets and their dependencies
@dbt_assets(manifest=dbt_project.manifest_path)
def my_dbt_models(context: AssetExecutionContext, dbt: DbtCliResource):
    # Execute dbt build command and stream events to Dagster
    yield from dbt.cli(["build"], context=context).stream()

# Combine assets and resources into a Dagster Definitions object
defs = Definitions(
    assets=[my_dbt_models],
    resources={
        "dbt": DbtCliResource(project_dir=dbt_project),
    },
)
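
The manifest that `@dbt_assets` reads is a JSON file in which each node records its resource type and the nodes it depends on. The sketch below is not dagster-dbt's implementation; it is a standard-library illustration of the kind of lineage information such a manifest carries. The `SAMPLE_MANIFEST` contents and the `lineage_from_manifest` helper are hypothetical, and the manifest is trimmed to a handful of fields for illustration.

```python
import json

# Hypothetical, heavily trimmed manifest. A real target/manifest.json
# contains many more fields per node.
SAMPLE_MANIFEST = json.loads("""
{
  "nodes": {
    "seed.my_dbt_project.raw_customers": {
      "resource_type": "seed",
      "name": "raw_customers",
      "depends_on": {"nodes": []}
    },
    "model.my_dbt_project.stg_customers": {
      "resource_type": "model",
      "name": "stg_customers",
      "depends_on": {"nodes": ["seed.my_dbt_project.raw_customers"]}
    },
    "model.my_dbt_project.customers": {
      "resource_type": "model",
      "name": "customers",
      "depends_on": {"nodes": ["model.my_dbt_project.stg_customers"]}
    },
    "test.my_dbt_project.not_null_customers_id": {
      "resource_type": "test",
      "name": "not_null_customers_id",
      "depends_on": {"nodes": ["model.my_dbt_project.customers"]}
    }
  }
}
""")

# Resource types treated as asset-like in this sketch; tests are skipped.
ASSET_LIKE_TYPES = {"model", "seed", "snapshot"}


def lineage_from_manifest(manifest: dict) -> dict[str, list[str]]:
    """Map each model/seed/snapshot name to the names of its upstream nodes."""
    nodes = manifest["nodes"]
    lineage = {}
    for node in nodes.values():
        if node["resource_type"] not in ASSET_LIKE_TYPES:
            continue
        upstream_ids = node.get("depends_on", {}).get("nodes", [])
        # Ignore upstream ids that are not in "nodes" (for example, sources).
        lineage[node["name"]] = sorted(
            nodes[uid]["name"] for uid in upstream_ids if uid in nodes
        )
    return lineage


lineage = lineage_from_manifest(SAMPLE_MANIFEST)
for name, upstream in lineage.items():
    print(f"{name} <- {upstream}")
```

To inspect a real project the same way, the dictionary could be loaded from the file at `dbt_project.manifest_path` once the manifest has been generated.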
