Airflow dbt Integration (GoCardless)

0.4.0 · maintenance · verified Tue Apr 14

airflow-dbt is a Python package from GoCardless that provides Apache Airflow operators for integrating with dbt (data build tool). It lets you orchestrate dbt commands such as `seed`, `snapshot`, `run`, and `test` within Airflow DAGs by wrapping the dbt CLI. The package offers a straightforward way to embed dbt transformations into Airflow workflows; it is currently at version 0.4.0, last released in September 2021.

Warnings

airflow-dbt is in maintenance mode: the last release (0.4.0) dates from September 2021, so it may not be compatible with recent Airflow or dbt versions. Pin your Airflow and dbt versions and verify the integration before upgrading either.
Install

pip install airflow-dbt

Imports

from airflow_dbt.operators.dbt_operator import (
    DbtSeedOperator,
    DbtSnapshotOperator,
    DbtRunOperator,
    DbtTestOperator,
)

Quickstart

This quickstart demonstrates how to define a basic Airflow DAG using `airflow-dbt` operators to run a sequence of dbt commands: `seed`, `snapshot`, `run`, and `test`. Ensure your dbt project directory and profiles directory (if not default) are correctly specified, typically via environment variables or directly in the `dir` and `profiles_dir` arguments. The dbt CLI must be installed and accessible on the Airflow worker's PATH.

from airflow import DAG
from airflow_dbt.operators.dbt_operator import (
    DbtSeedOperator,
    DbtSnapshotOperator,
    DbtRunOperator,
    DbtTestOperator
)
from airflow.utils.dates import days_ago
import os

default_args = {
    'dir': os.environ.get('DBT_PROJECT_DIR', '/path/to/your/dbt/project'),
    'start_date': days_ago(1)  # a past start date; days_ago(0) can delay the first @daily run
}

with DAG(
    dag_id='dbt_example_dag',
    default_args=default_args,
    schedule_interval='@daily',
    tags=['dbt', 'example']
) as dag:
    dbt_seed = DbtSeedOperator(
        task_id='dbt_seed',
        profiles_dir=os.environ.get('DBT_PROFILES_DIR', '/path/to/your/.dbt') # Optional
    )

    dbt_snapshot = DbtSnapshotOperator(
        task_id='dbt_snapshot'
    )

    dbt_run = DbtRunOperator(
        task_id='dbt_run'
    )

    dbt_test = DbtTestOperator(
        task_id='dbt_test',
        retries=0 # Failing tests should fail the task, not retry
    )

    dbt_seed >> dbt_snapshot >> dbt_run >> dbt_test
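
Each operator ultimately shells out to the dbt CLI on the worker. As a rough mental model (a sketch, not the package's actual implementation; `dbt_cli_command` is a hypothetical helper), the four tasks above boil down to commands like these:

```python
import shlex

def dbt_cli_command(verb, project_dir, profiles_dir=None, dbt_bin="dbt"):
    """Illustrative only: the shell command a dbt operator conceptually runs."""
    parts = [dbt_bin, verb]
    if profiles_dir:
        # --profiles-dir is a standard dbt CLI flag
        parts += ["--profiles-dir", shlex.quote(profiles_dir)]
    # airflow-dbt invokes dbt from the project directory; shown here as a cd prefix
    return f"cd {shlex.quote(project_dir)} && " + " ".join(parts)

for verb in ("seed", "snapshot", "run", "test"):
    print(dbt_cli_command(verb, "/path/to/your/dbt/project"))
```

If a command exits non-zero (for example, a failing `dbt test`), the operator raises and the Airflow task fails, which is why `retries=0` is set on the test task above.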
