Monte Carlo Data Airflow Provider

0.3.11 · active · verified Thu Apr 16

Monte Carlo's Apache Airflow provider integrates Monte Carlo with Airflow, enabling data observability features such as incident alerts, lineage visibility, and pipeline control via circuit breakers. It is compatible with Apache Airflow 1.10.14 or later, requires Python 3.7 or later, and is actively maintained with regular updates.

Install

Install the provider from PyPI (the distribution name mirrors the import package, airflow_mcd):

pip install airflow-mcd

Imports

The callbacks used in the quickstart below come from the provider's callbacks module:

from airflow_mcd.callbacks import mcd_callbacks
Quickstart

This quickstart demonstrates how to integrate Monte Carlo callbacks into an Apache Airflow DAG. The `mcd_callbacks.dag_callbacks` and `mcd_callbacks.task_callbacks` mappings expand into the DAG and operator keyword arguments, automatically sending webhooks to Monte Carlo on DAG and task events (success, failure, and so on) to provide observability. For full functionality, Monte Carlo API credentials must be configured as an Airflow connection.
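As an illustration of the credentials step, one option is Airflow's standard `AIRFLOW_CONN_<CONN_ID>` environment-variable convention. This is only a sketch: the connection id (`mcd_default_session`) and the URI scheme are assumptions here, so check Monte Carlo's documentation for the exact values your deployment expects.

```python
import os

# Airflow resolves connections from AIRFLOW_CONN_<CONN_ID> environment
# variables. Both the connection id and the "mcd://" URI scheme below are
# assumptions for illustration only; substitute the values from your
# Monte Carlo account settings.
conn_id = "mcd_default_session"
conn_uri = "mcd://:<your-mcd-api-token>@"  # API token in the password slot

os.environ[f"AIRFLOW_CONN_{conn_id.upper()}"] = conn_uri
```

Defining the connection through the Airflow UI or `airflow connections add` works just as well; the environment variable is simply the easiest form to version-control.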

from __future__ import annotations

import pendulum

from airflow.models.dag import DAG
from airflow.operators.bash import BashOperator

from airflow_mcd.callbacks import mcd_callbacks

with DAG(
    dag_id="monte_carlo_example_dag",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["monte_carlo", "example"],
    **mcd_callbacks.dag_callbacks, # Apply broad DAG-level callbacks
) as dag:
    start_task = BashOperator(
        task_id="start_task",
        bash_command="echo 'Starting DAG'",
        **mcd_callbacks.task_callbacks, # Apply broad Task-level callbacks
    )

    process_data = BashOperator(
        task_id="process_data",
        bash_command="echo 'Processing data...'; sleep 5",
        # You can override specific callbacks if needed:
        # on_failure_callback=mcd_callbacks.mcd_task_failure_callback,
        **mcd_callbacks.task_callbacks,
    )

    end_task = BashOperator(
        task_id="end_task",
        bash_command="echo 'DAG finished'",
        **mcd_callbacks.task_callbacks,
    )

    start_task >> process_data >> end_task
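The `**` expansions above work because the callback collections are plain dictionaries of keyword arguments, unpacked into the DAG and operator constructors. The following Airflow-free sketch shows the same mechanism with hypothetical stand-in callbacks and a minimal stand-in operator; the real keys and callables come from `mcd_callbacks`.

```python
# Stand-ins for Monte Carlo callbacks (hypothetical; the provider supplies
# its own callables that post webhooks to Monte Carlo).
def on_success(context):
    return f"success: {context['task_id']}"

def on_failure(context):
    return f"failure: {context['task_id']}"

# Analogous in shape to mcd_callbacks.task_callbacks: a mapping from
# callback keyword-argument names to callables.
task_callbacks = {
    "on_success_callback": on_success,
    "on_failure_callback": on_failure,
}

class Operator:
    """Minimal stand-in for an Airflow operator accepting callback kwargs."""
    def __init__(self, task_id, on_success_callback=None, on_failure_callback=None):
        self.task_id = task_id
        self.on_success_callback = on_success_callback
        self.on_failure_callback = on_failure_callback

# **task_callbacks expands the dict into keyword arguments, exactly as in
# the quickstart DAG above.
op = Operator(task_id="process_data", **task_callbacks)
```

Because the collections are ordinary dicts, you can also copy one and override a single entry before expanding it, which is what the commented-out `on_failure_callback` override in the quickstart hints at.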
