Microsoft Fabric Provider for Apache Airflow

0.0.9 · active · verified Thu Apr 09

A Python package that helps data and analytics engineers trigger on-demand job runs for Microsoft Fabric items from Apache Airflow DAGs. It enables orchestration of Fabric items such as Notebooks, Pipelines, Spark job definitions, and Semantic Model refreshes. The current version is 0.0.9; as a newly developed provider, it is expected to receive frequent updates.

Warnings

This provider is in early development (0.0.x); expect frequent releases and the possibility of breaking changes between versions.

Install
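
The listing does not include install instructions. Assuming the distribution name follows the import path (`airflow.providers.microsoft.fabric`), installation would look like the following; verify the exact package name on PyPI before running it.

```shell
# Package name inferred from the import path -- confirm the actual
# distribution name on PyPI before installing.
pip install apache-airflow-providers-microsoft-fabric
```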

Imports

The operator used in this guide, `MSFabricRunItemOperator`, is imported from `airflow.providers.microsoft.fabric.operators.run_item`, as shown in the quickstart below.

Quickstart

A basic Apache Airflow DAG demonstrating how to use the `MSFabricRunItemOperator` to trigger a Microsoft Fabric notebook or pipeline. Ensure your Airflow environment has a 'Generic' connection whose ID matches `fabric_conn_id`, configured with Microsoft Entra ID (formerly Azure Active Directory) service principal credentials: the Client ID in Login, the Refresh Token in Password, and the Tenant ID, Client Secret, and Scopes in Extra.

from __future__ import annotations

import os

import pendulum

from airflow.models.dag import DAG
from airflow.providers.microsoft.fabric.operators.run_item import MSFabricRunItemOperator

FABRIC_WORKSPACE_ID = os.environ.get("FABRIC_WORKSPACE_ID", "your_workspace_id")
FABRIC_ITEM_ID = os.environ.get("FABRIC_ITEM_ID", "your_item_id")  # e.g. a Notebook or Pipeline ID
FABRIC_CONN_ID = os.environ.get("FABRIC_CONN_ID", "fabric_default")  # Airflow connection ID

with DAG(
    dag_id="fabric_run_item_example",
    start_date=pendulum.datetime(2023, 10, 26, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["microsoft", "fabric", "etl"],
) as dag:
    run_fabric_notebook = MSFabricRunItemOperator(
        task_id="run_fabric_notebook_task",
        workspace_id=FABRIC_WORKSPACE_ID,
        item_id=FABRIC_ITEM_ID,
        fabric_conn_id=FABRIC_CONN_ID,
        job_type="RunNotebook", # or "Pipeline", "SparkJobDefinition", "SemanticModel" etc.
        wait_for_termination=True,
        timeout=60 * 60, # 1 hour timeout
    )
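
The quickstart assumes the connection already exists. One way to provide it without the Airflow UI is the documented `AIRFLOW_CONN_<CONN_ID>` environment-variable mechanism (JSON form, supported since Airflow 2.3). A minimal sketch follows; the `extra` key names and the scope URL are assumptions, not confirmed by this provider's documentation.

```python
import json
import os

# Sketch of supplying the 'Generic' connection described above via an
# environment variable, using Airflow's JSON connection format.
# The "extra" key names (tenantId, clientSecret, scopes) and the scope URL
# are assumptions -- check the provider's docs for the exact keys it expects.
connection = {
    "conn_type": "generic",
    "login": "<client-id>",         # service principal Client ID
    "password": "<refresh-token>",  # refresh token
    "extra": {
        "tenantId": "<tenant-id>",
        "clientSecret": "<client-secret>",
        "scopes": "https://api.fabric.microsoft.com/.default",  # assumed scope
    },
}

# Airflow 2.3+ parses JSON-serialized connections from AIRFLOW_CONN_* variables.
os.environ["AIRFLOW_CONN_FABRIC_DEFAULT"] = json.dumps(connection)
```

Set the variable in the environment of every Airflow component (scheduler and workers), not just the webserver, so tasks can resolve the connection at run time.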
