Apache Airflow Microsoft Fabric Plugin

1.0.3 · active · verified Thu Apr 09

The `apache-airflow-microsoft-fabric-plugin` provides an operator and hook for interacting with Microsoft Fabric items, such as Lakehouses, Notebooks, Data Factory pipelines, and Data Warehouses, directly from Apache Airflow. It enables running Spark jobs (such as notebooks) within Fabric. The current version is 1.0.3, and it is a community-driven project updated as needed.

Warnings

Install

Install the plugin from PyPI into the environment where your Airflow workers run:

pip install apache-airflow-microsoft-fabric-plugin

Imports

from apache_airflow_microsoft_fabric_plugin.operators.fabric import FabricRunSparkJobOperator

Quickstart

This example DAG demonstrates how to use the `FabricRunSparkJobOperator` to execute a Microsoft Fabric Spark Notebook. Before running, configure an Airflow Connection with `Conn Id: azure_fabric_default` (or your custom ID), `Conn Type: Microsoft Fabric`, and necessary authentication details (e.g., `tenant_id`, `client_id`, `client_secret` for service principal, or `managed_identity_client_id` for Managed Identity) in its `Extra` JSON field. Replace placeholder IDs with your actual Fabric Workspace, Lakehouse (if applicable), and Notebook IDs.
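Instead of creating the connection in the Airflow UI, it can also be supplied as an environment variable in JSON form (supported by Airflow 2.3+). The sketch below is a minimal example of that approach; the `conn_type` string and the extra-field names are assumptions based on the connection fields described above, not confirmed plugin identifiers.

```python
# Sketch: defining the 'azure_fabric_default' connection via an environment
# variable in Airflow's JSON connection format (Airflow 2.3+).
# The conn_type value and extra-field keys are assumptions from the text above.
import json
import os

conn = {
    "conn_type": "microsoft_fabric",  # assumed key for the 'Microsoft Fabric' type
    "extra": {
        "tenant_id": "your-tenant-id",          # placeholder values; in practice
        "client_id": "your-client-id",          # source these from a secrets
        "client_secret": "your-client-secret",  # backend, not plain env vars
    },
}

# Airflow resolves AIRFLOW_CONN_<CONN_ID> (upper-cased) at connection lookup time.
os.environ["AIRFLOW_CONN_AZURE_FABRIC_DEFAULT"] = json.dumps(conn)
```

For Managed Identity, the `extra` dict would instead carry `managed_identity_client_id`, per the quickstart text.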

from __future__ import annotations

import os
import pendulum

from airflow.models.dag import DAG
from apache_airflow_microsoft_fabric_plugin.operators.fabric import FabricRunSparkJobOperator

# Configure your Airflow connection 'azure_fabric_default' with type 'Microsoft Fabric'
# and extra fields like {"tenant_id": "...", "client_id": "...", "client_secret": "..."}
# You can use environment variables for sensitive data in real scenarios.

with DAG(
    dag_id="microsoft_fabric_notebook_execution_dag",
    schedule=None,
    start_date=pendulum.datetime(2023, 10, 26, tz="UTC"),
    catchup=False,
    tags=["microsoft_fabric", "notebook"],
) as dag:
    run_spark_job = FabricRunSparkJobOperator(
        task_id="run_spark_notebook_task",
        fabric_conn_id="azure_fabric_default", # Ensure this matches your Airflow connection ID
        workspace_id=os.environ.get("FABRIC_WORKSPACE_ID", "your_fabric_workspace_id"),
        lakehouse_id=os.environ.get("FABRIC_LAKEHOUSE_ID", "your_fabric_lakehouse_id"), # Optional, if notebook interacts with a specific lakehouse
        notebook_id=os.environ.get("FABRIC_NOTEBOOK_ID", "your_fabric_notebook_id"),
        job_parameters={
            "param1": "airflow_run",
            "dag_run_id": "{{ dag_run.run_id }}"
        } # Optional: pass parameters to your notebook
    )
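Since the DAG above has `schedule=None`, it only runs when triggered. Besides the UI and CLI, a run can be triggered programmatically through Airflow's stable REST API; the sketch below builds such a request, assuming a local webserver at `localhost:8080` (the `conf` keys mirror the `job_parameters` used above).

```python
# Sketch: triggering the DAG via Airflow's stable REST API
# (POST /api/v1/dags/{dag_id}/dagRuns). Host, port, and auth are assumptions.
import json
import urllib.request

payload = {
    "dag_run_id": "manual__example",
    "conf": {"param1": "airflow_run"},  # surfaced to tasks as dag_run.conf
}
req = urllib.request.Request(
    "http://localhost:8080/api/v1/dags/microsoft_fabric_notebook_execution_dag/dagRuns",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to send; real deployments also need an auth header
```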
