Apache Airflow Microsoft Azure Provider

13.1.0 · active · verified Sat Apr 11

The `apache-airflow-providers-microsoft-azure` package provides Apache Airflow hooks, operators, and sensors for integrating with Microsoft Azure services, including Blob Storage, Data Lake Storage Gen2, Cosmos DB, and SQL Database. Currently at version 13.1.0, the provider follows the Apache Airflow project's release cadence, receiving regular updates and new features as part of the broader Airflow ecosystem. It requires Python >= 3.10.

Install

pip install apache-airflow-providers-microsoft-azure

Imports

from airflow.providers.microsoft.azure.hooks.wasb import WasbHook
from airflow.providers.microsoft.azure.operators.wasb_delete_blob import WasbDeleteBlobOperator
from airflow.providers.microsoft.azure.sensors.wasb import WasbBlobSensor

Quickstart

This example uses the `WasbHook` from a TaskFlow task to list the blobs in a specified Azure Blob Storage container. It requires an Airflow connection named `wasb_default` (the hook's default) pointing at your Azure storage account with appropriate authentication credentials (e.g., account key, SAS token, service principal, or managed identity). Remember to replace `"your-container-name"` with an actual container in your storage account.

from __future__ import annotations

import pendulum

from airflow.decorators import task
from airflow.models.dag import DAG
from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

# Configure an Airflow connection named 'wasb_default' (the hook's default).
# For example, in Airflow UI: Admin -> Connections -> Add a new Connection
# Conn Id: wasb_default
# Conn Type: Azure Blob Storage
# Login: <storage-account-name>, Password: <account-key>
# (or supply a SAS token / service-principal details via the Extra field)
# Alternatively, define the connection with the AIRFLOW_CONN_WASB_DEFAULT
# environment variable.

with DAG(
    dag_id="azure_blob_storage_list_example",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
    catchup=False,
    tags=["azure", "blob_storage", "example"],
) as dag:

    @task
    def list_blobs_in_container() -> list[str]:
        """Return the names of the blobs in the container."""
        hook = WasbHook(wasb_conn_id="wasb_default")
        return hook.get_blobs_list(
            container_name="your-container-name",  # Replace with an actual container name
            # prefix="my-folder/",  # Optional: restrict to blobs under this prefix
        )

    list_blobs_in_container()
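If you prefer to configure the connection through an environment variable rather than the UI, Airflow reads any variable named `AIRFLOW_CONN_<CONN_ID>` as a connection URI. A minimal sketch of building such a URI, assuming placeholder credentials (`mystorageaccount` and the key below are illustrative only, not real values):

```python
from urllib.parse import quote

# Placeholder credentials for illustration only.
account_name = "mystorageaccount"
account_key = "s3cr3t/key+with=specials"

# Airflow connection URI format: <conn-type>://<login>:<password>@<host>
# Percent-encode both parts so reserved characters survive URI parsing.
uri = f"wasb://{quote(account_name, safe='')}:{quote(account_key, safe='')}@"
print(uri)
# -> wasb://mystorageaccount:s3cr3t%2Fkey%2Bwith%3Dspecials@

# Export it before starting Airflow so WasbHook can pick it up, e.g.:
#   export AIRFLOW_CONN_WASB_DEFAULT="wasb://mystorageaccount:s3cr3t%2Fkey%2Bwith%3Dspecials@"
```

Percent-encoding matters here because storage account keys routinely contain `/`, `+`, and `=`, which would otherwise be misparsed as URI delimiters.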
