{"id":3875,"library":"apache-airflow-providers-microsoft-azure","title":"Apache Airflow Microsoft Azure Provider","description":"The `apache-airflow-providers-microsoft-azure` package provides Apache Airflow hooks, operators, and sensors for integrating with Microsoft Azure services, including Blob Storage, Data Lake Storage Gen2, Cosmos DB, and SQL Database. Currently at version 13.1.0, the provider follows the Apache Airflow project's release cadence and receives regular updates as part of the broader Airflow ecosystem. It requires Python >= 3.10.","status":"active","version":"13.1.0","language":"en","source_language":"en","source_url":"https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/index.html","tags":["airflow","azure","provider","cloud","microsoft"],"install":[{"cmd":"pip install apache-airflow-providers-microsoft-azure","lang":"bash","label":"Base Installation"},{"cmd":"pip install 'apache-airflow-providers-microsoft-azure[sftp]' # Optional cross-provider extra, e.g. for SFTP-to-WASB transfer operators","lang":"bash","label":"With Cross-Provider Extras"}],"dependencies":[{"reason":"Core Apache Airflow functionality, required for all provider packages.","package":"apache-airflow","optional":false},{"reason":"Required for Azure Blob Storage (WASB) hooks and operators; installed as a core dependency of the provider.","package":"azure-storage-blob","optional":false},{"reason":"Required for Azure Resource Manager operations. 
Install separately with `pip install azure-mgmt-resource` if your deployment needs it.","package":"azure-mgmt-resource","optional":true}],"imports":[{"note":"Airflow 1.x `contrib` path was deprecated and removed in Airflow 2.0+.","wrong":"from airflow.contrib.hooks.wasb_hook import WasbHook","symbol":"WasbHook","correct":"from airflow.providers.microsoft.azure.hooks.wasb import WasbHook"},{"note":"Airflow 1.x placed this operator under `contrib` before provider packages were standardized.","wrong":"from airflow.contrib.operators.wasb_delete_blob_operator import WasbDeleteBlobOperator","symbol":"WasbDeleteBlobOperator","correct":"from airflow.providers.microsoft.azure.operators.wasb_delete_blob import WasbDeleteBlobOperator"},{"symbol":"AzureDataLakeStorageV2Hook","correct":"from airflow.providers.microsoft.azure.hooks.data_lake import AzureDataLakeStorageV2Hook"}],"quickstart":{"code":"from __future__ import annotations\n\nimport pendulum\n\nfrom airflow.decorators import task\nfrom airflow.models.dag import DAG\nfrom airflow.providers.microsoft.azure.hooks.wasb import WasbHook\n\n# Configure an Airflow connection named 'wasb_default'\n# (Conn Type: Azure Blob Storage), for example in the Airflow UI:\n# Admin -> Connections -> Add a new Connection\n# Conn Id: wasb_default\n# Login: <your-storage-account-name>\n# Password: <your-storage-account-key>\n# A SAS token, connection string, or service principal can be supplied\n# via the Extra field instead; see the provider's connection docs.\n# Or use an environment variable such as AIRFLOW_CONN_WASB_DEFAULT.\n\nwith DAG(\n    dag_id=\"azure_blob_storage_list_example\",\n    start_date=pendulum.datetime(2023, 1, 1, tz=\"UTC\"),\n    schedule=None,\n    catchup=False,\n    tags=[\"azure\", \"blob_storage\", \"example\"],\n) as dag:\n\n    @task\n    def list_blobs_in_container() -> list[str]:\n        hook = WasbHook(wasb_conn_id=\"wasb_default\")\n        # Replace with an actual container in your storage account.\n        # Optional: pass prefix=\"my-folder/\" to narrow the listing.\n        return hook.get_blobs_list(container_name=\"your-container-name\")\n\n    list_blobs_in_container()\n","lang":"python","description":"This example uses `WasbHook.get_blobs_list` inside a `@task`-decorated function to list blobs within a specified container in Azure Blob Storage. It requires an Airflow connection named `wasb_default` pointing at your storage account with appropriate authentication credentials (e.g., Account Key, SAS Token, Service Principal, or Managed Identity). Remember to replace `\"your-container-name\"` with an actual container in your storage account."},"warnings":[{"fix":"Update all import statements from `airflow.contrib` to the `airflow.providers.microsoft.azure` package structure. For example, `from airflow.contrib.hooks.wasb_hook import WasbHook` becomes `from airflow.providers.microsoft.azure.hooks.wasb import WasbHook`.","message":"Migration from Airflow 1.x `airflow.contrib` modules to Airflow 2.x+ provider packages: all Azure-related hooks, operators, and sensors moved from `airflow.contrib.hooks.wasb_hook`, `airflow.contrib.operators.wasb_delete_blob_operator`, etc., to `airflow.providers.microsoft.azure.*` paths.","severity":"breaking","affected_versions":"Airflow 1.x to 2.x upgrades"},{"fix":"Review the provider's connection documentation for the specific Azure service and authentication method you use, and ensure the connection fields and 'Extra' JSON match the expected parameters (e.g., a service principal supplies a tenant ID alongside the client ID and secret).","message":"Azure connection authentication methods are complex and often misconfigured. 
The provider supports Service Principal (tenant ID, client ID, client secret), Managed Identity, Account Key, and SAS Token. The method used depends on which connection fields and 'Extra' keys are populated.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Reinstall or upgrade the provider (`pip install --upgrade apache-airflow-providers-microsoft-azure`) so its required `azure-*` client libraries are pulled in; refer to the provider documentation for the full dependency list and for optional cross-provider extras such as `[sftp]`.","message":"The Azure client libraries (e.g., `azure-storage-blob`, `azure-storage-file-datalake`, `azure-cosmos`) are required dependencies of the provider, not opt-in extras. A `ModuleNotFoundError` for an `azure-*` client library usually means the provider is missing from the environment Airflow actually runs in, or was installed with `--no-deps`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"If a DAG relied on a per-operator `xcom_push=True` argument, switch to the `do_xcom_push` argument on `BaseOperator` (default `True`), or retrieve the operator's return value directly. Most operators push their significant output as the `return_value` XCom, accessible via `{{ task_instance.xcom_pull(task_ids='my_task_id') }}`.","message":"Airflow 2.0 changed the `BaseOperator` interface, including how XComs are handled. The old per-operator `xcom_push=True` argument was removed in favor of `do_xcom_push` on `BaseOperator` and the `return_value` mechanism.","severity":"breaking","affected_versions":"Airflow 2.0+"},{"fix":"Ensure the provider (which pulls in `azure-storage-file-datalake`) is installed, and use `AzureDataLakeStorageV2Hook` and the related ADLS classes for Gen2. 
If still working with ADLS Gen1, use the legacy `AzureDataLakeHook`, and note that Microsoft retired ADLS Gen1 in early 2024, so plan a migration to Gen2.","message":"`AzureDataLakeStorageV2Hook` and the related ADLS operators require the `azure-storage-file-datalake` and `azure-identity` client libraries, which ship as required dependencies of the provider. The older `azure-datalake-store` client for Gen1 Data Lake will not work with the Gen2 hook and operators.","severity":"gotcha","affected_versions":"All versions (specific to ADLS Gen2)"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}