Dagster Azure

0.29.0 · active · verified Tue Apr 14

dagster-azure provides a collection of Azure-specific components for the Dagster data orchestration framework, including resources for Blob Storage, Data Lake Gen2, and compute options. The current version is 0.29.0, which aligns with Dagster core 1.13.0. Dagster and its libraries typically follow a monthly release cadence for minor versions.
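The version pairing above (library 0.29.0 with core 1.13.0) is consistent with an offset of 16 between the core minor version and the library minor version. The sketch below encodes that offset as an assumption inferred from this pairing, not a documented guarantee; the helper name `library_version_for_core` is hypothetical and not part of Dagster.

```python
def library_version_for_core(core_version: str) -> str:
    """Map a Dagster core 1.x.y version to the assumed 0.(x+16).y library version."""
    major, minor, patch = (int(part) for part in core_version.split("."))
    if major != 1:
        # The offset is only assumed for the 1.x series.
        raise ValueError("offset is only assumed for Dagster core 1.x")
    return f"0.{minor + 16}.{patch}"


if __name__ == "__main__":
    # The pairing stated above: core 1.13.0 with dagster-azure 0.29.0.
    print(library_version_for_core("1.13.0"))
```

In practice, pinning both packages explicitly in a requirements file is safer than deriving one version from the other.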

Warnings

Install

Imports

Quickstart

This example defines an asset, `hello_blob_asset`, that returns a string, and configures `blob_storage_io_manager` as the default `io_manager` so the string is persisted to an Azure Blob Storage container. The IO manager handles serialization and deserialization automatically.

Set the `AZURE_STORAGE_ACCOUNT_NAME` and `AZURE_BLOB_CONTAINER_NAME` environment variables (the code falls back to placeholder values otherwise), and make sure your environment is configured for Azure authentication, e.g. via `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET`.

To run the example, save it as `repo.py`, run `dagster dev -f repo.py` in the same directory, then launch `hello_blob_job` from the UI.

import os
from dagster import Definitions, asset, define_asset_job
from dagster_azure.blob.io_manager import blob_storage_io_manager

# Set these environment variables or replace with actual values
# For authentication, 'DefaultAzureCredential' (used by dagster-azure) looks for:
# AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET or AZURE_FEDERATED_TOKEN_FILE
# or uses Managed Identity.
AZURE_STORAGE_ACCOUNT_NAME = os.environ.get(
    "AZURE_STORAGE_ACCOUNT_NAME", "your_storage_account_name"
)
AZURE_BLOB_CONTAINER_NAME = os.environ.get(
    "AZURE_BLOB_CONTAINER_NAME", "your-dagster-container"
)

@asset
def hello_blob_asset():
    """An asset that writes a simple string to Azure Blob Storage."""
    return "Hello, Dagster Azure Blob Storage!"

# Create a job that materializes the asset.
# Asset jobs are built with define_asset_job; JobDefinition takes no `assets` argument.
hello_blob_job = define_asset_job(name="hello_blob_job", selection=[hello_blob_asset])

defs = Definitions(
    assets=[hello_blob_asset],
    jobs=[hello_blob_job],
    resources={
        "io_manager": blob_storage_io_manager.configured({
            "storage_account_name": AZURE_STORAGE_ACCOUNT_NAME,
            "container": AZURE_BLOB_CONTAINER_NAME,
            "prefix": "dagster_output/" # Optional: objects will be stored under this prefix
        })
    }
)
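Because the quickstart falls back to placeholder values when the environment variables are unset, a misconfigured shell only surfaces once the job tries to reach Azure. A minimal pre-flight sketch, using only the standard library, can list what is missing first. The helper name `missing_env_vars` is hypothetical and not part of dagster-azure; the variable names are the ones mentioned in the quickstart.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variables read by the quickstart code.
STORAGE_VARS = ("AZURE_STORAGE_ACCOUNT_NAME", "AZURE_BLOB_CONTAINER_NAME")
# Service-principal variables; not needed when Managed Identity is used.
SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_env_vars(
    required: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(STORAGE_VARS + SERVICE_PRINCIPAL_VARS)
    if missing:
        print("Unset before running dagster dev: " + ", ".join(missing))
    else:
        print("All expected variables are set.")
```

Passing `env` explicitly keeps the check easy to exercise without touching the real process environment.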
