Azure Storage File DataLake Client Library

12.23.0 · active · verified Tue Mar 31

The Microsoft Azure File DataLake Storage client library for Python provides APIs for interacting with Azure Data Lake Storage Gen2, which adds a hierarchical namespace on top of Azure Blob Storage. The library lets developers manage file systems, directories, and files, including operations for creating, renaming, and deleting them and for managing access control lists (ACLs). Like other Azure SDKs, it is released frequently, typically monthly, to deliver new features, bug fixes, and support for new service API versions.

Warnings

Install
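The package is published on PyPI; a typical installation also pulls in `azure-identity`, which provides the `DefaultAzureCredential` used in the quickstart below:

```shell
pip install azure-storage-file-datalake azure-identity
```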

Imports

Quickstart

The quickstart below creates a `DataLakeServiceClient` using `DefaultAzureCredential` for authentication and then lists all file systems (containers) in the Azure Data Lake Storage Gen2 account. Set `AZURE_STORAGE_ACCOUNT_NAME` and the appropriate Azure Identity environment variables (e.g., `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`) before running it.

import os
from azure.storage.filedatalake import DataLakeServiceClient
from azure.identity import DefaultAzureCredential

# Ensure environment variables are set for authentication and account URL:
# AZURE_STORAGE_ACCOUNT_NAME: Name of your Azure Data Lake Storage Gen2 account
# AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET for DefaultAzureCredential

try:
    account_name = os.environ.get("AZURE_STORAGE_ACCOUNT_NAME")
    if not account_name:
        raise ValueError("AZURE_STORAGE_ACCOUNT_NAME environment variable not set.")

    # Construct the account URL for Data Lake Storage Gen2
    # Note: .dfs.core.windows.net is used for Data Lake Storage Gen2 endpoints
    account_url = f"https://{account_name}.dfs.core.windows.net"

    # Authenticate using DefaultAzureCredential (recommended for production)
    # DefaultAzureCredential tries various authentication methods, including environment variables,
    # managed identity, Azure CLI, etc.
    credential = DefaultAzureCredential()

    # Create a DataLakeServiceClient
    service_client = DataLakeServiceClient(account_url, credential=credential)

    print(f"Listing file systems in account: {account_name}")
    file_systems = service_client.list_file_systems()
    for fs in file_systems:
        print(f"- {fs.name}")

except Exception as e:
    print(f"An error occurred: {e}")
    print("Please ensure AZURE_STORAGE_ACCOUNT_NAME and authentication credentials (e.g., AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET for service principal) are correctly configured.")
