Azure Batch Client Library for Python
The `azure-batch` client library for Python lets you configure compute nodes and pools, define tasks, and manage jobs for large-scale parallel and high-performance computing (HPC) applications in Azure. The current stable version is 14.2.0. Development continues as part of the Azure SDK for Python, whose packages typically ship on a roughly monthly release cadence.
Warnings
- breaking Versions 15.x and above introduce significant changes and improvements over v14.x and below. A migration guide is available in the README of the official GitHub repository.
- breaking The lifetime statistics API was removed in version 14.0.0. Specifically, `job.get_all_lifetime_statistics` and `pool.get_all_lifetime_statistics` are no longer supported.
- deprecated `CertificateOperations`-related methods were deprecated in version 14.0.0 and were scheduled for removal after February 2024. Use the Azure Key Vault extension instead.
- gotcha Microsoft Entra ID (formerly Azure AD) authentication using `azure-identity` and `DefaultAzureCredential` is strongly recommended for `azure-batch`. Some Batch capabilities require this, and Batch account API authentication can be restricted to only Microsoft Entra ID, rejecting shared key authentication.
- gotcha When installing packages on Batch compute nodes through a StartTask, make sure the Python environment is set up correctly. Common issues include `pip` not being recognized and the wrong Python version being picked up.
- gotcha The default task retention time for all tasks was changed from infinite to 7 days. This means task-related files and stdout/stderr logs will only be available for 7 days by default.
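The StartTask gotcha above can be sketched with a small helper that builds the command line. This is an illustrative sketch, not part of `azure-batch`: the helper name `build_start_task_command`, the `python3` interpreter name, and the package names are assumptions. It relies on two ideas: calling `pip` through the interpreter (`python3 -m pip`) so a missing `pip` executable on `PATH` does not matter, and wrapping the command in an explicit shell, because Batch does not run task command lines under a shell on its own.

```python
import shlex


def build_start_task_command(packages, python="python3"):
    """Return a Linux command line that installs packages with a specific interpreter.

    Hypothetical helper for illustration; it only builds a string.
    """
    if not packages:
        raise ValueError("at least one package is required")
    interpreter = shlex.quote(python)
    quoted_packages = " ".join(shlex.quote(p) for p in packages)
    # Print the interpreter version first so a wrong Python shows up in stdout.
    # `python -m pip` avoids depending on a `pip` executable being on PATH.
    script = f"{interpreter} --version && {interpreter} -m pip install {quoted_packages}"
    # Batch does not run command lines under a shell, so invoke one explicitly.
    return f"/bin/bash -c {shlex.quote(script)}"


command_line = build_start_task_command(["numpy", "azure-storage-blob"])
print(command_line)
```

The resulting string is what you would pass as the StartTask command line on a Linux pool. Windows pools would need `cmd /c` instead of `/bin/bash -c`.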
Install
- Latest stable release
pip install azure-batch azure-identity
- Pinned pre-release (15.1.0b3)
pip install azure-batch==15.1.0b3 azure-identity
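Because 14.x and 15.x differ significantly, it can help to check which version actually ended up in the environment before running any code. The following is a minimal sketch using only the standard library. The helper names `major_version` and `installed_azure_batch_version` are made up for this example and are not part of the SDK.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '15.1.0b3'."""
    return int(version_string.split(".", 1)[0])


def installed_azure_batch_version():
    """Return the installed azure-batch version string, or None if it is absent."""
    try:
        return metadata.version("azure-batch")
    except metadata.PackageNotFoundError:
        return None


version = installed_azure_batch_version()
if version is None:
    print("azure-batch is not installed")
elif major_version(version) >= 15:
    print(f"azure-batch {version}: 15.x API surface, see the migration guide")
else:
    print(f"azure-batch {version}: 14.x-or-earlier API surface")
```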
Imports
- BatchServiceClient
from azure.batch import BatchServiceClient
- models
from azure.batch import models
- DefaultAzureCredential
from azure.identity import DefaultAzureCredential
- SharedKeyCredentials
from azure.batch.batch_auth import SharedKeyCredentials
Quickstart
import os

from azure.batch import BatchServiceClient
from azure.identity import DefaultAzureCredential

# Retrieve Batch account details from environment variables
batch_account_url = os.environ.get("AZURE_BATCH_ACCOUNT_URL", "https://<your-batch-account>.westus.batch.azure.com")

# Authenticate using DefaultAzureCredential (recommended; uses Microsoft Entra ID, formerly Azure AD)
# This will attempt to authenticate via environment variables, managed identity, etc.
try:
    credential = DefaultAzureCredential()
    batch_client = BatchServiceClient(credential, batch_url=batch_account_url)

    # Example: List existing pools
    # pool.list() returns a paged iterator, so materialize it before calling len()
    pools = list(batch_client.pool.list())
    print(f"Successfully connected to Azure Batch. Found {len(pools)} pools.")
    for pool in pools:
        print(f"  - Pool ID: {pool.id}, VM Size: {pool.vm_size}")
except Exception as e:
    print(f"Error connecting to Azure Batch: {e}")
    print("Please ensure AZURE_BATCH_ACCOUNT_URL is set and your environment is authenticated (e.g., via Azure CLI).")

# For shared key authentication (less recommended, but available):
# batch_account_name = os.environ.get("AZURE_BATCH_ACCOUNT_NAME", "<your-batch-account-name>")
# batch_account_key = os.environ.get("AZURE_BATCH_ACCOUNT_KEY", "<your-batch-account-key>")
# if batch_account_name and batch_account_key:
#     from azure.batch.batch_auth import SharedKeyCredentials
#     creds = SharedKeyCredentials(batch_account_name, batch_account_key)
#     batch_client_shared_key = BatchServiceClient(creds, batch_url=batch_account_url)
#     print("Successfully connected with Shared Key credentials.")
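The quickstart falls back to a placeholder URL when `AZURE_BATCH_ACCOUNT_URL` is unset, which tends to surface later as a confusing connection error. One way to fail earlier is to validate the setting before constructing the client. The sketch below is an assumption-laden illustration: `resolve_batch_url` is a hypothetical helper and `mybatch` is a made-up account name.

```python
import os


def resolve_batch_url(env=os.environ):
    """Return the Batch account URL, or raise if it is missing or still a placeholder."""
    url = env.get("AZURE_BATCH_ACCOUNT_URL", "").strip()
    if not url:
        raise ValueError("AZURE_BATCH_ACCOUNT_URL is not set")
    if "<" in url or ">" in url:
        raise ValueError(f"AZURE_BATCH_ACCOUNT_URL still contains a placeholder: {url}")
    if not url.startswith("https://"):
        raise ValueError("AZURE_BATCH_ACCOUNT_URL must start with https://")
    # Drop a trailing slash so the value is consistent wherever it is used.
    return url.rstrip("/")


# Use an explicit mapping so the example runs without touching the real environment.
example_env = {"AZURE_BATCH_ACCOUNT_URL": "https://mybatch.westus.batch.azure.com/"}
print(resolve_batch_url(example_env))
```

In the quickstart, the returned value would take the place of `batch_account_url`.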