Azure Batch Client Library for Python
The `azure-batch` client library for Python enables users to configure compute nodes and pools, define tasks, and manage jobs for large-scale parallel and high-performance computing (HPC) applications in Azure. The current stable version is 14.2.0, with ongoing development as part of the broader Azure SDK for Python, which typically sees monthly releases for various packages.
Common errors
-
ModuleNotFoundError: No module named 'azure.batch'
cause: The `azure-batch` package is not installed in the active Python environment.
fix: Install the package with pip: `pip install azure-batch`.
-
AttributeError: 'BatchServiceClient' object has no attribute 'job_operations'
cause: In 14.x, `BatchServiceClient` exposes its operation groups as short attributes such as `job`, `pool`, and `task`; there is no `job_operations` attribute.
fix: Use `batch_service_client.job` instead of `batch_service_client.job_operations`.
-
TypeError: __init__() missing 1 required positional argument: 'batch_url'
cause: The `batch_url` parameter is required when initializing `BatchServiceClient`.
fix: Pass the Batch account URL when creating the client: `BatchServiceClient(credentials, batch_url)`.
-
ImportError: cannot import name 'BatchServiceClient' from 'azure.batch.models'
cause: The class was imported from the wrong module; `BatchServiceClient` is exported by `azure.batch`, not by `azure.batch.models`.
fix: Use `from azure.batch import BatchServiceClient` instead of `from azure.batch.models import BatchServiceClient`.
-
ValueError: The 'id' parameter is required and cannot be None.
cause: The `id` parameter is missing or set to None when creating a job or pool.
fix: Pass a non-None `id` when creating a job or pool.
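The last error above can be caught before any request is sent. The following is a minimal sketch, not part of the library: `validate_batch_id` is a hypothetical helper, and the character rule it enforces (letters, digits, hyphens, and underscores, at most 64 characters) is an assumption based on the Batch service's documented ID constraints, so verify it against the service version you target.

```python
import re

# Assumed ID rule: alphanumeric characters, hyphens, underscores; 1 to 64 characters.
_BATCH_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_batch_id(value, kind="job"):
    """Return the ID unchanged if it looks valid, otherwise raise ValueError.

    Hypothetical helper: call it before building a job or pool definition.
    """
    if value is None:
        raise ValueError(f"The {kind} 'id' is required and cannot be None.")
    if not isinstance(value, str) or not _BATCH_ID_PATTERN.fullmatch(value):
        raise ValueError(
            f"Invalid {kind} id {value!r}: expected 1-64 characters from "
            "letters, digits, '-' and '_'."
        )
    return value


job_id = validate_batch_id("nightly-render_01")
print(f"Using job id: {job_id}")
```

Running the check locally gives a clearer message than a failed service call, at the cost of duplicating a rule the service already enforces.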
Warnings
- breaking Versions 15.x and above introduce significant changes relative to 14.x and below. A migration guide is available in the README of the official GitHub repository.
- breaking The lifetime statistics API was removed in version 14.0.0: `job.get_all_lifetime_statistics` and `pool.get_all_lifetime_statistics` are no longer supported.
- deprecated `CertificateOperations`-related methods were deprecated in version 14.0.0 and scheduled for removal after February 2024. Use the Azure Key Vault extension instead.
- gotcha Microsoft Entra ID (formerly Azure AD) authentication using `azure-identity` and `DefaultAzureCredential` is strongly recommended for `azure-batch`. Some Batch capabilities require this, and Batch account API authentication can be restricted to only Microsoft Entra ID, rejecting shared key authentication.
- gotcha When installing packages on Batch compute nodes, if using StartTask, ensure the Python environment is correctly set up. Common issues include `pip` not being recognized or incorrect Python versions.
- gotcha The default task retention time for all tasks was changed from infinite to 7 days. This means task-related files and stdout/stderr logs will only be available for 7 days by default.
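If the 7-day default retention is too short, the retention time can be set explicitly per task. The following is a minimal sketch assuming azure-batch 14.x, where `models.TaskConstraints` accepts a `retention_time` timedelta and `models.TaskAddParameter` accepts `constraints`. The helper names `retention_period` and `build_task` and the 30-day default are hypothetical.

```python
import datetime


def retention_period(days):
    """Convert a number of days into the timedelta used for retention_time."""
    if days <= 0:
        raise ValueError("retention period must be a positive number of days")
    return datetime.timedelta(days=days)


def build_task(task_id, command_line, retention_days=30):
    """Build a task definition whose files are retained for retention_days.

    Hypothetical helper. azure.batch is imported lazily so this module can be
    loaded (and retention_period used) without azure-batch installed.
    """
    from azure.batch import models  # assumes azure-batch 14.x

    constraints = models.TaskConstraints(
        retention_time=retention_period(retention_days)
    )
    return models.TaskAddParameter(
        id=task_id,
        command_line=command_line,
        constraints=constraints,
    )


print(retention_period(30))
```

The resulting task object would then be submitted through the client's task operation group. Anything that must outlive the retention window should be copied to durable storage before it expires.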
Install
-
pip install azure-batch azure-identity
-
pip install azure-batch==15.1.0b3 azure-identity
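Because 14.x and 15.x differ significantly, it helps to confirm which generation actually ended up in the environment. The following is a minimal sketch that uses only the standard library; the helper names `installed_version` and `api_generation` are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name="azure-batch"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def api_generation(version):
    """Classify a version string by major version (hypothetical helper)."""
    if version is None:
        return "not installed"
    major = int(version.split(".")[0])
    return "15.x or later" if major >= 15 else "14.x or earlier"


version = installed_version()
print(f"azure-batch: {version} ({api_generation(version)})")
```

A `not installed` result here corresponds to the `ModuleNotFoundError` case listed under common errors.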
Imports
- BatchServiceClient
from azure.batch import BatchServiceClient
(avoid the older module path: from azure.batch.batch_service_client import BatchServiceClient)
- models
from azure.batch import models
(or import specific classes: from azure.batch.models import ...)
- DefaultAzureCredential
from azure.identity import DefaultAzureCredential
- SharedKeyCredentials
from azure.batch.batch_auth import SharedKeyCredentials
Quickstart
import os

from azure.batch import BatchServiceClient
from azure.identity import DefaultAzureCredential

# Retrieve the Batch account URL from an environment variable
batch_account_url = os.environ.get("AZURE_BATCH_ACCOUNT_URL", "https://<your-batch-account>.westus.batch.azure.com")

# Authenticate using DefaultAzureCredential (recommended; Microsoft Entra ID, formerly Azure AD).
# This attempts authentication via environment variables, managed identity, Azure CLI login, etc.
# Assumption: the installed azure-batch release accepts azure-identity token credentials;
# msrest-based 14.x releases may instead require a credential object exposing signed_session().
try:
    credential = DefaultAzureCredential()
    batch_client = BatchServiceClient(credential, batch_url=batch_account_url)

    # Example: list existing pools.
    # pool.list() returns a lazy paged iterator, so materialize it before calling len().
    pools = list(batch_client.pool.list())
    print(f"Successfully connected to Azure Batch. Found {len(pools)} pools.")
    for pool in pools:
        print(f"  - Pool ID: {pool.id}, VM Size: {pool.vm_size}")
except Exception as e:
    print(f"Error connecting to Azure Batch: {e}")
    print("Please ensure AZURE_BATCH_ACCOUNT_URL is set and your environment is authenticated (e.g., via Azure CLI).")

# For shared key authentication (less recommended, but available):
# batch_account_name = os.environ.get("AZURE_BATCH_ACCOUNT_NAME", "<your-batch-account-name>")
# batch_account_key = os.environ.get("AZURE_BATCH_ACCOUNT_KEY", "<your-batch-account-key>")
# if batch_account_name and batch_account_key:
#     from azure.batch.batch_auth import SharedKeyCredentials
#     creds = SharedKeyCredentials(batch_account_name, batch_account_key)
#     batch_client_shared_key = BatchServiceClient(creds, batch_url=batch_account_url)
#     print("Successfully connected with Shared Key credentials.")
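List operations such as `pool.list()` return a paged iterator rather than a list, so the result has no length and can only be consumed once. The following is a minimal sketch of one way to count and format pools in a single pass. `summarize_pools` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real pool objects so the example runs without a Batch account.

```python
from types import SimpleNamespace


def summarize_pools(pools):
    """Consume an iterable of pool-like objects once; return (count, lines).

    Each object only needs `id` and `vm_size` attributes.
    """
    lines = [f"  - Pool ID: {pool.id}, VM Size: {pool.vm_size}" for pool in pools]
    return len(lines), lines


# Stand-in data; with a real client this would be batch_client.pool.list().
fake_pools = iter(
    [
        SimpleNamespace(id="pool-a", vm_size="standard_d2s_v3"),
        SimpleNamespace(id="pool-b", vm_size="standard_d4s_v3"),
    ]
)

count, lines = summarize_pools(fake_pools)
print(f"Found {count} pools.")
print("\n".join(lines))
```

Building the lines in one pass avoids iterating the paged result twice, which would issue the listing requests again or yield nothing, depending on the iterator.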