Azure ML AutoML Client
The `azureml-train-automl-client` library is a core component of the Azure Machine Learning Python SDK v1, enabling users to automatically find the best machine learning model and its hyperparameters for various tasks like classification, regression, and forecasting. As of version 1.62.0, it supports Python versions >=3.8 and <3.12. Its release cadence is tightly coupled with the broader Azure ML SDK v1, which typically saw monthly or bi-monthly updates.
Common errors
-
ModuleNotFoundError: No module named 'azureml.train.automl'
cause The `azureml-train-automl-client` package is not installed in the current Python environment or the environment is not activated.
fix Run `pip install azureml-train-automl-client`. If using multiple environments, activate the correct one before running your script (e.g., `conda activate my_env`).
-
UserErrorException: WorkspaceNotFound: The workspace 'your_workspace_name' could not be found.
cause The provided workspace name, subscription ID, or resource group is incorrect, or the authenticated user does not have access to the workspace.
fix Verify that the workspace details (name, subscription_id, resource_group) are accurate. Ensure your Azure credentials are set up correctly and have the 'Contributor' or 'Azure Machine Learning Data Scientist' role on the workspace or its resource group.
-
UserErrorException: The provided compute target 'my_compute_cluster' could not be found or is in an invalid state.
cause The compute target specified in `AutoMLConfig` does not exist, is not running, or has been deleted.
fix Check the Azure ML workspace UI to confirm the compute target exists and is in a healthy state. Verify the name used in `AutoMLConfig(compute_target='...')` exactly matches the compute target name in your workspace.
-
TypeError: __init__ received an unexpected keyword argument 'some_parameter'
cause You are passing an unsupported parameter to `AutoMLConfig` or a related class, often due to a version mismatch between your installed `azureml-train-automl-client` and the documentation you are following.
fix Consult the official Microsoft Learn documentation for your specific `azureml-train-automl-client` version to confirm valid parameters for `AutoMLConfig`. Update your `azureml-train-automl-client` package to the latest version if you are trying to use newer features.
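One way to catch an unexpected keyword argument before the constructor raises is to compare your settings against the callable's signature. The sketch below uses only the standard library; `unsupported_kwargs` and `ExampleConfig` are hypothetical names introduced for illustration and are not part of the Azure ML SDK. The check cannot detect anything for a callable that accepts `**kwargs`, because such a callable accepts every keyword at call time.

```python
import inspect


def unsupported_kwargs(callable_obj, kwargs):
    """Return the keyword names that callable_obj's signature does not accept."""
    params = inspect.signature(callable_obj).parameters
    # A callable that declares **kwargs accepts any keyword, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(name for name in kwargs if name not in params)


class ExampleConfig:
    """Hypothetical stand-in for a config class; not part of the Azure ML SDK."""

    def __init__(self, task, primary_metric=None):
        self.task = task
        self.primary_metric = primary_metric


settings = {"task": "classification", "some_parameter": 1}
print(unsupported_kwargs(ExampleConfig, settings))
```

In a real environment you could pass the class you intend to construct in place of `ExampleConfig`, keeping the `**kwargs` limitation in mind.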
Warnings
- breaking The `azureml-train-automl-client` package is part of the Azure ML SDK v1. Microsoft has released a new v2 SDK (`azure-ai-ml`) with different APIs, import paths, and paradigms. Code written for one SDK does not run unchanged against the other, and mixing v1 and v2 components can cause incompatibilities or unexpected behavior.
- gotcha Python version compatibility for `azureml-train-automl-client` and the broader Azure ML SDK v1 can be very strict. Using an unsupported Python version (for example, Python 3.7, or 3.11 and later with older SDK versions) can lead to installation failures or runtime errors.
- gotcha AutoML runs require a healthy and accessible compute target (e.g., an Azure Machine Learning Compute Instance or Compute Cluster). If the compute target is not found, not running, or lacks sufficient resources, the experiment submission will fail.
- gotcha Input data for AutoML (training data, validation data) must be accessible from the compute target. This usually means registering it as an `azureml.core.Dataset` in the workspace and ensuring the compute target has network access to the datastore.
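The Python version constraint can be checked up front, before any SDK import fails. This is a minimal sketch that assumes the range stated for version 1.62.0 (>=3.8 and <3.12); other releases may support a different range, and `is_supported_python` is a hypothetical helper, not an SDK function.

```python
import sys

# Bounds assumed from the range stated for version 1.62.0 (>=3.8, <3.12).
# Other SDK releases may differ.
MIN_SUPPORTED = (3, 8)
MAX_EXCLUSIVE = (3, 12)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is inside the assumed range."""
    current = sys.version_info if version is None else version
    major_minor = tuple(current)[:2]
    return MIN_SUPPORTED <= major_minor < MAX_EXCLUSIVE


if not is_supported_python():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the assumed supported range"
    )
```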
Install
-
pip install azureml-train-automl-client
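To confirm that the install landed in the environment your script actually runs in, you can query the package metadata from Python. The sketch below uses the standard-library `importlib.metadata` module (available from Python 3.8); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("azureml-train-automl-client")
print(version if version else "azureml-train-automl-client is not installed")
```

A `None` result here usually points at the wrong interpreter or an environment that has not been activated.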
Imports
- AutoMLConfig
from azureml.train.automl import AutoMLConfig
- AutoMLTabularFeaturizationConfig
from azureml.train.automl.automl_config import AutoMLTabularFeaturizationConfig
- Workspace
wrong from azureml.train.automl import Workspace
correct from azureml.core import Workspace
Quickstart
import os
from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig
# Placeholder for Azure ML Workspace details
subscription_id = os.environ.get('AZURE_SUBSCRIPTION_ID', 'your_subscription_id')
resource_group = os.environ.get('AZURE_RESOURCE_GROUP', 'your_resource_group')
workspace_name = os.environ.get('AZURE_WORKSPACE_NAME', 'your_workspace_name')
# Ensure these are replaced with actual valid values or environment variables
if 'your_' in subscription_id or 'your_' in resource_group or 'your_' in workspace_name:
    print("WARNING: Please set AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, AZURE_WORKSPACE_NAME environment variables ")
    print("or replace placeholder values in the quickstart code for actual execution.")
    # Exit or use dummy data for demonstration if not intended to run live
    exit(1)
try:
    ws = Workspace.get(name=workspace_name,
                       subscription_id=subscription_id,
                       resource_group=resource_group)
    print(f"Workspace '{ws.name}' loaded successfully.")
except Exception as e:
    print(f"Could not load workspace. Error: {e}")
    # Handle workspace creation or authentication error
    exit(1)
# Create an experiment
experiment = Experiment(workspace=ws, name='automl-quickstart-experiment')
# Example: Load a registered dataset (replace with your actual dataset)
# For a real run, you'd register a dataset or use a local one.
# This is a dummy to make the code runnable for structure.
# In a real scenario, you'd load your training data like:
# training_data = Dataset.get_by_name(ws, name='my_training_dataset')
# Create a dummy AutoMLConfig (requires actual data and compute for a real run)
automl_config = AutoMLConfig(
    task='classification',
    primary_metric='accuracy',
    experiment_timeout_minutes=30,
    training_data=None,  # Replace with your actual Dataset object, e.g., training_data
    label_column_name='target_column',  # Replace with your target column name
    compute_target='cpu-cluster',  # Replace with your compute target name
    enable_early_stopping=True,
    n_cross_validations=5,
    max_concurrent_iterations=2,
    max_cores_per_iteration=-1,  # Use all available cores
    # Additional settings like featurization, blacklisting, etc. can be added
)
print("AutoMLConfig created. To run, you would submit it:")
print("run = experiment.submit(automl_config, show_output=True)")
print("run.wait_for_completion(show_output=True)")
# To run the experiment:
# run = experiment.submit(automl_config, show_output=True)
# run.wait_for_completion(show_output=True)
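
The placeholder check in the quickstart can also be written as a small function that fails when a variable is missing instead of falling back to a dummy value. This is a sketch under the assumption that the same three environment variable names are used; `load_workspace_settings` is a hypothetical helper, not an SDK function.

```python
import os

REQUIRED_VARS = (
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_WORKSPACE_NAME",
)


def load_workspace_settings(env=None):
    """Collect workspace settings, raising if any variable is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "subscription_id": env["AZURE_SUBSCRIPTION_ID"],
        "resource_group": env["AZURE_RESOURCE_GROUP"],
        "name": env["AZURE_WORKSPACE_NAME"],
    }


# Demonstration with an in-memory mapping instead of the real environment.
example_env = {
    "AZURE_SUBSCRIPTION_ID": "sub-id",
    "AZURE_RESOURCE_GROUP": "rg",
    "AZURE_WORKSPACE_NAME": "ws",
}
print(load_workspace_settings(example_env))
```

The returned keys match the keyword arguments the quickstart passes to `Workspace.get`, so the result could be unpacked as `Workspace.get(**load_workspace_settings())`.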