Azure ML AutoML Training (SDK v1)

1.62.0 · maintenance · verified Thu Apr 16

The `azureml-train-automl` package is part of the Azure Machine Learning SDK v1, designed for automatically finding the best machine learning model and its parameters. It streamlines model selection, hyperparameter tuning, and feature engineering for various ML tasks like classification, regression, and forecasting. As of v1.62.0, Azure Machine Learning SDK v1 is deprecated with support ending on June 30, 2026. Users are strongly advised to migrate to Azure Machine Learning SDK v2 for continued support and new features.

Quickstart

This quickstart demonstrates how to set up and submit an AutoML experiment using `AutoMLConfig`. It includes connecting to an Azure Machine Learning Workspace, loading sample data, and configuring basic AutoML settings for a classification task. Ensure your Azure ML Workspace is configured (via `config.json` or environment variables) and a compute target is available. For actual execution, remote AmlCompute is recommended over 'local' to avoid extensive local dependency conflicts.
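Before running the quickstart, it can help to confirm that the workspace details are actually resolvable. The following is a minimal sketch, not part of the SDK: `resolve_workspace_settings` is a hypothetical helper, and the environment variable names are the ones the quickstart itself uses for its fallback (the SDK does not read them on its own). It assumes `config.json` carries the `subscription_id`, `resource_group`, and `workspace_name` keys, and it only checks a single path, whereas `Workspace.from_config()` also searches other locations.

```python
import json
import os
from pathlib import Path

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")

# Environment variable names taken from the quickstart's fallback branch.
ENV_VARS = {
    "subscription_id": "AZUREML_SUBSCRIPTION_ID",
    "resource_group": "AZUREML_RESOURCE_GROUP",
    "workspace_name": "AZUREML_WORKSPACE_NAME",
}


def resolve_workspace_settings(config_path="config.json", env=None):
    """Return workspace settings from config.json, falling back to env vars.

    Hypothetical helper for illustration; raises ValueError if any
    required setting cannot be found in either place.
    """
    env = os.environ if env is None else env
    settings = {}

    path = Path(config_path)
    if path.is_file():
        settings.update(json.loads(path.read_text(encoding="utf-8")))

    # Fill any gaps from the environment.
    for key, var in ENV_VARS.items():
        if not settings.get(key) and env.get(var):
            settings[key] = env[var]

    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        raise ValueError(f"Missing workspace settings: {', '.join(missing)}")
    return {key: settings[key] for key in REQUIRED_KEYS}
```

If the helper returns without raising, the same three values can be passed to `Workspace.get(name=..., subscription_id=..., resource_group=...)`.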

import logging
import os
import pandas as pd
from azureml.core.workspace import Workspace
from azureml.core.experiment import Experiment
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig

# NOTE: Replace with your actual workspace details or ensure config.json is present
try:
    ws = Workspace.from_config()
    print(f"Workspace loaded: {ws.name}")
except Exception as e:
    print(f"Could not load workspace from config. Attempting environment variables. Error: {e}")
    subscription_id = os.environ.get("AZUREML_SUBSCRIPTION_ID", "YOUR_SUBSCRIPTION_ID")
    resource_group = os.environ.get("AZUREML_RESOURCE_GROUP", "YOUR_RESOURCE_GROUP")
    workspace_name = os.environ.get("AZUREML_WORKSPACE_NAME", "YOUR_WORKSPACE_NAME")
    if "YOUR_" in subscription_id + resource_group + workspace_name:
        raise ValueError("Please configure your Azure ML Workspace details via config.json or environment variables.")
    ws = Workspace.get(name=workspace_name, subscription_id=subscription_id, resource_group=resource_group)
    print(f"Workspace loaded from env: {ws.name}")

experiment_name = "automl-quickstart-exp"
experiment = Experiment(ws, experiment_name)

# Load sample data (replace with your own Dataset registration or data path)
data_url = "https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/creditcard.csv"
df = pd.read_csv(data_url)

# Upload the DataFrame to the default datastore and register it as a TabularDataset.
# (register_pandas_dataframe is marked experimental in SDK v1.)
# Alternatively, use an already registered dataset: Dataset.get_by_name(ws, name='my_dataset')
training_data = Dataset.Tabular.register_pandas_dataframe(
    df,
    target=(ws.get_default_datastore(), 'automl_creditcard'),
    name='automl_creditcard',
)

# Configure AutoML
automl_config = AutoMLConfig(
    task='classification',
    primary_metric='accuracy',
    training_data=training_data,
    label_column_name='Class',
    compute_target='local',
    experiment_timeout_minutes=15,
    max_concurrent_iterations=1,
    n_cross_validations=2,
    iterations=5,
    verbosity=logging.INFO
)

# Submit the AutoML run.
# NOTE: The 'local' compute target runs in the current environment and may require many local dependencies.
# For remote compute, configure an AmlCompute target and pass it as compute_target in AutoMLConfig.
# Uncomment the lines below once the workspace and compute target are configured.
# print("Submitting AutoML experiment...")
# run = experiment.submit(automl_config, show_output=True)
# print(f"AutoML experiment submitted: {run.id}")
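Switching between 'local' and a remote cluster usually means changing more than one setting. The sketch below collects the quickstart's `AutoMLConfig` arguments in a plain dictionary so they can be adjusted in one place. `build_automl_settings` and its `max_nodes` parameter are hypothetical names introduced here for illustration; the rule of allowing one concurrent iteration per cluster node is an assumption about a sensible default, not something the quickstart prescribes.

```python
import logging


def build_automl_settings(compute_target="local", max_nodes=1):
    """Return keyword arguments mirroring the quickstart's AutoMLConfig call.

    Hypothetical helper: `compute_target` is either the string 'local' or an
    existing compute target object; `max_nodes` is the assumed size of the
    remote cluster.
    """
    is_local = isinstance(compute_target, str) and compute_target == "local"
    return {
        "task": "classification",
        "primary_metric": "accuracy",
        "label_column_name": "Class",
        "compute_target": compute_target,
        "experiment_timeout_minutes": 15,
        # Local runs execute one iteration at a time; on a cluster,
        # allow up to one concurrent iteration per node (assumption).
        "max_concurrent_iterations": 1 if is_local else max_nodes,
        "n_cross_validations": 2,
        "iterations": 5,
        "verbosity": logging.INFO,
    }
```

The dictionary can then be unpacked into the constructor, for example `AutoMLConfig(training_data=training_data, **build_automl_settings())`, with `training_data` prepared as in the quickstart.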

