Azure ML AutoML Core Library

1.62.0 · active · verified Thu Apr 16

This package contains the common code for running AutoML experiments in Azure Machine Learning that is neither ML-specific nor Azure-specific. It is a foundational dependency of higher-level AutoML packages such as `azureml-train-automl`, and most end users never import it directly. It belongs to the broader Azure ML SDK ecosystem, whose sub-packages are typically released together on a monthly or bi-monthly cadence so that their versions stay in sync.
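
Because the sub-packages are meant to stay in sync, it can help to check which versions are actually installed. The following is a minimal sketch using only the standard library. The helper names `installed_versions` and `versions_in_sync` are hypothetical, and treating "same major.minor" as the sync criterion is an assumption, not a documented rule of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the description above; extend the tuple as needed.
PACKAGES = ("azureml-automl-core", "azureml-train-automl")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def versions_in_sync(versions):
    """True when all installed packages share the same major.minor prefix.

    Assumption: matching major.minor is a reasonable proxy for "in sync".
    Packages that are not installed (value None) are ignored.
    """
    prefixes = {tuple(v.split(".")[:2]) for v in versions.values() if v is not None}
    return len(prefixes) <= 1


if __name__ == "__main__":
    versions = installed_versions()
    for name, ver in versions.items():
        print(f"{name}: {ver or 'not installed'}")
    if not versions_in_sync(versions):
        print("Warning: AutoML packages are on different major.minor releases.")
```

Running the script in the environment that hosts the SDK prints one line per package and warns only when the installed releases diverge.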

Quickstart

The `azureml-automl-core` package is primarily an internal dependency of the Azure ML SDK. End users typically reach Azure ML's Automated ML capabilities through the `azureml.train.automl` module, which uses this core package behind the scenes. This quickstart sets up a typical AutoML classification experiment with the high-level `azureml.train.automl` API: it connects to a workspace, provisions compute, registers data, and configures an AutoML run. The `submit` call is commented out to avoid accidental resource usage. Note that the compute-provisioning and dataset-registration steps still execute against the workspace.

import os
from azureml.core import Workspace, Experiment
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig

# NOTE: azureml-automl-core is an internal dependency. 
# End-users typically interact with AutoML via azureml.train.automl.
# This quickstart demonstrates the standard way to run an AutoML experiment.

# Authenticate and get workspace
# Assumes 'config.json' in current directory or Azure CLI login (az login)
ws = Workspace.from_config()
print(f"Workspace name: {ws.name}")

# Create a compute target (or reuse an existing one)
from azureml.core.compute_target import ComputeTargetException

compute_name = os.environ.get('AML_COMPUTE_CLUSTER_NAME', 'cpu-cluster')  # Example name
try:
    compute_target = ComputeTarget(workspace=ws, name=compute_name)
    print(f'Found existing compute target: {compute_name}')
except ComputeTargetException:
    # Raised when no compute target with this name exists in the workspace
    print(f'Creating a new compute target: {compute_name}...')
    config = AmlCompute.provisioning_configuration(
        vm_size='STANDARD_DS3_V2',
        max_nodes=4,
        idle_seconds_before_scaledown=1800  # Scale down after 30 minutes idle
    )
    compute_target = ComputeTarget.create(ws, compute_name, config)
    compute_target.wait_for_completion(show_output=True)

# Register a dataset (using public sample data for demonstration)
# Replace with your actual data source or an already registered dataset
# Example: data = Dataset.Tabular.from_delimited_files(path='https://...
# If you have a datastore and a path to a file:
# default_datastore = ws.get_default_datastore()
# training_data = Dataset.Tabular.from_delimited_files(path=[(default_datastore, 'path/to/your/data.csv')])

# The sample file has no header row, so header=False keeps the first record as data.
# Columns are then expected to be auto-named Column1..Column5; confirm with
# training_data.take(1).to_pandas_dataframe() before relying on the label name.
print("Note: This quickstart registers a public sample dataset for demonstration purposes.")
print("Replace it with your own data registration for a real run.")
training_data = Dataset.Tabular.from_delimited_files(
    path=['https://archive.ics.uci.edu/ml/machine-learning-databases/00267/data_banknote_authentication.txt'],
    header=False
)
training_data = training_data.register(workspace=ws, name='dummy_banknote_data', description='Dummy Banknote Data', create_new_version=True)

# Configure AutoML run
automl_config = AutoMLConfig(
    task='classification',
    primary_metric='accuracy',
    experiment_timeout_hours=0.25,  # Max experiment time: 0.25 hours (15 minutes)
    training_data=training_data,
    label_column_name='Column5',  # Fifth column holds the class label (assumes auto-generated names)
    compute_target=compute_target,
    n_cross_validations=2,
    max_concurrent_iterations=2,
    max_cores_per_iteration=-1,  # Use all available cores
    enable_early_stopping=True,
    featurization='auto',
    debug_log='automl_errors.log'
)

# Create and submit experiment (commented out to prevent accidental billing)
# experiment_name = 'automl-quickstart-exp'
# experiment = Experiment(ws, experiment_name)
# remote_run = experiment.submit(automl_config, show_output=True)
# remote_run.wait_for_completion(show_output=True)
# print('AutoML run completed.')
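
Once a submitted run has finished, the usual next step is to look at the best model it found. The sketch below wraps that step in a hypothetical helper, `summarize_best_run`. It assumes the run object returned by `experiment.submit` exposes `get_output()` returning a `(best_run, fitted_model)` pair, and that the best run exposes `id` and `get_metrics()`, as AutoML runs do in the v1 SDK. The default metric name mirrors the `primary_metric` used in the configuration above.

```python
def summarize_best_run(automl_run, metric_name="accuracy"):
    """Return (best_run_id, metric_value, fitted_model) for a finished AutoML run.

    Assumes `automl_run.get_output()` yields (best_run, fitted_model) and that
    `best_run.get_metrics()` returns a dict keyed by metric name.
    The metric value is None if the metric was not logged.
    """
    best_run, fitted_model = automl_run.get_output()
    metrics = best_run.get_metrics()
    return best_run.id, metrics.get(metric_name), fitted_model
```

With the submit lines uncommented, passing the returned run object to `summarize_best_run` gives the ID of the best child run, its score, and the fitted model in one call.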
