Azure ML AutoML Core Library
This package contains the non-ML, non-Azure specific common code associated with running AutoML experiments within Azure Machine Learning. It serves as a foundational dependency for higher-level AutoML packages like `azureml-train-automl`, rather than being directly used by most end-users. It is part of the broader Azure ML SDK ecosystem, which typically has a monthly or bi-monthly release cadence, keeping sub-packages in sync.
Common errors
- ERROR: Could not find a version that satisfies the requirement azureml-automl-core==X.Y.Z (from versions: ...)
  cause The specified version of `azureml-automl-core` either does not exist for your Python version, or there's an underlying dependency conflict preventing `pip` from finding a compatible version set.
  fix Verify your Python version is within the supported range (`>=3.8, <3.12`). If installing directly, consider installing the umbrella `azureml-sdk[automl]` package instead: `pip install azureml-sdk[automl]` to let `pip` resolve compatible dependencies.
- TypeError: 'numpy.random._generator.Generator' object is not callable
  cause This error frequently arises from an incompatibility between the installed `numpy` version and other libraries (especially `scikit-learn` or `azureml` components) that expect an older `numpy.random.RandomState` interface.
  fix This specific `TypeError` is often resolved by either downgrading `numpy` (e.g., `pip install numpy==1.23.5`) or ensuring all `azureml` packages are at their latest compatible versions via `pip install --upgrade azureml-sdk[automl]` in a clean virtual environment.
- from azureml_automl_core.some_module import SomeClass
  ModuleNotFoundError: No module named 'azureml_automl_core.some_module'
  cause `azureml-automl-core` is predominantly an internal dependency, and its sub-modules are not part of the public API. Direct imports from it are not expected for end-users.
  fix Avoid importing directly from `azureml_automl_core`. Instead, utilize the public API exposed through `azureml.core` and `azureml.train.automl`. For example, use `from azureml.train.automl import AutoMLConfig`.
- ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to a partial uninstall. (or similar errors with other system packages)
  cause A common dependency conflict when installing or upgrading `azureml-sdk` components, especially on environments with system-installed packages (e.g., `PyYAML` in base Python environments on some Linux distros or with Anaconda).
  fix Always use a virtual environment (`venv` or `conda`) to isolate installations and prevent conflicts with system packages. If in a virtual environment and this occurs, try `pip install --ignore-installed <package-name-causing-conflict>` or in extreme cases `pip install --force-reinstall --no-deps azureml-sdk[automl]` (use with caution). Best practice is a fresh `venv`.
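Because the first error in this list often comes down to the interpreter version, it can help to check the running Python against the range stated here (`>=3.8, <3.12`) before installing. The following is a minimal sketch using only the standard library; the helper name `is_supported_python` is hypothetical and the bounds are copied from this page, not read from the package metadata.

```python
import sys

# Supported range as stated in this document: >=3.8, <3.12.
MIN_SUPPORTED = (3, 8)
MAX_EXCLUSIVE = (3, 12)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version lies in [3.8, 3.12)."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside >=3.8, <3.12; pip may find no matching release.")
```

Running this inside the target virtual environment, rather than the system interpreter, checks the Python that `pip` will actually install into.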
Warnings
- gotcha Direct imports from `azureml_automl_core` are uncommon and generally not recommended for end-users. It contains internal components that are subject to change without notice. The public API for AutoML is exposed through `azureml.train.automl`.
- breaking Strict Python version requirements. This package (and the broader Azure ML SDK) requires Python versions `>=3.8, <3.12`. On an unsupported Python version, `pip` typically fails with `Could not find a version that satisfies the requirement`, and packages that do install may fail later with `ModuleNotFoundError` or other runtime errors.
- gotcha Dependency conflicts are extremely common within the `azureml-sdk` ecosystem due to strict version pinning across sub-packages. Installing `azureml-automl-core` alongside other data science libraries (e.g., specific versions of `scikit-learn`, `pandas`, `numpy`) can cause `pip` to report dependency resolution failures or version conflicts.
- deprecated Azure ML SDK components, including AutoML features, undergo regular updates. Some functionalities, classes, or parameters in older versions might be deprecated or removed in newer releases, leading to `DeprecationWarning` or `AttributeError`.
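Since several of these warnings trace back to mismatched sub-package versions, one way to spot a problem is to list the installed `azureml-*` versions and flag any that sit on a different release line. The sketch below assumes that sub-packages released together share the same major.minor version; that is an assumption for illustration, not a documented guarantee. The helper names and the package list are hypothetical choices.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def release_line(version):
    """Reduce a version string such as '1.55.0.post1' to '1.55'."""
    return ".".join(version.split(".")[:2])


def find_mismatches(versions):
    """Return installed packages whose release line differs from the most common one."""
    present = {name: ver for name, ver in versions.items() if ver is not None}
    if not present:
        return {}
    counts = {}
    for ver in present.values():
        line = release_line(ver)
        counts[line] = counts.get(line, 0) + 1
    expected = max(counts, key=counts.get)
    return {name: ver for name, ver in present.items() if release_line(ver) != expected}


if __name__ == "__main__":
    # Package names chosen for illustration; extend the list as needed.
    names = ["azureml-core", "azureml-train-automl", "azureml-automl-core"]
    found = installed_versions(names)
    print("Installed:", found)
    print("Possibly out of sync:", find_mismatches(found))
```

An empty result only means the listed packages agree with each other; it does not prove they are compatible with other libraries such as `numpy` or `scikit-learn`.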
Install
- pip install azureml-automl-core
- pip install azureml-sdk[automl]
Quickstart
import os

from azureml.core import Workspace, Experiment
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException
from azureml.core.dataset import Dataset
from azureml.train.automl import AutoMLConfig

# NOTE: azureml-automl-core is an internal dependency.
# End-users typically interact with AutoML via azureml.train.automl.
# This quickstart demonstrates the standard way to run an AutoML experiment.

# Authenticate and get workspace
# Assumes 'config.json' in current directory or Azure CLI login (az login)
ws = Workspace.from_config()
print(f"Workspace name: {ws.name}")

# Create a compute target (or use an existing one)
compute_name = os.environ.get('AML_COMPUTE_CLUSTER_NAME', 'cpu-cluster')  # Example name
try:
    compute_target = ComputeTarget(workspace=ws, name=compute_name)
    print(f'Found existing compute target: {compute_name}')
except ComputeTargetException:
    # Raised when no compute target with this name exists in the workspace
    print(f'Creating a new compute target: {compute_name}...')
    config = AmlCompute.provisioning_configuration(
        vm_size='STANDARD_DS3_V2',
        max_nodes=4,
        idle_seconds_before_scaledown=1800  # Scale down after 30 mins idle
    )
    compute_target = ComputeTarget.create(ws, compute_name, config)
    compute_target.wait_for_completion(show_output=True)

# Register a dataset (using sample data for demonstration)
# Replace with your actual data source or an already registered dataset
# Example: data = Dataset.Tabular.from_delimited_files(path='https://...
# If you have a datastore and path to a file:
# default_datastore = ws.get_default_datastore()
# training_data = Dataset.Tabular.from_delimited_files(path=[(default_datastore, 'path/to/your/data.csv')])
# This example reads a public sample file instead of your own data:
print("Note: This quickstart uses a public sample dataset for demonstration purposes.")
print("Replace it with your actual data registration for a real run.")
training_data = Dataset.Tabular.from_delimited_files(path=['https://archive.ics.uci.edu/ml/machine-learning-databases/00267/data_banknote_authentication.txt'])
training_data = training_data.register(workspace=ws, name='dummy_banknote_data', description='Dummy Banknote Data', create_new_version=True)

# Configure AutoML run
automl_config = AutoMLConfig(
    task='classification',
    primary_metric='accuracy',
    experiment_timeout_minutes=15,  # Max time in minutes for the experiment
    training_data=training_data,
    label_column_name='4',  # Assumes the label column is named '4'; verify against the dataset's actual column names
    compute_target=compute_target,
    n_cross_validations=2,
    max_concurrent_iterations=2,
    max_cores_per_iteration=-1,  # Use all available cores
    enable_early_stopping=True,
    featurization='auto',
    debug_log='automl_errors.log'
)

# Create and submit experiment (commented out to prevent accidental billing)
# experiment_name = 'automl-quickstart-exp'
# experiment = Experiment(ws, experiment_name)
# remote_run = experiment.submit(automl_config, show_output=False)
# remote_run.wait_for_completion(show_output=True)  # Blocks and streams logs until the run finishes
# print('AutoML run completed.')