Azure ML Train Core
The `azureml-train-core` package provides core training functionality for the Azure Machine Learning Python SDK. It underpins estimators and run configurations, which are used to submit training jobs to Azure ML workspaces. The current version is 1.62.0. The package is part of the actively maintained Azure ML SDK, which typically sees monthly or bi-monthly releases.
Warnings
- deprecated The `Estimator` class, while still functional, is largely deprecated for new training scenarios in favor of `ScriptRunConfig` combined with `Environment` objects.
- gotcha `azureml-train-core` is primarily an internal component of the Azure ML SDK. End users typically reach its capabilities through higher-level abstractions imported from `azureml.core` (e.g., `Workspace`, `ScriptRunConfig`, `Environment`) rather than through direct imports from the `azureml.train` namespace.
- gotcha Environment management (Conda, Docker) is a common source of errors. Mismatches between local and remote environments, or incorrect `CondaDependencies` specifications, can lead to failed runs or unexpected behavior.
- gotcha The Azure ML SDK components, including `azureml-train-core`, often have strict Python version requirements. Using an unsupported Python version can lead to installation failures or runtime errors.
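One way to reduce the local/remote environment mismatches described above is to pin every dependency in a single conda specification and reuse it in both places. The following is a minimal sketch, not part of the SDK: `build_conda_spec` is a hypothetical helper, and the Python and scikit-learn versions are illustrative placeholders (only the `azureml-train-core` version comes from this page).

```python
from typing import Dict, List


def build_conda_spec(name: str, python_version: str,
                     conda_packages: Dict[str, str],
                     pip_packages: Dict[str, str]) -> str:
    """Render a conda environment YAML string with every package pinned."""
    lines: List[str] = [f"name: {name}", "dependencies:"]
    lines.append(f"  - python={python_version}")
    # Conda pins use a single '=' between package and version.
    for pkg, version in sorted(conda_packages.items()):
        lines.append(f"  - {pkg}={version}")
    # pip itself must be listed before the nested pip section.
    lines.append("  - pip")
    lines.append("  - pip:")
    # Pip pins use '==' between package and version.
    for pkg, version in sorted(pip_packages.items()):
        lines.append(f"      - {pkg}=={version}")
    return "\n".join(lines) + "\n"


spec = build_conda_spec(
    name="my-training-env",
    python_version="3.8",                          # ASSUMPTION: placeholder version
    conda_packages={"scikit-learn": "1.0.2"},      # ASSUMPTION: placeholder pin
    pip_packages={"azureml-train-core": "1.62.0"},  # version stated on this page
)
print(spec)
```

The resulting text can be saved to a file and loaded with `Environment.from_conda_specification(name=..., file_path=...)`, so the same pinned file can also be used to create the local conda environment.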
Install
pip install azureml-train-core
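Because the SDK components can have strict Python version requirements, it may help to check the interpreter before installing. The sketch below uses only the standard library; `python_supported` is a hypothetical helper, and the bounds are placeholder assumptions that should be replaced with the range published for the release being installed.

```python
import sys
from typing import Tuple


def python_supported(current: Tuple[int, int],
                     minimum: Tuple[int, int],
                     maximum: Tuple[int, int]) -> bool:
    """Return True if `current` (major, minor) lies within [minimum, maximum]."""
    return minimum <= current <= maximum


# ASSUMPTION: placeholder bounds, not taken from the package metadata.
ASSUMED_MIN = (3, 8)
ASSUMED_MAX = (3, 11)

current = sys.version_info[:2]
label = f"Python {current[0]}.{current[1]}"
if python_supported(current, ASSUMED_MIN, ASSUMED_MAX):
    print(f"{label} is inside the assumed supported range.")
else:
    print(f"{label} is outside the assumed supported range; installation may fail.")
```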
Imports
- Workspace
# Wrong (no such module):
# from azureml_train_core.core import Workspace
# Correct:
from azureml.core import Workspace
- ScriptRunConfig
# Wrong:
# from azureml.train.runconfig import ScriptRunConfig
# Correct:
from azureml.core import ScriptRunConfig
- Environment
from azureml.core import Environment
- Estimator
from azureml.train.estimator import Estimator
Quickstart
import os
from azureml.core import Workspace, ScriptRunConfig, Environment
from azureml.core.conda_dependencies import CondaDependencies
# NOTE: Replace with your actual workspace details or ensure environment variables are set
subscription_id = os.environ.get("AZURE_SUBSCRIPTION_ID", "your_subscription_id")
resource_group = os.environ.get("AZURE_RESOURCE_GROUP", "your_resource_group")
workspace_name = os.environ.get("AZURE_WORKSPACE_NAME", "your_workspace_name")
try:
    ws = Workspace.get(name=workspace_name, subscription_id=subscription_id, resource_group=resource_group)
    print(f"Found workspace {ws.name} at {ws.get_details()['location']}")
except Exception:
    print("Could not connect to workspace. Ensure AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, AZURE_WORKSPACE_NAME env vars are set or replace placeholders.")
    # For a real run, you'd create a workspace if not found
    # ws = Workspace.create(name=workspace_name, subscription_id=subscription_id, resource_group=resource_group, location='eastus')
# Example: Define an environment
env = Environment('my-training-env')
c_deps = CondaDependencies()
c_deps.add_conda_package('scikit-learn')
c_deps.add_pip_package('azureml-sdk')
env.python.conda_dependencies = c_deps
# Create a dummy training script (e.g., train.py)
# with open('train.py', 'w') as f:
# f.write("""
# import os
# print('Hello from Azure ML training!')
# print(f'Run ID: {os.environ.get("AZUREML_RUN_ID", "unknown")}')
# """)
# Create a ScriptRunConfig (modern way to submit training)
# src = ScriptRunConfig(source_directory='./',
#                       script='train.py',
#                       environment=env,
#                       compute_target='cpu-cluster')  # Replace with actual compute target
# Submit the run through an Experiment (uncomment for actual execution)
# from azureml.core import Experiment
# if 'ws' in locals():
#     experiment = Experiment(workspace=ws, name='train-core-quickstart')  # hypothetical experiment name
#     run = experiment.submit(src)
#     run.wait_for_completion(show_output=True)
#     print(f"Run finished with status: {run.get_status()}")
# else:
#     print("Workspace not initialized, skipping run submission example.")
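The quickstart falls back to placeholder strings when the three workspace environment variables are unset, which only surfaces as a connection failure later. A small pre-check can report the missing names up front. This is a sketch using only the standard library; `missing_workspace_vars` is a hypothetical helper, not an SDK function.

```python
import os
from typing import List, Mapping

# Variable names read by the quickstart above.
REQUIRED_VARS = (
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_WORKSPACE_NAME",
)


def missing_workspace_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_workspace_vars(os.environ)
if missing:
    print("Set these variables before calling Workspace.get: " + ", ".join(missing))
else:
    print("All workspace variables are set.")
```

Passing the mapping in as an argument, rather than reading `os.environ` inside the function, keeps the check easy to exercise with a plain dictionary.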