SageMaker MLOps
The `sagemaker-mlops` library provides modular, reusable components for building MLOps pipelines on Amazon SageMaker. It simplifies the orchestration of machine learning workflows, including model building, training, evaluation, and deployment. The current version is 1.7.1, and it receives regular updates in line with SageMaker SDK and AWS service evolution.
Warnings
- gotcha Insufficient AWS IAM permissions are a common source of errors. The SageMaker execution role used by MLOps pipelines needs permissions for SageMaker, S3 (read/write to specified buckets), ECR (pulling images), and potentially other services like KMS, CloudWatch, or Step Functions depending on the pipeline complexity.
- breaking `sagemaker-mlops` depends on specific `sagemaker` SDK versions, typically a recent one (e.g., `>=2.176.0`). Installing an older or incompatible `sagemaker` release can cause runtime errors or unexpected behavior due to API changes.
- gotcha AWS region and S3 bucket consistency is crucial. All SageMaker resources (pipelines, training jobs, models) and S3 buckets used for input/output data or artifacts must reside in the same AWS region. S3 bucket names also need to be globally unique.
- gotcha The `get_execution_role()` utility function is designed to work within a SageMaker execution environment (e.g., SageMaker Notebook Instances or Studio). When a script runs locally or otherwise outside SageMaker, the call may fail; in that case, pass the role ARN explicitly.
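The region warning above can be checked before a pipeline runs. The following is a minimal sketch and is not part of `sagemaker-mlops`: `normalize_location` and `find_region_mismatches` are hypothetical helper names, and the only real API assumed is the boto3 S3 client's `get_bucket_location`, which reports `us-east-1` as a null `LocationConstraint`.

```python
def normalize_location(location_constraint):
    """Map the LocationConstraint value returned by S3 to a region name."""
    if location_constraint is None:
        # S3 reports buckets in us-east-1 with a null LocationConstraint.
        return "us-east-1"
    if location_constraint == "EU":
        # Legacy alias that S3 may still return for eu-west-1.
        return "eu-west-1"
    return location_constraint


def find_region_mismatches(s3_client, bucket_names, expected_region):
    """Return {bucket: actual_region} for buckets outside expected_region.

    s3_client is any object exposing get_bucket_location(Bucket=...),
    for example boto3.client("s3").
    """
    mismatches = {}
    for name in bucket_names:
        response = s3_client.get_bucket_location(Bucket=name)
        region = normalize_location(response.get("LocationConstraint"))
        if region != expected_region:
            mismatches[name] = region
    return mismatches
```

With real credentials, pass `boto3.client("s3")` as `s3_client`. An empty result means every listed bucket is in the expected region.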
Install
pip install sagemaker-mlops
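After installing, it can help to confirm that the `sagemaker` SDK in the environment meets the minimum the package expects. This is a rough sketch using only the standard library. The `(2, 176, 0)` floor is taken from the example constraint in the warnings and is an assumption; check the package metadata for the real requirement. `parse_release`, `meets_minimum`, and `check_sagemaker_sdk` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed floor, copied from the ">=2.176.0" example in the warnings.
MIN_SAGEMAKER = (2, 176, 0)


def parse_release(version_string):
    """Turn '2.176.0' or '2.200.0rc1' into a 3-tuple of ints."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '3.0' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def meets_minimum(installed, minimum=MIN_SAGEMAKER):
    """Return True if the installed version is at least the minimum."""
    return parse_release(installed) >= minimum


def check_sagemaker_sdk():
    """Describe whether the installed sagemaker SDK meets the assumed floor."""
    try:
        installed = version("sagemaker")
    except PackageNotFoundError:
        return "sagemaker is not installed"
    status = "OK" if meets_minimum(installed) else "too old"
    return f"sagemaker {installed}: {status}"
```

Calling `print(check_sagemaker_sdk())` right after installation reports the result without importing `sagemaker` itself.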
Imports
- ModelBuild
from sagemaker_mlops.model_build import ModelBuild
- MLFlowPipeline
from sagemaker_mlops.pipelines import MLFlowPipeline
- get_execution_role
from sagemaker_mlops.utils import get_execution_role
Quickstart
import os

import boto3
import sagemaker
from sagemaker.inputs import TrainingInput
from sagemaker_mlops.model_build import ModelBuild
from sagemaker_mlops.utils import get_execution_role

# Configure AWS environment (replace with your actual values or env vars)
aws_region = os.environ.get("AWS_REGION", "us-east-1")
aws_account_id = os.environ.get("AWS_ACCOUNT_ID", "123456789012")  # Placeholder
sagemaker_execution_role_arn = os.environ.get(
    "SAGEMAKER_ROLE_ARN",
    f"arn:aws:iam::{aws_account_id}:role/service-role/AmazonSageMaker-ExecutionRole-20231201T123456",
)  # Ensure this role has SageMaker, S3, ECR permissions

# Initialize SageMaker session, pinned to the configured region
try:
    sagemaker_session = sagemaker.Session(
        boto_session=boto3.Session(region_name=aws_region),
        default_bucket=f"sagemaker-mlops-quickstart-{aws_account_id}-{aws_region}",  # Unique bucket name
    )
except Exception as e:
    print(f"Warning: Could not create SageMaker session directly, possibly due to missing credentials. Error: {e}")

    # Fallback for demonstration if not in an AWS environment
    class MockSageMakerSession:
        default_bucket_prefix = "mock-prefix"  # an attribute on the real Session

        def default_bucket(self):
            return "mock-sagemaker-bucket"

        def upload_data(self, *args, **kwargs):
            pass

    sagemaker_session = MockSageMakerSession()

# Get SageMaker execution role (prioritize env var or default for local testing)
try:
    # This function works best within a SageMaker Notebook or Studio environment
    role = get_execution_role(sagemaker_session)
except Exception as e:
    # Outside SageMaker the lookup can fail in several ways (no credentials,
    # caller is not a role, mock session), so fall back to the explicit ARN.
    print(f"Could not retrieve SageMaker execution role from session ({e}). Using provided ARN.")
    role = sagemaker_execution_role_arn
    if "123456789012" in role:
        print("WARNING: Using placeholder SageMaker execution role ARN. Please update 'SAGEMAKER_ROLE_ARN' env var.")

# Define an example ModelBuild component for a SageMaker Pipeline
# This component encapsulates a SageMaker Estimator configuration
model_build = ModelBuild(
    sagemaker_session=sagemaker_session,
    role=role,
    base_job_name="my-training-job",
    instance_type="ml.m5.xlarge",
    instance_count=1,
    # Look up the managed PyTorch training image for this region
    image_uri=sagemaker.image_uris.retrieve(
        framework="pytorch",
        region=aws_region,
        version="1.13.1",
        py_version="py39",
        instance_type="ml.m5.xlarge",
        image_scope="training",
    ),
    hyperparameters={
        "epochs": 10,
        "batch_size": 32,
    },
    input_data_config=[
        TrainingInput(
            s3_data=f"s3://{sagemaker_session.default_bucket()}/data/train/",
            content_type="text/csv",
            s3_data_type="S3Prefix",
        )
    ],
    output_data_config={
        "s3_output_location": f"s3://{sagemaker_session.default_bucket()}/output/"
    },
    metrics_definitions=[
        {"Name": "train:loss", "Regex": ".*loss=([0-9\\.]+).*"},
    ],
)

print("Successfully instantiated ModelBuild component:")
print(f"- Role: {model_build.role}")
print(f"- Instance Type: {model_build.instance_type}")
print(f"- Image URI: {model_build.image_uri}")
print(f"- Hyperparameters: {model_build.hyperparameters}")
print("\nThis 'model_build' object can now be used as a step within a SageMaker Pipeline.")
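The `train:loss` regex in the quickstart can be tried against sample log lines before launching a job. The sketch below assumes that `metrics_definitions` behaves like the SageMaker SDK's `metric_definitions`, where the regex is applied to training log output and the first capture group becomes the metric value. It uses Python's `re` module, which may differ in small ways from the engine SageMaker uses, and `extract_metric` is a hypothetical helper name.

```python
import re

# Same pattern as the quickstart's "train:loss" metric definition.
LOSS_PATTERN = ".*loss=([0-9\\.]+).*"


def extract_metric(pattern, log_line):
    """Return the first capture group as a float, or None if nothing usable matched."""
    match = re.search(pattern, log_line)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        # The character class also accepts stray dots, e.g. "0.45." at the
        # end of a sentence, which is not a valid number.
        return None


print(extract_metric(LOSS_PATTERN, "epoch=3 loss=0.4521 acc=0.88"))
print(extract_metric(LOSS_PATTERN, "loss=0.40 val_loss=0.55"))
```

The second call shows a limitation of this pattern: because of the leading greedy `.*`, it captures the value after the last `loss=` on the line, which here belongs to `val_loss`. Anchoring the pattern more tightly (for example with a preceding space or a distinct log prefix) avoids mixing the two metrics.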