SageMaker MLOps

1.7.1 · active · verified Sat Apr 11

The `sagemaker-mlops` library provides modular, reusable components for building MLOps pipelines on Amazon SageMaker. It simplifies the orchestration of machine learning workflows, including model building, training, evaluation, and deployment. The current version is 1.7.1; releases track changes in the SageMaker SDK and the underlying AWS services.

Warnings

Install

Imports

Quickstart

This quickstart instantiates a `ModelBuild` component, which defines the configuration for a SageMaker training job and is typically used as a step within a larger SageMaker Pipeline. The script first creates a SageMaker session and resolves an execution role, falling back to a mock session and an environment-supplied role ARN so that it can be exercised outside a SageMaker environment. The AWS account ID and role ARN are placeholders; replace them with values from your own account before running anything against AWS.

import os

import boto3
import sagemaker
from sagemaker_mlops.model_build import ModelBuild
from sagemaker_mlops.utils import get_execution_role

# Configure AWS environment (replace with your actual values or env vars)
aws_region = os.environ.get("AWS_REGION", "us-east-1")
aws_account_id = os.environ.get("AWS_ACCOUNT_ID", "123456789012")  # Placeholder
sagemaker_execution_role_arn = os.environ.get(
    "SAGEMAKER_ROLE_ARN",
    f"arn:aws:iam::{aws_account_id}:role/service-role/AmazonSageMaker-ExecutionRole-20231201T123456",
)  # Ensure this role has SageMaker, S3, ECR permissions

# Initialize SageMaker session
try:
    # sagemaker.Session takes a boto3 session; the region is set on that session
    boto_session = boto3.Session(region_name=aws_region)
    sagemaker_session = sagemaker.Session(
        boto_session=boto_session,
        default_bucket=f"sagemaker-mlops-quickstart-{aws_account_id}-{aws_region}",  # Bucket names must be globally unique
    )
except Exception as e:
    print(f"Warning: Could not create SageMaker session directly, possibly due to missing credentials. Error: {e}")
    # Fallback for demonstration if not in an AWS environment
    class MockSageMakerSession:
        def default_bucket(self): return "mock-sagemaker-bucket"
        def default_bucket_prefix(self): return "mock-prefix"
        def upload_data(self, *args, **kwargs): pass
    sagemaker_session = MockSageMakerSession()


# Get SageMaker execution role (prioritize env var or default for local testing)
try:
    # This function works best within a SageMaker Notebook or Studio environment
    role = get_execution_role(sagemaker_session)
except ValueError:
    print("Could not retrieve SageMaker execution role from session. Using provided ARN.")
    role = sagemaker_execution_role_arn
    if "123456789012" in role:
        print("WARNING: Using placeholder SageMaker execution role ARN. Please update 'SAGEMAKER_ROLE_ARN' env var.")

# Define an example ModelBuild component for a SageMaker Pipeline
# This component encapsulates a SageMaker Estimator configuration
model_build = ModelBuild(
    sagemaker_session=sagemaker_session,
    role=role,
    base_job_name="my-training-job",
    instance_type="ml.m5.xlarge",
    instance_count=1,
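    # Look up the framework training image with sagemaker.image_uris.retrieve
    image_uri=sagemaker.image_uris.retrieve(
        framework="pytorch",
        region=aws_region,
        version="1.13.1",
        py_version="py39",
        instance_type="ml.m5.xlarge",
        image_scope="training",
    ),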
    hyperparameters={
        "epochs": 10,
        "batch_size": 32,
    },
    input_data_config=[
        sagemaker.TrainingInput(
            s3_data=f"s3://{sagemaker_session.default_bucket()}/data/train/",
            content_type="text/csv",
            s3_data_type="S3Prefix"
        )
    ],
    output_data_config={
        "s3_output_location": f"s3://{sagemaker_session.default_bucket()}/output/"
    },
    metrics_definitions=[
        {"Name": "train:loss", "Regex": ".*loss=([0-9\\.]+).*"},
    ]
)

print("Successfully instantiated ModelBuild component:")
print(f"- Role: {model_build.role}")
print(f"- Instance Type: {model_build.instance_type}")
print(f"- Image URI: {model_build.image_uri}")
print(f"- Hyperparameters: {model_build.hyperparameters}")
print("\nThis 'model_build' object can now be used as a step within a SageMaker Pipeline.")
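The `metrics_definitions` entry in the quickstart pairs a metric name with a regular expression whose single capture group holds the metric value. The sketch below applies that same regex locally to a few log lines, which can help check a pattern before submitting a training job. The helper `extract_metrics` and the sample log lines are hypothetical and are not part of `sagemaker-mlops` or the SageMaker SDK; the actual matching performed by the service may differ in detail.

```python
import re

# Same definition as in the quickstart; the regex needs one capture group.
METRIC_DEFINITIONS = [
    {"Name": "train:loss", "Regex": r".*loss=([0-9\.]+).*"},
]


def extract_metrics(definitions, log_lines):
    """Return {metric name: [captured float values]} for the given log lines."""
    compiled = [(d["Name"], re.compile(d["Regex"])) for d in definitions]
    results = {name: [] for name, _ in compiled}
    for line in log_lines:
        for name, pattern in compiled:
            match = pattern.search(line)
            if match:
                # group(1) is the text captured by the parentheses in the regex
                results[name].append(float(match.group(1)))
    return results


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOGS = [
    "epoch=1 loss=0.9134 lr=0.001",
    "saving checkpoint",
    "epoch=2 loss=0.4821 lr=0.001",
]

if __name__ == "__main__":
    print(extract_metrics(METRIC_DEFINITIONS, SAMPLE_LOGS))
```

Because the pattern begins with a greedy `.*`, a line that contains several `loss=` fields yields the last one. The character class `[0-9\.]+` also accepts a trailing period, so a log line ending in `loss=0.48.` would capture a string that `float` rejects; a stricter pattern may be preferable for such logs.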

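The environment-variable fallback used in the quickstart can be isolated and checked without AWS access. The following is a minimal sketch, assuming the same variable names and placeholder values as the quickstart; `resolve_settings` is a hypothetical helper, not part of `sagemaker-mlops`, and the bucket-name check is a simplified version of the S3 naming rules (length and allowed characters only).

```python
import os
import re

PLACEHOLDER_ACCOUNT_ID = "123456789012"  # same placeholder as the quickstart

# Simplified S3 bucket-name check: 3-63 characters, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def resolve_settings(env=None):
    """Resolve region, account ID, role ARN and bucket name from a mapping."""
    env = os.environ if env is None else env
    region = env.get("AWS_REGION", "us-east-1")
    account_id = env.get("AWS_ACCOUNT_ID", PLACEHOLDER_ACCOUNT_ID)
    role_arn = env.get(
        "SAGEMAKER_ROLE_ARN",
        f"arn:aws:iam::{account_id}:role/service-role/"
        "AmazonSageMaker-ExecutionRole-20231201T123456",
    )
    bucket = f"sagemaker-mlops-quickstart-{account_id}-{region}"
    return {
        "region": region,
        "account_id": account_id,
        "role_arn": role_arn,
        "bucket": bucket,
        # True while the role ARN still contains the placeholder account ID
        "uses_placeholder": PLACEHOLDER_ACCOUNT_ID in role_arn,
        "bucket_name_ok": bool(BUCKET_NAME_RE.match(bucket)),
    }


if __name__ == "__main__":
    settings = resolve_settings()
    if settings["uses_placeholder"]:
        print("WARNING: placeholder role ARN in use; set SAGEMAKER_ROLE_ARN.")
    print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test, and checking `uses_placeholder` up front avoids submitting a job with a role ARN that cannot exist in your account.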