SageMaker Studio Python Library
The sagemaker-studio Python library is an open-source library for interacting with Amazon SageMaker Unified Studio resources. It provides a simplified interface for programmatically accessing and managing entities such as domains, projects, connections, and databases. It also includes utility modules for common data operations: SQL execution, DataFrame manipulation, and Spark session management. The current version is 1.1.11. The library is actively developed, and its releases often track changes in the broader SageMaker Unified Studio platform.
Warnings
- gotcha When using `sagemaker-studio` outside of an Amazon SageMaker Unified Studio JupyterLab environment, you must explicitly provide AWS credentials and region. This can be done by storing them in an AWS named profile or by passing a `ClientConfig` object during initialization of library components.
- gotcha For IAM-based domains, the `sagemaker-studio` library is supported only on Space Distribution Image versions 2.11+ or 3.6+, in JupyterLab notebooks or the Code Editor. Older image versions do not support the library.
- gotcha Forgetting to shut down compute instances or applications in SageMaker Studio can lead to unexpected AWS costs. This is a platform-level operational concern, but it directly affects anyone using this library to manage resources.
- gotcha Custom Python modules or local files within your SageMaker Studio environment might require careful path handling. The Studio home folder is typically mapped to `/root` inside the notebook container, so direct relative paths may not work as expected.
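The path-handling warning above can be illustrated with a small sketch. It anchors file paths at the home directory instead of the current working directory, and puts a module directory on `sys.path` so custom modules import reliably. The helper names and the `my_modules` directory are hypothetical, and the home directory is resolved at runtime rather than hard-coded, since it is only typically `/root`.

```python
import sys
from pathlib import Path


def resolve_in_home(relative_path, home=None):
    """Return an absolute path anchored at the home directory."""
    base = Path(home) if home is not None else Path.home()
    return (base / relative_path).resolve()


def add_module_dir(relative_dir, home=None):
    """Put a directory under the home folder on sys.path, at most once."""
    module_dir = str(resolve_in_home(relative_dir, home))
    if module_dir not in sys.path:
        sys.path.insert(0, module_dir)
    return module_dir


# Hypothetical layout: custom modules live in <home>/my_modules.
module_dir = add_module_dir("my_modules")
print(f"Custom modules are importable from: {module_dir}")
```

Building paths from an absolute base means the code behaves the same whether it runs from a notebook, a terminal, or a scheduled job with a different working directory.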
Install
pip install sagemaker-studio
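Because the library is released frequently, it can help to confirm which version is actually installed. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="sagemaker-studio"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("sagemaker-studio is not installed; run: pip install sagemaker-studio")
else:
    print(f"sagemaker-studio {found} is installed")
```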
Imports
- ClientConfig
from sagemaker_studio import ClientConfig
- Domain
from sagemaker_studio import Domain
- Project
from sagemaker_studio import Project
- sqlutils
from sagemaker_studio import sqlutils
- dataframeutils
from sagemaker_studio import dataframeutils
Quickstart
import os
from sagemaker_studio import ClientConfig, Domain, Project
# --- Configure ClientConfig if not in SageMaker Studio JupyterLab ---
# In a SageMaker Studio JupyterLab environment, credentials are automatically pulled.
# Otherwise, provide AWS credentials and region.
# Replace 'us-east-1' and 'default' as needed.
# os.environ.get supplies a fallback value when the variable is unset.
aws_region = os.environ.get('AWS_REGION', 'us-east-1')
aws_profile = os.environ.get('AWS_PROFILE', 'default')
# Instantiate ClientConfig (optional, if running outside of Studio or with specific profile)
try:
    client_config = ClientConfig(region_name=aws_region, profile_name=aws_profile)
except Exception as e:
    print(f"Could not initialize ClientConfig, proceeding assuming environment provides credentials: {e}")
    client_config = None  # Or handle more robustly
# --- Interact with a SageMaker Domain and Project ---
# To interact with a Domain or Project, you typically need its ID.
# Replace 'your-domain-id' and 'your-project-id' with actual values.
# If running in Studio, these might be discoverable from the environment.
domain_id = os.environ.get('SAGEMAKER_DOMAIN_ID', 'd-xxxxxxxxxxxx') # Placeholder
project_id = os.environ.get('SAGEMAKER_PROJECT_ID', 'p-yyyyyyyyyyyy') # Placeholder
print(f"Attempting to interact with Domain ID: {domain_id}")
print(f"Attempting to interact with Project ID: {project_id}")
if client_config:
    # Initialize Domain with client_config
    domain = Domain(id=domain_id, client_config=client_config)
    project = Project(id=project_id, domain_id=domain_id, client_config=client_config)
else:
    # Initialize Domain without client_config (assumes environment provides)
    domain = Domain(id=domain_id)
    project = Project(id=project_id, domain_id=domain_id)
# Accessing some properties (these calls might fail if IDs are invalid or permissions are insufficient)
try:
    print(f"Domain ID: {domain.id}")
    print(f"Project Name: {project.name}")
    print(f"Project S3 Path: {project.s3_path}")
except Exception as e:
    print(f"Error accessing domain/project properties: {e}")
    print("Please ensure the provided Domain/Project IDs are valid and your AWS credentials/permissions are correctly configured.")
# Example of using a utility module (requires an actual connection and query context)
# from sagemaker_studio import sqlutils
# try:
#     result = sqlutils.sql("SELECT 1", connection_name="your-connection-name")
#     print(f"SQL Query Result: {result}")
# except Exception as e:
#     print(f"Error executing SQL query: {e}")
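The quickstart falls back to placeholder IDs when the environment variables are unset, which only surfaces as an error once a property is accessed. A minimal sketch of a fail-fast alternative follows. The helper `require_env` is hypothetical, and the variable names are simply the ones used in the quickstart; whether a given environment sets them is an assumption.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if unset or empty."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


try:
    # Variable names taken from the quickstart; not guaranteed to be set.
    domain_id = require_env("SAGEMAKER_DOMAIN_ID")
    project_id = require_env("SAGEMAKER_PROJECT_ID")
    print(f"Using domain {domain_id} and project {project_id}")
except RuntimeError as err:
    print(f"Missing configuration: {err}")
```

Failing before any `Domain` or `Project` object is built keeps a configuration problem separate from a permissions or invalid-ID problem.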