SageMaker Studio Python Library

1.1.11 verified Tue May 12 auth: no python install: draft

The sagemaker-studio Python library is an open-source tool for interacting with Amazon SageMaker Unified Studio resources. It provides a simplified interface for programmatically accessing and managing entities such as domains, projects, connections, and databases, and it includes utility modules for common data operations: SQL execution, DataFrame manipulation, and Spark session management. The current version is 1.1.11. The library is actively developed, and releases often track changes in the broader SageMaker Unified Studio platform.

pip install sagemaker-studio
error ModuleNotFoundError: No module named 'sagemaker_studio'
cause The package is installed under the hyphenated distribution name (`sagemaker-studio`), but hyphens are not valid in Python identifiers, so the importable module is named with an underscore (`sagemaker_studio`).
fix
Change your import statement to use underscores: `import sagemaker_studio` or `from sagemaker_studio import Project`.
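To tell a missing installation apart from a misspelled import, the distribution name and the module name can be checked separately. The sketch below uses only the standard library; the helper name `check_install` is made up for illustration and is not part of `sagemaker-studio`.

```python
import importlib.util
from importlib import metadata


def check_install(dist_name: str, module_name: str) -> dict:
    """Report whether a distribution is installed and whether its module is importable."""
    try:
        # Looks up the pip-level (distribution) name, e.g. "sagemaker-studio".
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # Looks up the import-level (module) name, e.g. "sagemaker_studio".
    importable = importlib.util.find_spec(module_name) is not None
    return {"installed_version": version, "importable": importable}


if __name__ == "__main__":
    print(check_install("sagemaker-studio", "sagemaker_studio"))
```

If `installed_version` is `None`, the package is not installed in the active interpreter; if it is set but `importable` is false, the import name is the more likely culprit.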
error botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the GetConnection operation: User: arn:aws:sts::... is not authorized to perform: rds-data:GetConnection on resource: arn:aws:rds:...
cause The IAM execution role associated with your SageMaker Studio environment lacks the necessary permissions to access the underlying AWS resource (e.g., RDS, Redshift, Glue) or perform the requested operation.
fix
Attach the required IAM policies (e.g., AmazonRDSDataFullAccess, AWSGlueConsoleFullAccess, AmazonRedshiftDataFullAccess, or more granular policies) to your SageMaker Studio execution role.
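When opting for a more granular policy, the denied action and resource can usually be read straight from the error text. The following is a minimal sketch that assumes the message follows the "is not authorized to perform: ... on resource: ..." wording shown above; the sample message, account number, and ARNs are invented placeholders, and the helper names are hypothetical.

```python
import json
import re
from typing import Optional, Tuple

# Assumption: the message contains "... perform: <action> on resource: <arn>".
_DENIED = re.compile(
    r"is not authorized to perform: (?P<action>\S+) on resource: (?P<resource>\S+)"
)

# Invented sample message; the account ID and ARNs are placeholders.
SAMPLE_MESSAGE = (
    "User: arn:aws:sts::111122223333:assumed-role/ExampleRole/session "
    "is not authorized to perform: glue:GetConnection "
    "on resource: arn:aws:glue:us-east-1:111122223333:connection/example"
)


def parse_access_denied(message: str) -> Optional[Tuple[str, str]]:
    """Return (action, resource) from an access-denied message, or None if no match."""
    match = _DENIED.search(message)
    if match is None:
        return None
    return match.group("action"), match.group("resource")


def policy_statement(action: str, resource: str) -> dict:
    """Build one IAM policy statement that allows only the denied action."""
    return {"Effect": "Allow", "Action": [action], "Resource": [resource]}


if __name__ == "__main__":
    parsed = parse_access_denied(SAMPLE_MESSAGE)
    if parsed is not None:
        print(json.dumps(policy_statement(*parsed), indent=2))
```

The generated statement is only a starting point for review; it still has to be wrapped in a full policy document and attached to the execution role.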
error botocore.exceptions.ClientError: An error occurred (ResourceNotFoundException) when calling the DescribeProject operation: Project my-non-existent-project not found.
cause The specified SageMaker Studio Project or Connection resource does not exist in the current AWS account and region, or your IAM role does not have permission to describe it.
fix
Verify that the name of the Project or Connection is spelled correctly and that it exists in your AWS account and region. Also, ensure your IAM role has sagemaker:DescribeProject and sagemaker:DescribeConnection permissions.
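Both errors above surface as `botocore.exceptions.ClientError`, so code that wants to react differently to each can branch on the error code rather than on the message text. The sketch below relies only on the `response["Error"]["Code"]` field that `ClientError` carries; the hint strings and function names are illustrative assumptions.

```python
from typing import Optional

# Hints keyed by the error codes quoted in the entries above.
HINTS = {
    "AccessDeniedException": "Check the IAM permissions of the execution role.",
    "ResourceNotFoundException": "Check the resource name, AWS account, and region.",
}


def error_code(exc: Exception) -> Optional[str]:
    """Extract the AWS error code from a ClientError-like exception, if present."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return None
    return response.get("Error", {}).get("Code")


def hint_for(exc: Exception) -> str:
    """Map an exception to a troubleshooting hint, with a generic fallback."""
    code = error_code(exc)
    return HINTS.get(code, "Unrecognized error; re-raise it.")
```

In practice this would sit inside an `except ClientError as exc:` block around the library call, with anything unrecognized re-raised rather than swallowed.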
gotcha When using `sagemaker-studio` outside of an Amazon SageMaker Unified Studio JupyterLab environment, you must explicitly provide AWS credentials and region. This can be done by storing them in an AWS named profile or by passing a `ClientConfig` object during initialization of library components.
fix Use environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION), configure an AWS named profile, or initialize `ClientConfig(region_name='...', profile_name='...')` and pass it to object constructors.
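A quick pre-flight check can show which of these settings is missing before any library object is created. This is a rough sketch using only the environment; the function name is made up, and it cannot see a default profile stored in the AWS config files, so an empty environment does not always mean credentials are unavailable.

```python
import os
from typing import List, Mapping


def missing_aws_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return descriptions of credential/region settings not found in the environment."""
    missing = []
    has_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))
    has_profile = bool(env.get("AWS_PROFILE"))
    if not (has_keys or has_profile):
        missing.append("credentials (access key pair or AWS_PROFILE)")
    # AWS_DEFAULT_REGION is also accepted here because boto3 reads that name.
    if not (env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")):
        missing.append("region (AWS_REGION)")
    return missing


if __name__ == "__main__":
    for item in missing_aws_settings():
        print(f"Not set in environment: {item}")
```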
gotcha The `sagemaker-studio` library (specifically for IAM-based domains) is supported only when using Space Distribution Image versions 2.11+ or 3.6+ in JupyterLab Notebooks or the Code Editor. Older image versions will not support the library.
fix Ensure your SageMaker Studio Space is running a compatible Space Distribution Image version (2.11+ or 3.6+). Upgrade your image if necessary via the SageMaker Studio console.
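The "2.11+ or 3.6+" rule is easy to get wrong with string comparison, since "2.9" sorts after "2.11" as text. A small sketch of a numeric check follows; how the image version string is obtained is left out, the function names are hypothetical, and treating major versions above 3 as supported is an assumption not stated above.

```python
from typing import Tuple

# Minimum supported minor version per major line, as stated above.
MINIMUM_MINOR = {2: 11, 3: 6}


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Parse 'MAJOR.MINOR[.PATCH]' into integers; raises ValueError on bad input."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def image_is_supported(version: str) -> bool:
    """Check a Space Distribution Image version against the 2.11+ / 3.6+ rule."""
    major, minor = parse_major_minor(version)
    if major in MINIMUM_MINOR:
        return minor >= MINIMUM_MINOR[major]
    # Assumption: major lines newer than those listed are supported, older ones are not.
    return major > max(MINIMUM_MINOR)
```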
gotcha Forgetting to shut down compute instances or applications within SageMaker Studio can lead to unexpected AWS costs. This is a platform-level operational concern rather than a behavior of the library, but it affects anyone using the library to manage resources.
fix Always shut down inactive JupyterLab applications, kernel sessions, and other compute resources in the SageMaker Studio console when no longer in use. Consider implementing lifecycle configurations for automatic shutdown.
gotcha Custom Python modules or local files within your SageMaker Studio environment might require careful path handling. The Studio home folder is typically mapped to `/root` inside the notebook container, so direct relative paths may not work as expected.
fix When importing local modules, use `sys.path.insert(0, os.path.abspath('path/to/module'))` or ensure your modules are placed in a location discoverable by Python's path, considering the `/root` mapping.
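The `sys.path` approach can be wrapped in a small helper so that the directory is resolved to an absolute path once and is not added twice when a notebook cell is re-run. The helper name is an illustration, not something the library provides.

```python
import os
import sys


def add_to_sys_path(directory: str) -> str:
    """Put a directory at the front of sys.path (once) and return its absolute path."""
    # Expand "~" and make the path absolute so it no longer depends on the
    # notebook's current working directory.
    resolved = os.path.abspath(os.path.expanduser(directory))
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved
```

For example, `add_to_sys_path("~/my_project/src")` followed by a normal `import` of a module in that folder; the folder name here is a placeholder.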
gotcha When interacting with SageMaker Studio resources that are scoped to a project, the `sagemaker-studio` library requires a Project ID to be available. This ID must be provided explicitly or be discoverable within the execution environment. Failure to provide it will result in a `ValueError: Project ID not found in environment`.
fix Ensure the `SAGEMAKER_PROJECT_ID` environment variable is set with the appropriate Project ID. Alternatively, if the library components support it, pass the project ID directly as an argument to constructors, e.g., `Domain(id='your-domain-id', project_id='your-project-id')`.
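Resolving the project ID up front gives a clearer failure than waiting for the library to raise. The sketch below prefers an explicit value and falls back to the `SAGEMAKER_PROJECT_ID` variable named above; the function itself is hypothetical and makes no call into `sagemaker-studio`.

```python
import os
from typing import Mapping, Optional


def resolve_project_id(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the project ID from an explicit argument or the environment."""
    project_id = explicit or env.get("SAGEMAKER_PROJECT_ID")
    if not project_id:
        raise ValueError(
            "No project ID available: pass one explicitly or set SAGEMAKER_PROJECT_ID."
        )
    return project_id
```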
breaking Some Python libraries require compilation of C/C++ extensions during installation. When using minimalist base images (like `alpine`) or custom environments that do not include essential build tools and development headers, these installations will fail. This frequently affects data-related libraries like `duckdb` and `snowflake-connector-python` which depend on native code.
fix Ensure your environment has the necessary build tools and development headers installed. For `alpine`-based images, run `apk add build-base python3-dev cmake` before attempting to install such libraries. For Debian/Ubuntu-based images, use `apt-get install build-essential python3-dev cmake`.
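Before installing, it can help to check whether the build tools are actually on the `PATH` and which C library the interpreter reports. This is a rough standard-library sketch; the tool list is an assumption about what a typical native build needs, and `platform.libc_ver()` may return an empty name on musl-based images.

```python
import platform
import shutil
from typing import Callable, List, Optional, Sequence


def missing_build_tools(
    tools: Sequence[str] = ("gcc", "make", "cmake"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def libc_name() -> str:
    """Return the reported libc name, or a fallback label when none is reported."""
    name, _version = platform.libc_ver()
    return name or "unknown (possibly musl)"


if __name__ == "__main__":
    print(f"libc: {libc_name()}")
    print(f"missing build tools: {missing_build_tools()}")
```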
| python | os / libc | status | wheel | install | import | disk |
|---|---|---|---|---|---|---|
| 3.10 | alpine (musl) | build_error | - | - | - | - |
| 3.10 | alpine (musl) | | - | - | - | - |
| 3.10 | slim (glibc) | | wheel | 37.7s | 2.64s | 727M |
| 3.10 | slim (glibc) | | - | - | 2.50s | 704M |
| 3.11 | alpine (musl) | build_error | - | - | - | - |
| 3.11 | alpine (musl) | | - | - | - | - |
| 3.11 | slim (glibc) | | wheel | 32.7s | 4.03s | 773M |
| 3.11 | slim (glibc) | | - | - | 3.92s | 749M |
| 3.12 | alpine (musl) | build_error | - | - | - | - |
| 3.12 | alpine (musl) | | - | - | - | - |
| 3.12 | slim (glibc) | | wheel | 29.7s | 4.37s | 763M |
| 3.12 | slim (glibc) | | - | - | 4.67s | 739M |
| 3.13 | alpine (musl) | build_error | - | - | - | - |
| 3.13 | alpine (musl) | | - | - | - | - |
| 3.13 | slim (glibc) | | wheel | 28.3s | 4.02s | 762M |
| 3.13 | slim (glibc) | | - | - | 4.33s | 738M |
| 3.9 | alpine (musl) | | wheel | - | 1.06s | 57.2M |
| 3.9 | alpine (musl) | | - | - | 1.04s | 57.0M |
| 3.9 | slim (glibc) | | wheel | 5.6s | 0.93s | 58M |
| 3.9 | slim (glibc) | | - | - | 0.91s | 58M |

This quickstart demonstrates how to initialize `ClientConfig`, `Domain`, and `Project` objects using the `sagemaker-studio` library. Credentials are either picked up automatically inside SageMaker Studio JupyterLab or supplied explicitly through `ClientConfig` with an AWS profile. The code then retrieves basic properties of a SageMaker Domain and Project. Running it successfully requires valid AWS credentials, existing SageMaker Domain and Project IDs, and appropriate IAM permissions.

import os
from sagemaker_studio import ClientConfig, Domain, Project

# --- Configure ClientConfig when running outside SageMaker Studio JupyterLab ---
# Inside a SageMaker Studio JupyterLab environment, credentials are picked up automatically.
# Elsewhere, supply a region and an AWS named profile.
# Both are read from the environment here, with placeholder defaults to replace as needed.
aws_region = os.environ.get('AWS_REGION', 'us-east-1')
aws_profile = os.environ.get('AWS_PROFILE', 'default')

# Instantiate ClientConfig (only needed outside Studio or when using a specific profile).
# Keyword names are as given on this page; confirm them against your installed release
# with help(ClientConfig).
try:
    client_config = ClientConfig(region_name=aws_region, profile_name=aws_profile)
except Exception as e:
    print(f"Could not initialize ClientConfig; assuming the environment provides credentials: {e}")
    client_config = None  # Fall back to environment-provided credentials

# --- Interact with a SageMaker Domain and Project ---
# To interact with a Domain or Project, you typically need its ID.
# Replace 'your-domain-id' and 'your-project-id' with actual values.
# If running in Studio, these might be discoverable from the environment.
domain_id = os.environ.get('SAGEMAKER_DOMAIN_ID', 'd-xxxxxxxxxxxx') # Placeholder
project_id = os.environ.get('SAGEMAKER_PROJECT_ID', 'p-yyyyyyyyyyyy') # Placeholder

print(f"Attempting to interact with Domain ID: {domain_id}")
print(f"Attempting to interact with Project ID: {project_id}")

# Pass client_config only when one was created; otherwise rely on the environment.
config_kwargs = {'client_config': client_config} if client_config else {}
domain = Domain(id=domain_id, **config_kwargs)
project = Project(id=project_id, domain_id=domain_id, **config_kwargs)

# Accessing some properties (these calls might fail if IDs are invalid or permissions are insufficient)
try:
    print(f"Domain ID: {domain.id}")
    print(f"Project Name: {project.name}")
    print(f"Project S3 Path: {project.s3_path}")
except Exception as e:
    print(f"Error accessing domain/project properties: {e}")
    print("Please ensure the provided Domain/Project IDs are valid and your AWS credentials/permissions are correctly configured.")

# Example of using a utility module (requires an actual connection and query context)
# from sagemaker_studio import sqlutils
# try:
#     result = sqlutils.sql("SELECT 1", connection_name="your-connection-name")
#     print(f"SQL Query Result: {result}")
# except Exception as e:
#     print(f"Error executing SQL query: {e}")