{"library":"sagemaker-inference","title":"SageMaker Inference Toolkit","description":"The sagemaker-inference toolkit is an open-source Python library designed to simplify the creation of serving containers for machine learning models on Amazon SageMaker. It provides a model serving stack built on Multi Model Server (MMS), enabling users to easily implement custom inference logic. The current version is 1.10.1, with a regular release cadence addressing bug fixes and new features, including support for newer Python versions and improved dependency management.","language":"python","status":"active","last_verified":"Mon May 18","install":{"commands":["pip install sagemaker-inference"],"cli":null},"imports":["from sagemaker_inference.default_inference_handler import DefaultInferenceHandler","from sagemaker_inference import model_server","from sagemaker_inference.transformer import Transformer","from sagemaker_inference.default_handler_service import DefaultHandlerService"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import os\nimport json\n\nfrom sagemaker_inference.default_inference_handler import DefaultInferenceHandler\nfrom sagemaker_inference import content_types, decoder, encoder\n\nclass CustomInferenceHandler(DefaultInferenceHandler):\n    def default_model_fn(self, model_dir, context=None):\n        \"\"\"Loads a dummy model for demonstration. In a real scenario, this would load\n        your actual trained model from `model_dir`.\n        \"\"\"\n        print(f\"Loading model from: {model_dir}\")\n        # Simulate loading a model artifact\n        # For example, if you had a 'model.pkl' in model_dir\n        # model_path = os.path.join(model_dir, 'model.pkl')\n        # model = joblib.load(model_path)\n        return {\"status\": \"model_loaded\", \"path\": model_dir}\n\n    def default_input_fn(self, input_data, content_type, context=None):\n        \"\"\"Deserializes the input data from the request. 
Supports JSON and CSV.\n        \"\"\"\n        if content_type == content_types.JSON:\n            # Parse with json.loads so a dict payload stays a dict;\n            # decoder.decode would convert the payload to a NumPy array.\n            return json.loads(input_data)\n        elif content_type == content_types.CSV:\n            # Assuming CSV is a simple string for this example\n            return input_data.decode('utf-8').split(',')\n        else:\n            raise ValueError(f\"Unsupported content type: {content_type}\")\n\n    def default_predict_fn(self, data, model, context=None):\n        \"\"\"Makes a dummy prediction based on the input data and the loaded model.\n        \"\"\"\n        print(f\"Performing prediction with model: {model} and data: {data}\")\n        if isinstance(data, dict) and 'instances' in data:\n            # Assume a common inference request format\n            predictions = [item * 2 for item in data['instances']]\n        elif isinstance(data, list):\n            predictions = [item + \"_processed\" for item in data]\n        else:\n            predictions = f\"Processed: {data}\"\n        return {\"predictions\": predictions}\n\n    def default_output_fn(self, prediction, accept, context=None):\n        \"\"\"Serializes the prediction result to the requested accept type.\n        Supports JSON.\n        \"\"\"\n        if accept == content_types.JSON:\n            return encoder.encode(prediction, accept)\n        else:\n            raise ValueError(f\"Unsupported accept type: {accept}\")\n\n# To run this in a SageMaker container, you would have a Dockerfile\n# that installs sagemaker-inference and multi-model-server, copies this file\n# as 'inference.py' and sets up the entrypoint to start the model server.\n# e.g., using sagemaker_inference.model_server.start_model_server()\n\n# Example of how to manually test the handler (not typically run directly in a quickstart)\nif __name__ == '__main__':\n    handler = CustomInferenceHandler()\n    model = handler.default_model_fn('/opt/ml/model') # Simulates model_dir\n\n    test_json_input = 
json.dumps({\"instances\": [1, 2, 3]}).encode('utf-8')\n    json_data = handler.default_input_fn(test_json_input, content_types.JSON)\n    json_prediction = handler.default_predict_fn(json_data, model)\n    json_output = handler.default_output_fn(json_prediction, content_types.JSON)\n    # encoder.encode returns a JSON string here, so no .decode() is needed\n    print(f\"JSON Inference Result: {json_output}\")\n\n    test_csv_input = b'hello,world'\n    csv_data = handler.default_input_fn(test_csv_input, content_types.CSV)\n    csv_prediction = handler.default_predict_fn(csv_data, model)\n    csv_output = handler.default_output_fn(csv_prediction, content_types.JSON) # Output as JSON for simplicity\n    print(f\"CSV Inference Result: {csv_output}\")\n","lang":"python","description":"This quickstart demonstrates the core pattern for using `sagemaker-inference` to create a custom inference handler. It defines a `CustomInferenceHandler` class that extends `DefaultInferenceHandler`, overriding `default_model_fn`, `default_input_fn`, `default_predict_fn`, and `default_output_fn`. These methods are responsible for loading the model, deserializing input, making predictions, and serializing output, respectively. 
This file would typically be part of your model archive within a SageMaker custom container.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-18","installed_version":"1.10.1","pypi_latest":"1.10.1","is_stale":false,"summary":{"python_range":"3.9–3.13","success_rate":100,"avg_install_s":10.7,"avg_import_s":0.98,"wheel_type":"sdist"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.96,"mem_mb":22.7,"disk_size":"265.9M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":10.4,"import_time_s":0.68,"mem_mb":22.7,"disk_size":"257M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":1.21,"mem_mb":25.2,"disk_size":"282.9M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":10.2,"import_time_s":1.1,"mem_mb":25.2,"disk_size":"272M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":1.08,"mem_mb":25,"disk_size":"266.7M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim 
(glibc)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":10.7,"import_time_s":1.25,"mem_mb":25.1,"disk_size":"256M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.95,"mem_mb":24.4,"disk_size":"265.5M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":10.4,"import_time_s":1.02,"mem_mb":24.4,"disk_size":"255M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":null,"import_time_s":0.81,"mem_mb":16.7,"disk_size":"268.1M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"sagemaker-inference","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"clean","install_time_s":11.7,"import_time_s":0.76,"mem_mb":16.7,"disk_size":"264M"}]}}