{"library":"model-hosting-container-standards","title":"Model Hosting Container Standards","description":"`model-hosting-container-standards` is a Python toolkit for building standardized model hosting containers, with a focus on Amazon SageMaker integration. It provides utilities for model deployment and inference, including support for frameworks such as TensorRT-LLM and vLLM. The latest PyPI release at the time of testing is 0.1.15, and the library is actively developed with frequent patch releases.","language":"python","status":"active","last_verified":"Thu May 14","install":{"commands":["pip install model-hosting-container-standards"],"cli":null},"imports":["from model_hosting_container_standards.common.handler.model_handler import ModelHandler","from fastapi import FastAPI"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import boto3\nimport os\n\nsagemaker_client = boto3.client('sagemaker')\n\n# Replace with your AWS account ID and region\naccount_id = os.environ.get('AWS_ACCOUNT_ID', '123456789012')\nregion = os.environ.get('AWS_REGION', 'us-east-1')\n\nmodel_name = 'my-vllm-standard-model'\nexecution_role_arn = os.environ.get('SAGEMAKER_EXECUTION_ROLE_ARN', 'arn:aws:iam::123456789012:role/SageMakerExecutionRole')\n\n# Example of using a vLLM container image that adheres to the standards\n# This image would typically be found in Amazon ECR Public Gallery or a private ECR repo\n# Note: This is an example, use an actual vLLM image URL from AWS ECR Public Gallery.\nvllm_image = f\"{account_id}.dkr.ecr.{region}.amazonaws.com/vllm:0.11.2-sagemaker-v1.2\"\n\nresponse = sagemaker_client.create_model(\n    ModelName=model_name,\n    ExecutionRoleArn=execution_role_arn,\n    PrimaryContainer={\n        'Image': vllm_image,\n        'Environment': {\n            'SM_VLLM_MODEL': 'meta-llama/Meta-Llama-3-8B-Instruct', # Hugging Face Model ID or S3 path\n            
'HUGGING_FACE_HUB_TOKEN': os.environ.get('HUGGING_FACE_HUB_TOKEN', ''), # Securely provide token\n            'SM_VLLM_MAX_MODEL_LEN': '2048',\n            'SM_VLLM_GPU_MEMORY_UTILIZATION': '0.9',\n            'SM_VLLM_DTYPE': 'auto',\n            'SM_VLLM_TENSOR_PARALLEL_SIZE': '1'\n        }\n    }\n)\n\nprint(f\"Model creation initiated: {response['ModelArn']}\")\n# Further steps would involve creating an Endpoint Configuration and an Endpoint","lang":"python","description":"This quickstart registers a model with Amazon SageMaker using a container that adheres to the `model-hosting-container-standards`. It creates a SageMaker model backed by a vLLM container image and sets environment variables for the model ID, resource allocation, and an optional Hugging Face token. It assumes that AWS credentials and a SageMaker execution role are already configured in your environment. Note that the toolkit itself is for *building* such containers; this quickstart shows how to *consume* one on SageMaker.","tag":null,"tag_description":null,"last_tested":"2026-04-25","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-14","installed_version":"0.1.15","pypi_latest":"0.1.15","is_stale":false,"summary":{"python_range":"3.10–3.13","success_rate":40,"avg_install_s":4.4,"avg_import_s":null,"wheel_type":"wheel"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine 
(musl)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"38.1M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":4.9,"import_time_s":null,"mem_mb":null,"disk_size":"38M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"42.5M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim 
(glibc)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":4.2,"import_time_s":null,"mem_mb":null,"disk_size":"42M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"42.6M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":4.3,"import_time_s":null,"mem_mb":null,"disk_size":"42M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine 
(musl)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"42.4M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":0,"wheel_type":"wheel","failure_reason":null,"import_side_effects":"broken","install_time_s":4.3,"import_time_s":null,"mem_mb":null,"disk_size":"42M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":"build_error","import_side_effects":null,"install_time_s":1.6,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"model-hosting-container-standards","exit_code":1,"wheel_type":null,"failure_reason":null,"import_side_effects":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null}]}}