{"library":"sagemaker-scikit-learn-extension","title":"SageMaker Scikit-learn Extension","description":"An open-source library that extends scikit-learn with transformers designed for use with Amazon SageMaker. It provides robust encoders, time series feature extractors, and other transformers to streamline machine learning workflows on SageMaker. The current version is 2.5.0.","language":"python","status":"active","last_verified":"Sat May 16","install":{"commands":["pip install sagemaker-scikit-learn-extension"],"cli":null},"imports":["from sagemaker_sklearn_extension.preprocessing import RobustOrdinalEncoder","from sagemaker_sklearn_extension.feature_extraction.sequences import TSFreshFeatureExtractor","from sagemaker_sklearn_extension.preprocessing import WOEEncoder","from sagemaker_sklearn_extension.preprocessing import ThresholdOneHotEncoder"],"auth":{"required":false,"env_vars":[]},"quickstart":{"code":"import pandas as pd\nfrom sagemaker_sklearn_extension.preprocessing import RobustOrdinalEncoder\n\ndata = pd.DataFrame({\n    'category': ['A', 'B', 'A', 'C', 'B', 'A', None, 'D'],\n    'value': [10, 20, 15, 25, 30, 12, 18, 22]\n})\n\n# Initialize the encoder.\n# `max_categories` keeps at most this many categories per feature (the most frequent ones);\n# all other values are treated as unknown.\n# Unknown values are encoded as the number of known categories, or as NaN when `unknown_as_nan=True`.\nencoder = RobustOrdinalEncoder(max_categories=3)\n\n# Fit and transform the 'category' column\nencoded_data = encoder.fit_transform(data[['category']])\n\nprint(\"Original Data:\\n\", data)\nprint(\"\\nEncoded 'category' column:\\n\", encoded_data.reshape(-1))\nprint(\"\\nLearned categories:\", encoder.categories_[0])\n","lang":"python","description":"This quickstart demonstrates how to use the `RobustOrdinalEncoder` to encode categorical data. 
It maps unknown categories to a dedicated encoded value, preventing the errors that standard ordinal encoders can raise when they encounter new data. The example shows fitting and transforming a pandas DataFrame column.","tag":null,"tag_description":null,"last_tested":null,"results":[]},"compatibility":{"tag":null,"tag_description":null,"last_tested":"2026-05-16","installed_version":"2.5.0","pypi_latest":"2.5.0","is_stale":false,"summary":{"python_range":"3.9–3.13","success_rate":100,"avg_install_s":2.5,"avg_import_s":null,"wheel_type":"sdist"},"results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"19.7M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":2.2,"import_time_s":null,"mem_mb":null,"disk_size":"20M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"22.2M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":2.2,"import_time_s":null,"mem_mb":null,"disk_size":"23M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"12.1M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":3.1,"import_time_s":null,"mem_mb":null,"disk_size":"13M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"11.9M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":2.8,"import_time_s":null,"mem_mb":null,"disk_size":"12M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":"19.3M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"sagemaker-scikit-learn-extension","exit_code":0,"wheel_type":"sdist","failure_reason":null,"import_side_effects":"broken","install_time_s":2.4,"import_time_s":null,"mem_mb":null,"disk_size":"20M"}]}}