{"id":210,"library":"scikit-learn","title":"Scikit-learn (sklearn)","description":"Scikit-learn is a free and open-source machine learning library for Python, built on NumPy and SciPy. It provides a wide range of efficient tools for predictive data analysis, including algorithms for classification, regression, clustering, dimensionality reduction, and model selection. Known for its consistent API and comprehensive documentation, it is actively maintained with a regular release cadence. The latest stable version is 1.8.0.","status":"active","version":"1.8.0","language":"python","source_language":"en","source_url":"https://github.com/scikit-learn/scikit-learn","tags":["machine-learning","data-science","ml","statistics","python","artificial-intelligence"],"install":[{"cmd":"pip install -U scikit-learn","lang":"bash","label":"Install latest stable version"},{"cmd":"conda install -c conda-forge scikit-learn","lang":"bash","label":"Install with Conda"}],"dependencies":[{"reason":"Required for numerical operations and array handling.","package":"numpy","optional":false},{"reason":"Required for scientific computing and various algorithms.","package":"scipy","optional":false},{"reason":"Required for efficient parallel computing.","package":"joblib","optional":false},{"reason":"Required for controlling thread pools.","package":"threadpoolctl","optional":false},{"reason":"Optional, required for plotting capabilities (e.g., plot_ and Display classes).","package":"matplotlib","optional":true},{"reason":"Optional, often used for data handling, especially with feature names support.","package":"pandas","optional":true},{"reason":"Optional, required for some examples.","package":"scikit-image","optional":true},{"reason":"Optional, required for some examples and enhanced plotting.","package":"seaborn","optional":true},{"reason":"Optional, required for some examples and interactive plotting.","package":"plotly","optional":true}],"imports":[{"note":"While 'import sklearn' is the 
correct module name, the PyPI package to install is 'scikit-learn'. Installing 'sklearn' from PyPI will install a deprecated placeholder package (version 0.0.x) that is not the actual scikit-learn library and will raise warnings or errors. Always use 'pip install scikit-learn'.","wrong":"from sklearn import ClassName (after pip install sklearn)","symbol":"sklearn","correct":"import sklearn"},{"note":"Imports for specific estimators or utilities are typically from submodules like `sklearn.ensemble`, `sklearn.linear_model`, `sklearn.preprocessing`, etc.","symbol":"RandomForestClassifier","correct":"from sklearn.ensemble import RandomForestClassifier"},{"note":"Model selection tools are found in `sklearn.model_selection`.","symbol":"train_test_split","correct":"from sklearn.model_selection import train_test_split"}],"quickstart":{"code":"import numpy as np\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.datasets import make_classification\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.metrics import accuracy_score\n\n# 1. Generate synthetic data\nX, y = make_classification(n_samples=1000, n_features=4, random_state=42)\n\n# 2. Split data into training and testing sets\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n\n# 3. Instantiate a classifier (estimator)\n# Use keyword arguments for parameters, as positional arguments are deprecated (sklearn >= 1.0)\nclf = RandomForestClassifier(n_estimators=100, random_state=42)\n\n# 4. Fit the classifier to the training data\nclf.fit(X_train, y_train)\n\n# 5. Make predictions on the test data\ny_pred = clf.predict(X_test)\n\n# 6. 
Evaluate the model\naccuracy = accuracy_score(y_test, y_pred)\nprint(f\"Model Accuracy: {accuracy:.2f}\")\n\n# Example of chaining a preprocessor (StandardScaler) and a classifier in a pipeline\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.linear_model import LogisticRegression\n\npipe = make_pipeline(StandardScaler(), LogisticRegression(random_state=42))\npipe.fit(X_train, y_train)\n# accuracy_score expects (y_true, y_pred), in that order\npipeline_accuracy = accuracy_score(y_test, pipe.predict(X_test))\nprint(f\"Pipeline Accuracy: {pipeline_accuracy:.2f}\")","lang":"python","description":"This quickstart demonstrates a typical scikit-learn workflow: generating data, splitting it into training and testing sets, training a `RandomForestClassifier` with keyword arguments, making predictions, and evaluating the model. It also shows a simple `Pipeline` combining a preprocessor and a classifier."},"warnings":[{"fix":"Use `pip install scikit-learn` for installation. If you have `sklearn` installed, uninstall it with `pip uninstall sklearn` and then run `pip install scikit-learn`. If a dependency requires `sklearn`, report it to that project's issue tracker or set `SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True` as a last resort.","message":"Do NOT install 'sklearn' from PyPI. The 'sklearn' PyPI package is a deprecated placeholder and will lead to errors or install an outdated/dummy package. Always install the library using 'pip install scikit-learn'.","severity":"breaking","affected_versions":"All versions (installation method)"},{"fix":"Always use keyword arguments when instantiating estimators or calling methods with multiple parameters. 
For example, use `RandomForestClassifier(n_estimators=100)` instead of `RandomForestClassifier(100)`.","message":"Passing most estimator and method parameters positionally has been deprecated since version 0.23 and raises a TypeError in scikit-learn 1.0 and later.","severity":"breaking","affected_versions":">= 1.0.0 (warnings since 0.23)"},{"fix":"Use `get_feature_names_out` instead to retrieve the names of output features from a transformer.","message":"The `get_feature_names` method on transformers was deprecated in 1.0 and removed in 1.2.","severity":"deprecated","affected_versions":">= 1.0.0 (removed in 1.2)"},{"fix":"Convert `numpy.matrix` inputs to `numpy.ndarray` (e.g., using `np.asarray()` or the `.A` attribute) before passing them to scikit-learn estimators.","message":"Passing `numpy.matrix` as input to scikit-learn estimators was deprecated in 1.0 and raises a TypeError since 1.2.","severity":"gotcha","affected_versions":">= 1.0.0 (TypeError since 1.2)"},{"fix":"Ensure that the feature names (column names of pandas DataFrames) are consistent between `fit` and subsequent operations (`transform`, `predict`). If feature names are not important, consider converting DataFrames to NumPy arrays (e.g., `df.to_numpy()`) before passing them to estimators.","message":"Scikit-learn 1.0+ stores feature names in `feature_names_in_` when fitted on pandas DataFrames. 
Inconsistent feature names during subsequent `transform` (or other non-fit methods) raised a `FutureWarning` in versions 1.0 and 1.1 and raise a `ValueError` since version 1.2.","severity":"gotcha","affected_versions":">= 1.0.0 (warnings), >= 1.2.0 (errors)"}],"env_vars":null,"last_verified":"2026-05-12T10:19:25.451Z","next_check":"2026-07-09T00:00:00.000Z","problems":[{"fix":"Run `pip install scikit-learn` in your terminal to install the library.","cause":"The scikit-learn library is not installed in your Python environment, or your environment is not correctly activated.","error":"ModuleNotFoundError: No module named 'sklearn'"},{"fix":"Reshape your 1D array into a 2D array using `array.reshape(-1, 1)` if the data has a single feature or `array.reshape(1, -1)` if it contains a single sample.","cause":"Scikit-learn estimators expect input data to be a 2D array of shape (n_samples, n_features), even for a single sample or a single feature, but a 1D array was provided.","error":"ValueError: Expected 2D array, got 1D array instead:"},{"fix":"Train the estimator first by calling `model.fit(X_train, y_train)` before attempting to make predictions or transformations.","cause":"You are attempting to use a method like `predict`, `transform`, or `score` on a scikit-learn estimator before it has been trained by calling its `fit` method.","error":"NotFittedError: This XXXX instance is not fitted yet. 
Call 'fit' with appropriate arguments before using this estimator."},{"fix":"First, create an instance of the model (e.g., `model = LinearRegression()`), then call `model.fit(X, y)` on the instance.","cause":"You are attempting to call the `fit` method directly on the scikit-learn model class (e.g., `LinearRegression`) instead of on an instantiated object of that class.","error":"AttributeError: type object 'LinearRegression' has no attribute 'fit'"}],"ecosystem":"pypi","meta_description":null,"install_score":45,"install_tag":"draft","quickstart_score":30,"quickstart_tag":"draft","pypi_latest":null,"install_checks":{"last_tested":"2026-05-12","tag":"draft","tag_description":"notable install failures or slow imports","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.08,"mem_mb":42.8,"disk_size":"270M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.9,"mem_mb":49.8,"disk_size":"270M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim 
(glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":3.75,"mem_mb":53.2,"disk_size":"287M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":3.63,"mem_mb":59.7,"disk_size":"287M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.23,"mem_mb":52.2,"disk_size":"271M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.39,"mem_mb":59,"disk_size":"271M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":3.89,"mem_mb":52.5,"disk_size":"269M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.13,"mem_mb":57.9,"disk_size":"269M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.11,"mem_mb":38.5,"disk_size":"284M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.91,"mem_mb":43.3,"disk_size":"284M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim 
(glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null}]},"quickstart_checks":{"last_tested":"2026-04-23","tag":"draft","tag_description":"notable failures across runtimes","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":0}]}}