scikit-dimension

0.3.4 verified Fri May 01 auth: no python

scikit-dimension is a Python module for intrinsic dimension estimation built according to the scikit-learn API. It provides estimators for global and local intrinsic dimension, supporting methods like MLE, DANCo, kNN, and others. Current version 0.3.4 is released under the 3-Clause BSD license. Release cadence is irregular.

pip install scikit-dimension
error ImportError: cannot import name 'MLE' from 'skdim'
cause Incorrect import path. The estimator classes live in the `skdim.id` submodule, not at the top level of `skdim`.
fix
Use `from skdim.id import MLE`.
error ValueError: Expected 2D array, got 1D array instead
cause Passing a 1D array instead of 2D (samples x features).
fix
Reshape your data to (n_samples, n_features): `X = X.reshape(-1, 1)` for a single feature, or `X = np.asarray(X).reshape(-1, n_features)` when each sample has several features.
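As a minimal sketch of the reshape fix, assuming NumPy input: a flat array needs an explicit (samples x features) shape before it is passed to an estimator. The variable names and the choice of 3 features per sample are illustrative only.

```python
import numpy as np

# A flat array of 12 values has shape (12,), which triggers the 1D-array error.
flat = np.arange(12, dtype=float)

# Interpret it as 12 samples with a single feature each.
X_single = flat.reshape(-1, 1)

# Or as 4 samples with 3 features each (n_features is an illustrative choice).
n_features = 3
X_multi = flat.reshape(-1, n_features)

print(X_single.shape, X_multi.shape)  # (12, 1) (4, 3)
```

The `-1` lets NumPy infer the number of samples, so only the feature count has to be supplied.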
error TypeError: fit_transform() takes 2 positional arguments but 3 were given
cause Passing both `X` and `y` to `fit_transform`, but the estimator only expects `X`.
fix
Call `fit_transform(X)` without `y`.
breaking In v0.3, the `predict` method was changed to `transform`. Code using `predict` will break.
fix Replace `.predict(X)` with `.transform(X)` or use `fit_transform(X)`.
gotcha The kNN estimator class is named `kNN` (lowercase k), not `KNN`. Importing `KNN` raises ImportError.
fix Use `from skdim.id import kNN`.
gotcha Some estimators (e.g. MLE, DANCo) may return negative values if the data is not preprocessed. An intrinsic dimension is non-negative by definition, so a negative value signals a problem with the input rather than a valid estimate.
fix Ensure input data is scaled/normalized appropriately, and consider using the `denoising` parameter if available.
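A minimal sketch of the scaling step and a sanity check on the result, using NumPy only. Both helper names are hypothetical and not part of scikit-dimension, and per-feature standardization is just one possible scaling choice.

```python
import numpy as np

def standardize(X):
    """Center each feature and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant features: avoid division by zero
    return (X - mean) / std

def is_plausible_dimension(dim, n_features):
    """True if the estimate is finite and lies in [0, n_features]."""
    return bool(np.isfinite(dim) and 0 <= dim <= n_features)

# Features on very different scales, brought to a common scale.
X = np.array([[1.0, 200.0], [2.0, 400.0], [4.0, 500.0]])
X_scaled = standardize(X)
print(X_scaled.std(axis=0))

# A negative estimate fails the check and should be investigated.
print(is_plausible_dimension(-0.5, n_features=2))  # False
```

Running the check on every estimate before using it downstream makes a negative or non-finite result fail loudly instead of propagating silently.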
deprecated The `local` keyword argument in some estimators may be deprecated in future versions. Prefer the dedicated pointwise API over per-estimator keywords.
fix Check the documentation for the recommended API. The project's quickstart computes local (pointwise) estimates with `fit_pw`, e.g. `skdim.id.lPCA().fit_pw(X, n_neighbors=100)`, and reads them from the `dimension_pw_` attribute.
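To illustrate what a local (pointwise) estimate means, here is a plain NumPy sketch of a local PCA heuristic. For each point it counts the principal components needed to explain most of the variance in that point's nearest-neighbor patch. This is not scikit-dimension's implementation; the function name, the 95% threshold and the neighbor count are assumptions chosen for illustration.

```python
import numpy as np

def local_pca_dimension(X, n_neighbors=20, var_threshold=0.95):
    """Per-point dimension: number of principal components needed to explain
    var_threshold of the variance inside each k-nearest-neighbor patch."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_samples, n_samples).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the n_neighbors closest points (each patch includes the point itself).
    neighbors = np.argsort(sq_dists, axis=1)[:, :n_neighbors]
    dims = np.empty(len(X), dtype=int)
    for i, idx in enumerate(neighbors):
        patch = X[idx] - X[idx].mean(axis=0)
        # Squared singular values are proportional to per-component variance.
        variances = np.linalg.svd(patch, compute_uv=False) ** 2
        cumulative = np.cumsum(variances) / variances.sum()
        dims[i] = int(np.searchsorted(cumulative, var_threshold) + 1)
    return dims

# A 2-D square embedded in 5-D: the last three coordinates are all zero.
rng = np.random.default_rng(0)
X = np.zeros((200, 5))
X[:, :2] = rng.random((200, 2))

dims = local_pca_dimension(X)
print(np.median(dims))
```

The pairwise distance matrix costs O(n²) memory, so this sketch only suits small datasets.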

Basic usage of the MLE estimator on random data.

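import numpy as np
from skdim.id import MLE

# Generate synthetic data: 100 samples drawn uniformly from the 10-D unit cube
rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
X = rng.random((100, 10))

# Estimate the global intrinsic dimension
mle = MLE()
dim = mle.fit_transform(X)
print('Estimated dimension:', dim)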
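Uniform noise fills all 10 dimensions, so it says little about whether an estimator recovers a lower-dimensional structure. The sketch below builds data whose true intrinsic dimension is known: points in a 3-D linear subspace embedded in 10-D. The helper names and default sizes are hypothetical. `skdim` is imported inside the function so that the data helper runs without it.

```python
import numpy as np

def make_embedded_data(n_samples=500, intrinsic_dim=3, ambient_dim=10, seed=0):
    """Sample points from an intrinsic_dim-dimensional linear subspace of R^ambient_dim."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((n_samples, intrinsic_dim))
    basis = rng.standard_normal((intrinsic_dim, ambient_dim))
    return latent @ basis

def estimate_global_id(X):
    """Global intrinsic dimension from scikit-dimension's MLE estimator."""
    from skdim.id import MLE  # lazy import: only needed when this function is called
    return MLE().fit_transform(X)

X = make_embedded_data()
print(X.shape)                    # (500, 10)
print(np.linalg.matrix_rank(X))   # linear rank of the data, equal to intrinsic_dim here
```

With scikit-dimension installed, `estimate_global_id(X)` should give a value near 3 for this data, though the exact figure depends on the estimator's defaults and the sample size.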