skpro - Probabilistic Regression & Distribution Framework

2.12.0 verified Sat May 09 auth: no python

A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in Python. Provides sktime-compatible interfaces for distribution estimation, survival analysis, and conformal prediction. Current version: 2.12.0. Release cadence: ~3-4 major/minor releases per year.

pip install skpro
error ModuleNotFoundError: No module named 'skpro.distributions.normal'
cause Importing from an internal submodule that has been refactored.
fix
Use: from skpro.distributions import Normal
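error AttributeError: 'ProbabilisticRegressor' object has no attribute 'predict'
cause Calling a prediction method that the estimator object does not define; skpro regressors provide probabilistic output through dedicated methods.
fix
Use reg.predict_proba(X) for the full predictive distribution, reg.predict_interval(X) for intervals, reg.predict_quantiles(X) for quantiles, or reg.predict_var(X) for the variance.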
error ValueError: Expected 2D array, got 1D array instead
cause Passing a 1D array as features; skpro expects 2D input of shape (n_samples, n_features).
fix
Use X.reshape(-1, 1) when X holds a single feature.
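The reshape fix above can be sketched with plain NumPy. The array values are made up for illustration; only the shapes matter.

```python
import numpy as np

# A single feature stored as a flat vector: one dimension only
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])

# reshape(-1, 1) makes exactly one column and infers the number of rows
X = x.reshape(-1, 1)

print(x.shape, "->", X.shape)
```

The -1 asks NumPy to infer the row count, so the same call works for any number of samples.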
breaking Python 3.8 dropped in v2.8.0, Python 3.9 dropped in v2.10.0. Upgrade Python if using older versions.
fix Use Python >=3.10.
deprecated Direct import of distribution classes from skpro.distributions submodules (e.g., skpro.distributions.normal) is deprecated; import from skpro.distributions instead.
fix Use from skpro.distributions import Normal.
gotcha predict_proba returns a distribution object, not point predictions. Use predict_interval or predict_quantiles for interval and quantile estimates, or take the mean of the returned distribution for a point estimate.
fix Call reg.predict_interval(X_test, coverage=0.9) for intervals, or reg.predict_proba(X_test).mean() for point predictions.
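The Python version requirement noted in the warnings above can be checked at startup. This is a minimal sketch using only the standard library; the names MIN_PYTHON and meets_minimum are hypothetical helpers, not part of skpro, and the 3.10 floor is taken from the note above rather than verified independently.

```python
import sys

# Assumed floor, taken from the "Use Python >=3.10" note (not verified here)
MIN_PYTHON = (3, 10)


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    current = tuple(version_info or sys.version_info)[:2]
    return current >= minimum


# Warn early instead of failing later during installation or import
if not meets_minimum():
    found = sys.version.split()[0]
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is needed, found {found}")
```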

Fit a probabilistic regressor and evaluate it using the continuous ranked probability score (CRPS).

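import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from skpro.metrics import CRPS
from skpro.regression.residual import ResidualDouble

X, y = make_regression(n_samples=100, n_features=4, random_state=42)
# Wrap the arrays in DataFrames so predictions and ground truth share an index
X = pd.DataFrame(X)
y = pd.DataFrame(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# ResidualDouble builds a probabilistic regressor from two sklearn regressors:
# the first predicts the mean, the second models the residual scale
reg = ResidualDouble(LinearRegression(), LinearRegression())
reg.fit(X_train, y_train)

# Predict the full predictive distribution
y_pred_dist = reg.predict_proba(X_test)

# Evaluate with CRPS (lower is better)
crps = CRPS()
score = crps(y_test, y_pred_dist)
print(f"CRPS: {score}")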
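To give a feel for what the CRPS rewards, the sketch below evaluates the well-known closed form of the CRPS for a Gaussian forecast at a single observation, using only the standard library. The function name crps_normal and the sample values are hypothetical; this is the textbook per-observation formula, and how skpro's CRPS class aggregates over rows is not shown here.

```python
import math


def crps_normal(y, mu, sigma):
    """Closed-form CRPS of a Normal(mu, sigma) forecast at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF at z
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Same observation, three forecasts (illustrative numbers only)
sharp = crps_normal(y=1.0, mu=1.0, sigma=0.5)  # centred and confident
vague = crps_normal(y=1.0, mu=1.0, sigma=2.0)  # centred but spread out
off = crps_normal(y=3.0, mu=1.0, sigma=0.5)    # confident but wrong

print(f"sharp={sharp:.4f} vague={vague:.4f} off={off:.4f}")
```

A forecast that is both centred on the observation and narrow scores lowest. As sigma shrinks toward zero the score approaches the absolute error |y - mu|, which is why CRPS is often read as a probabilistic generalisation of mean absolute error.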