sktime: Unified Framework for Time Series Machine Learning
sktime is a Python library for machine learning with time series. It provides a unified interface for various time series tasks, including forecasting, classification, regression, and transformation. It integrates seamlessly with `scikit-learn` and offers a modular architecture for building complex time series pipelines. The project is actively maintained with frequent minor and patch releases, typically every few weeks.
Warnings
- gotcha sktime relies heavily on 'soft dependencies'. Many estimators (especially advanced or classical statistical models) require additional packages that are not installed by default with `pip install sktime`. Attempting to use these estimators without their dependencies will result in `ModuleNotFoundError` or similar.
- breaking sktime regularly refines and deprecates parts of its API. Features, classes, or functions marked for deprecation in one release are often removed in a subsequent major/minor release (e.g., 0.38.0 and 0.39.0 both included scheduled deprecations), so check the changelog before upgrading.
- gotcha sktime has specific Python version requirements (currently `>=3.10, <3.15`). Using an unsupported Python version can lead to installation issues or runtime errors, particularly with dependencies.
- gotcha Compatibility with `scikit-learn` versions can be sensitive. Hotfixes are frequently released to address `scikit-learn` version updates (e.g., 0.38.1 for `scikit-learn 1.7`).
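The environment-related warnings above can be checked up front with a few lines of standard-library Python. The sketch below is not sktime's own dependency-checking mechanism; the helper names (`python_supported`, `missing_modules`, `installed_version`) are hypothetical, the Python range is copied from the warning above, and the package names in the usage part are only illustrative.

```python
import sys
from importlib import metadata
from importlib.util import find_spec

# Supported range quoted in the warning above; adjust if the requirement changes.
MIN_PYTHON = (3, 10)
MAX_PYTHON_EXCLUSIVE = (3, 15)


def python_supported(version=None):
    """Return True if (major, minor) lies in [MIN_PYTHON, MAX_PYTHON_EXCLUSIVE)."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


def missing_modules(module_names):
    """Return the top-level modules from module_names that cannot be found."""
    return [name for name in module_names if find_spec(name) is None]


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    # Illustrative list only; the packages you need depend on the estimators you use.
    print("Missing soft dependencies:", missing_modules(["statsmodels", "matplotlib"]))
    for dist in ("sktime", "scikit-learn", "pandas"):
        print(dist, installed_version(dist))
```

Running a check like this before importing estimators turns a late `ModuleNotFoundError` into an early, readable report, and printing the installed `sktime` and `scikit-learn` versions makes compatibility problems easier to diagnose.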
Install
- pip install sktime
- pip install sktime[all_extras]
- pip install sktime[forecasting]
Imports
- BaseForecaster
from sktime.forecasting.base import BaseForecaster
- ThetaForecaster
from sktime.forecasting.theta import ThetaForecaster
- temporal_train_test_split
from sktime.split import temporal_train_test_split
- StandardScaler
from sktime.transformations.series.scaling import StandardScaler
Quickstart
import pandas as pd
from sktime.split import temporal_train_test_split
from sktime.forecasting.theta import ThetaForecaster
from sktime.utils.plotting import plot_series # requires 'matplotlib' optional dependency
# 1. Data loading and splitting
y = pd.Series([10, 12, 13, 15, 18, 20, 22, 25, 28, 30])
# Monthly PeriodIndex; avoids the "M" offset alias that newer pandas deprecates for date_range
y.index = pd.period_range("2020-01", periods=10, freq="M")
y_train, y_test = temporal_train_test_split(y, test_size=3)
# 2. Model selection and fitting (ThetaForecaster needs the 'statsmodels' soft dependency)
forecaster = ThetaForecaster(sp=1)
forecaster.fit(y_train)
# 3. Prediction
fh = [1, 2, 3] # forecast horizon for next 3 periods
y_pred = forecaster.predict(fh=fh)
print("Training data:\n", y_train)
print("Test data:\n", y_test)
print("Predictions:\n", y_pred)
# To plot, ensure matplotlib and seaborn are installed (pip install matplotlib seaborn)
# plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
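To make the splitting and horizon semantics of the quickstart concrete, here is a minimal standard-library sketch of what a temporal split with `test_size=3` and a relative horizon `fh = [1, 2, 3]` amount to. It does not call sktime; `temporal_split` and `add_months` are hypothetical helpers written for illustration, and periods are represented as plain `(year, month)` tuples.

```python
def temporal_split(values, test_size):
    """Split an ordered sequence so the last test_size items form the test set."""
    if not 0 < test_size < len(values):
        raise ValueError("test_size must be between 1 and len(values) - 1")
    cutoff = len(values) - test_size
    return values[:cutoff], values[cutoff:]


def add_months(year, month, steps):
    """Return the (year, month) that lies `steps` months after (year, month)."""
    index = year * 12 + (month - 1) + steps
    return index // 12, index % 12 + 1


# Same ten monthly observations as the quickstart, labelled by (year, month).
observations = [10, 12, 13, 15, 18, 20, 22, 25, 28, 30]
months = [add_months(2020, 1, i) for i in range(len(observations))]

# No shuffling: the order of observations is preserved on both sides of the cutoff.
y_train, y_test = temporal_split(observations, test_size=3)
train_months, test_months = temporal_split(months, test_size=3)

# A relative horizon counts steps from the last training period.
fh = [1, 2, 3]
last_year, last_month = train_months[-1]
forecast_months = [add_months(last_year, last_month, step) for step in fh]

print(y_train)          # [10, 12, 13, 15, 18, 20, 22]
print(y_test)           # [25, 28, 30]
print(forecast_months)  # [(2020, 8), (2020, 9), (2020, 10)]
```

Because the training data ends in July 2020, the horizon steps 1 to 3 line up exactly with the three held-out months, which is why `y_pred` in the quickstart can be compared directly against `y_test`.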