tslearn: Time-Series Machine Learning Toolkit

0.8.1 · active · verified Thu Apr 16

tslearn is a Python package for machine learning on time-series data. It provides algorithms for clustering, classification, and regression on time series, and builds on the `scikit-learn`, `numpy`, and `scipy` libraries. The current version is 0.8.1, and the library is actively developed and maintained.

Common errors

Warnings

Install

Imports

Quickstart

This quickstart shows how to prepare variable-length time series for `tslearn` with `to_time_series_dataset` and then cluster them with `TimeSeriesKMeans`. The helper packs the series into a 3D NumPy array of shape (number of series, longest length, dimensionality), which is the standard input format for `tslearn` estimators. Series shorter than the longest one are padded with NaN.

import numpy as np
from tslearn.clustering import TimeSeriesKMeans
from tslearn.utils import to_time_series_dataset

# Generate some sample time series data
np.random.seed(0)
n_ts = 10 # Number of time series
max_sz = 100 # Maximum length of time series
d = 1 # Dimensionality of each time point (univariate)

# Create a list of 2D numpy arrays for variable-length time series
my_time_series = []
for i in range(n_ts):
    length = np.random.randint(50, max_sz + 1)
    series = np.random.rand(length, d)
    my_time_series.append(series)

# Convert to tslearn's expected 3D dataset format
X = to_time_series_dataset(my_time_series)

# Initialize and fit a TimeSeriesKMeans model
# Using dtw (Dynamic Time Warping) as the metric
km = TimeSeriesKMeans(n_clusters=2, metric="dtw", max_iter=10, random_state=0)
cluster_labels = km.fit_predict(X)

print(f"Input shape: {X.shape}")
print(f"Cluster labels: {cluster_labels}")
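
The quickstart passes `metric="dtw"` without showing what that distance measures. As a rough illustration, here is a minimal pure-Python sketch of Dynamic Time Warping for univariate sequences. It assumes a squared-difference local cost and takes the square root of the accumulated cost; the function name `dtw_distance` and the sample sequences are hypothetical and are not part of the tslearn API. tslearn's own implementation is optimized and supports multivariate series and constraints, so treat this only as a conceptual aid.

```python
import math


def dtw_distance(a, b):
    """Illustrative DTW between two univariate sequences (not tslearn's code)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated squared cost of aligning
    # the first i points of a with the first j points of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return math.sqrt(cost[n][m])


# Hypothetical sample data: the same ramp at two different speeds,
# and a ramp offset by a constant.
short_ramp = [0.0, 1.0, 2.0]
slow_ramp = [0.0, 0.0, 1.0, 2.0, 2.0]
offset_ramp = [1.0, 2.0, 3.0]

print(dtw_distance(short_ramp, slow_ramp))    # same shape, different length
print(dtw_distance(short_ramp, offset_ramp))  # different values
```

Because the warping path may repeat points, the two ramps of different lengths align perfectly, while the offset ramp keeps a non-zero distance. This is why a DTW-based clustering can compare series of unequal length, as in the quickstart above.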
