High-Performance Time Series Downsampling

0.1.4.1 · active · verified Sun Apr 12

tsdownsample is a Python library for fast time series downsampling, with its core implemented in Rust. It uses SIMD instructions and multithreading (via Rayon in Rust) to provide optimized, memory-efficient algorithms for visualizing and analyzing large time series datasets. The library is actively maintained; the current version is 0.1.4.1.

Warnings

Install

Imports

Quickstart

This quickstart uses `MinMaxLTTBDownsampler` to reduce a large time series to a smaller, representative subset of points for visualization or analysis. It initializes a downsampler and calls its `downsample` method with an optional `x` (time/index) array and a `y` (values) array, the required output size `n_out`, and `parallel=True` to enable multithreading. The method returns the indices of the selected points, which are then used to index into `x` and `y`.

import numpy as np
from tsdownsample import MinMaxLTTBDownsampler

# Create a time series with x and y values
x = np.arange(10_000_000, dtype=np.float64)
y = np.random.randn(10_000_000)  # already float64, no cast needed

# Initialize the downsampler
downsampler = MinMaxLTTBDownsampler()

# Downsample the time series to 1000 points
# The 'parallel=True' argument enables multi-threading for performance.
# The 'n_out' argument is mandatory.
selected_indices = downsampler.downsample(x, y, n_out=1000, parallel=True)

# Retrieve the downsampled data points
x_downsampled = x[selected_indices]
y_downsampled = y[selected_indices]

print(f"Original data points: {len(x)}")
print(f"Downsampled data points: {len(x_downsampled)}")
print(f"First 5 downsampled x: {x_downsampled[:5]}")
print(f"First 5 downsampled y: {y_downsampled[:5]}")
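
To make the index-based return value concrete, here is a minimal pure-Python sketch of bucketed min-max selection, the general idea behind the min-max stage that `MinMaxLTTBDownsampler` is named after. It is an illustration only, not the library's Rust implementation: the function name `minmax_downsample_indices` and its bucketing details are assumptions made for this example.

```python
def minmax_downsample_indices(y, n_out):
    """Hypothetical helper: pick the min and max index in each of n_out // 2 buckets."""
    if n_out < 2 or n_out % 2 != 0:
        raise ValueError("n_out must be a positive even number")
    n = len(y)
    if n <= n_out:
        # Nothing to reduce: keep every point.
        return list(range(n))

    n_buckets = n_out // 2
    selected = []
    for b in range(n_buckets):
        # Equal-width buckets over the index range [0, n).
        start = b * n // n_buckets
        end = (b + 1) * n // n_buckets
        bucket = range(start, end)
        i_min = min(bucket, key=lambda i: y[i])
        i_max = max(bucket, key=lambda i: y[i])
        if i_min == i_max:
            # Constant bucket: fall back to its first and last index.
            i_min, i_max = start, end - 1
        # Keep the two indices in their original order.
        selected.extend(sorted((i_min, i_max)))
    return selected


y = [0.0, 3.0, 1.0, -2.0, 5.0, 4.0, -1.0, 2.0]
idx = minmax_downsample_indices(y, n_out=4)
y_small = [y[i] for i in idx]
print(idx)
print(y_small)
```

As in the quickstart, the result is a list of positions into the original data rather than new values, so the same indices can be applied to both `x` and `y`.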
