{"id":5084,"library":"tsdownsample","title":"High-Performance Time Series Downsampling","description":"tsdownsample is an extremely fast Python library for time series downsampling, leveraging Rust for its core implementation. It utilizes SIMD instructions and multithreading (via Rayon in Rust) to provide highly optimized, memory-efficient, and flexible algorithms for visualization and analysis of large time series datasets. The library is actively maintained, with its current version being 0.1.4.1.","status":"active","version":"0.1.4.1","language":"en","source_language":"en","source_url":"https://github.com/predict-idlab/tsdownsample","tags":["time series","downsampling","performance","Rust","visualization","SIMD"],"install":[{"cmd":"pip install tsdownsample","lang":"bash","label":"Install with pip"}],"dependencies":[{"reason":"Used for input/output data arrays, fundamental for numerical operations.","package":"numpy"}],"imports":[{"symbol":"MinMaxLTTBDownsampler","correct":"from tsdownsample import MinMaxLTTBDownsampler"},{"symbol":"LTTBDownsampler","correct":"from tsdownsample import LTTBDownsampler"},{"note":"And other downsamplers like EveryNthDownsampler, M4Downsampler, and their NaN-handling variants (e.g., NaNMinMaxDownsampler).","symbol":"MinMaxDownsampler","correct":"from tsdownsample import MinMaxDownsampler"}],"quickstart":{"code":"import numpy as np\nfrom tsdownsample import MinMaxLTTBDownsampler\n\n# Create a time series with x and y values\nx = np.arange(10_000_000, dtype=np.float64)\ny = np.random.randn(10_000_000).astype(np.float64)\n\n# Initialize the downsampler\ndownsampler = MinMaxLTTBDownsampler()\n\n# Downsample the time series to 1000 points\n# The 'parallel=True' argument enables multi-threading for performance.\n# The 'n_out' argument is mandatory.\nselected_indices = downsampler.downsample(x, y, n_out=1000, parallel=True)\n\n# Retrieve the downsampled data points\nx_downsampled = x[selected_indices]\ny_downsampled = 
y[selected_indices]\n\nprint(f\"Original data points: {len(x)}\")\nprint(f\"Downsampled data points: {len(x_downsampled)}\")\nprint(f\"First 5 downsampled x: {x_downsampled[:5]}\")\nprint(f\"First 5 downsampled y: {y_downsampled[:5]}\")","lang":"python","description":"This quickstart demonstrates how to use `MinMaxLTTBDownsampler` to reduce a large time series dataset to a smaller, representative subset of points for visualization or analysis. It shows how to initialize a downsampler and use the `downsample` method with both `x` (optional time/index) and `y` (values) arrays, specifying the desired output size `n_out` and enabling parallel processing."},"warnings":[{"fix":"Ensure `x` arrays are sorted and free of NaNs before passing them to downsampling functions. The library offers specific `NaN...Downsampler` classes (e.g., `NaNMinMaxDownsampler`) for handling NaNs in `y` data.","message":"The `x` (index) data must be non-strictly monotonic increasing (i.e., sorted) and should not contain NaN values. If not provided, it's assumed to be equally sampled without gaps.","severity":"gotcha","affected_versions":">=0.1.0"},{"fix":"Be aware that the actual output length may vary. If a fixed output length is critical, consider pre-processing or handling the output length post-downsampling.","message":"When `x` data contains gaps (i.e., non-equidistant sampling), the number of returned downsampled indices might be less than the specified `n_out`. This is because no data points can be selected for empty bins.","severity":"gotcha","affected_versions":">=0.1.1"},{"fix":"Set `parallel=True` in your `downsample` calls. For fine-grained control, use `os.environ[\"TSDOWNSAMPLE_MAX_THREADS\"] = \"4\"` before downsampling to limit or specify thread count.","message":"To leverage multi-threading for performance, the `parallel=True` argument must be explicitly passed to the `downsample` method. 
The maximum number of threads can be configured via the `TSDOWNSAMPLE_MAX_THREADS` environment variable.","severity":"gotcha","affected_versions":">=0.1.0"},{"fix":"Upgrade to version 0.1.4.1 or later to ensure correct numerical precision. If comparing results with older versions, be aware of potential minor numerical differences due to this fix.","message":"A precision error in the `sequential_add_mul` update logic was fixed in version 0.1.4.1. Although this is a bug fix, users who relied on the previous (incorrect) numerical behavior may see different outputs after upgrading.","severity":"gotcha","affected_versions":"<0.1.4.1"},{"fix":"Always pass `n_out` as a keyword argument (e.g., `n_out=1000`) to avoid `TypeError`.","message":"The `downsample` method's signature is `downsample([x], y, n_out=..., **kwargs)`. `y` and the optional `x` are positional arguments, while `n_out` is a mandatory keyword argument.","severity":"gotcha","affected_versions":">=0.1.0"}],"env_vars":null,"last_verified":"2026-04-12T00:00:00.000Z","next_check":"2026-07-11T00:00:00.000Z"}