DTAIDistance: Dynamic Time Warping for Time Series
DTAIDistance is a Python library providing highly optimized distance measures for time series, primarily focusing on Dynamic Time Warping (DTW). It features both a pure Python implementation and a significantly faster C implementation, making it suitable for performance-critical applications. The library also includes functionalities for time series clustering, multi-dimensional DTW, and subsequence search. It is actively maintained by the DTAI Research Group, with the current stable version being 2.4.0.
Common errors
- Exception: The compiled dtaidistance C library is not available. See the documentation for alternative installation options.
cause The C extensions for `dtaidistance` failed to compile during installation, or could not be found or loaded at runtime. This often happens when a suitable C/C++ compiler or the OpenMP libraries are missing from the system.
fix Make sure the system has a C/C++ compiler (e.g., 'Build Tools for Visual Studio' on Windows, `gcc` on Linux/macOS) and OpenMP support, then reinstall with `pip install --no-cache-dir --force-reinstall dtaidistance`. If OpenMP issues persist, try `pip install --global-option=--noopenmp dtaidistance`. You can check the C library status with `from dtaidistance import dtw; dtw.try_import_c(verbose=True)`.
- MemoryError: Unable to allocate ... bytes
cause Computing a distance matrix (`dtw.distance_matrix_fast`) for a large number of time series produces an NxN result (where N is the number of series) that exceeds available system memory. Each pairwise DTW computation has quadratic time complexity in the series length, but storing the matrix is the main memory cost.
fix Reduce the number of series if possible; shortening individual series lowers the cost per pair but does not shrink the matrix. Use the `block` argument of `distance_matrix_fast` to compute only a subset of the matrix. Pruning options such as `max_dist` or `window` reduce the computation cost per pair, but they do not reduce the matrix storage either.
- RuntimeWarning: C-library has not been compiled, falling back to Python implementation. This will be slower.
cause The C extensions for `dtaidistance` were not successfully compiled or loaded, so the library falls back to its slower pure Python implementation.
fix This is usually a symptom of the 'C library not available' problem above. Check that the C/C++ compiler and OpenMP setup are correct, then reinstall the library. Calling `dtw.distance_fast()` or `dtw.distance_matrix_fast()` explicitly also triggers this warning when the C library is missing.
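The status check mentioned in the fixes above can be wrapped in a small helper so that a script reports which implementation it will get before starting a long computation. This is a minimal sketch, not part of the library: `c_library_status` is a hypothetical name, and the code assumes that `dtw.try_import_c` returns a truthy value when the C extensions load, which is worth confirming against the installed version.

```python
def c_library_status():
    """Report whether dtaidistance and its C extensions can be loaded.

    Returns one of "not-installed", "python-only", or "c-available".
    Hypothetical helper for illustration only.
    """
    try:
        from dtaidistance import dtw
    except ImportError:
        # The package itself is missing from this environment.
        return "not-installed"
    # Assumption: try_import_c returns a truthy value when the
    # compiled extensions were imported successfully.
    if dtw.try_import_c(verbose=False):
        return "c-available"
    return "python-only"


status = c_library_status()
print(f"dtaidistance status: {status}")
```

If the result is "python-only", passing `verbose=True` to `dtw.try_import_c` prints more detail about which component failed to load.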
Warnings
- breaking In version 2, NumPy became an optional dependency for compiling the C library. While the core C library can be compiled without it (requiring only Cython), most practical applications using NumPy arrays as input will still implicitly require NumPy.
- gotcha The default `dtw.distance()` and `dtw.distance_matrix()` methods use a pure Python implementation, which is significantly slower than the C-based optimized versions. For optimal performance, always use `dtw.distance_fast()` and `dtw.distance_matrix_fast()`.
- gotcha Installation on Windows or on some Unix systems (e.g., macOS) can fail to compile the fast C extensions or to link OpenMP correctly. The result is either the Python-only fallback, with slower performance, or the 'C library not available' error when the C implementation or parallelization is requested.
- gotcha Computing `distance_matrix_fast` for a very large number of time series can lead to `MemoryError`, because the full distance matrix needs storage that grows quadratically with the number of series.
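A quick size estimate before calling `distance_matrix_fast` shows whether the full matrix can fit in memory at all. The sketch below is plain arithmetic and does not call the library; `distance_matrix_bytes` is a hypothetical helper, and the 8 bytes per entry assumes the matrix is stored as 64-bit floats.

```python
def distance_matrix_bytes(n_series, bytes_per_value=8, upper_triangle_only=False):
    """Estimate the storage needed for a pairwise distance matrix.

    n_series: number of time series (N).
    bytes_per_value: size of one entry; 8 assumes float64 storage.
    upper_triangle_only: count only the N*(N-1)/2 entries above the
        diagonal instead of the full N*N matrix.
    """
    if upper_triangle_only:
        entries = n_series * (n_series - 1) // 2
    else:
        entries = n_series * n_series
    return entries * bytes_per_value


for n in (1_000, 10_000, 100_000):
    full_gib = distance_matrix_bytes(n) / 2**30
    print(f"N={n:>7}: full matrix needs about {full_gib:.2f} GiB")
```

The series length does not appear in the estimate: it affects how long each pair takes to compute, not how many entries the result holds.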
Install
- pip install dtaidistance
- conda install -c conda-forge dtaidistance
Imports
- dtw
from dtaidistance import dtw
- dtw_visualisation
from dtaidistance import dtw_visualisation as dtwvis
- dtw_ndim
from dtaidistance import dtw_ndim
Quickstart
import numpy as np
from dtaidistance import dtw
# Define two sample time series
s1 = np.array([0.0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
# Compute the DTW distance using the fast C implementation
distance = dtw.distance_fast(s1, s2)
print(f"DTW distance: {distance:.2f}")
# Optionally, visualize the warping path (requires matplotlib)
try:
    from dtaidistance import dtw_visualisation as dtwvis
    path = dtw.warping_path(s1, s2)
    dtwvis.plot_warping(s1, s2, path, filename="warping_path.png")
    print("Warping path visualized in warping_path.png")
except ImportError:
    print("Matplotlib not installed, skipping visualization.")
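To help interpret the number the quickstart prints, here is a textbook dynamic-programming formulation of DTW in pure Python. It is a minimal sketch, not the library's code: `dtw_distance_reference` is a hypothetical name, and the sketch assumes a squared-difference local cost with a square root taken at the end, with no window, penalty, or pruning.

```python
import math


def dtw_distance_reference(a, b):
    """Textbook DTW between two 1-D sequences (illustrative only).

    Local cost is the squared difference; the result is the square
    root of the cheapest cumulative cost over all warping paths.
    """
    n, m = len(a), len(b)
    inf = math.inf
    # cost[i][j] = cheapest cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the best of: step in a only, step in b only, step in both
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


# A repeated value can be warped onto a single point at no cost.
print(dtw_distance_reference([0, 1, 1, 0], [0, 1, 0]))
```

The nested loops make the quadratic cost per pair visible, which is why the compiled `dtw.distance_fast` is the better choice for anything beyond small experiments.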