{"id":7809,"library":"tsfresh","title":"tsfresh","description":"tsfresh extracts relevant characteristics from time series data, enabling automated feature engineering for machine learning tasks. It supports a wide range of feature calculators, parallel processing, and integrated feature selection. The current version is 0.21.1, and it typically releases new versions every few months, often including bug fixes, dependency updates, and occasionally breaking changes.","status":"active","version":"0.21.1","language":"en","source_language":"en","source_url":"https://github.com/blue-yonder/tsfresh","tags":["time series","feature extraction","machine learning","data science","pandas"],"install":[{"cmd":"pip install tsfresh","lang":"bash","label":"Standard Install"},{"cmd":"pip install tsfresh[dask,matrix_profile,pywavelets]","lang":"bash","label":"Install with all optional dependencies"}],"dependencies":[{"reason":"Optional: For distributed parallel feature extraction (n_jobs > 1)","package":"dask","optional":true},{"reason":"Optional: For distributed parallel feature extraction (n_jobs > 1)","package":"distributed","optional":true},{"reason":"Optional: For matrix profile related features","package":"matrixprofile","optional":true},{"reason":"Optional: For continuous wavelet transform features, especially with scipy >= 1.15","package":"pywavelets","optional":true}],"imports":[{"symbol":"extract_features","correct":"from tsfresh import extract_features"},{"symbol":"select_features","correct":"from tsfresh import select_features"},{"symbol":"impute","correct":"from tsfresh.utilities.dataframe_functions import impute"},{"symbol":"MinimalFCParameters","correct":"from tsfresh.feature_extraction import MinimalFCParameters"},{"symbol":"EfficientFCParameters","correct":"from tsfresh.feature_extraction import EfficientFCParameters"},{"symbol":"ComprehensiveFCParameters","correct":"from tsfresh.feature_extraction import ComprehensiveFCParameters"}],"quickstart":{"code":"import 
pandas as pd\nfrom tsfresh import extract_features\nfrom tsfresh.utilities.dataframe_functions import impute\nfrom tsfresh.feature_extraction import MinimalFCParameters\n\n# Create a sample DataFrame holding three short time series\n# 'id' identifies different time series\n# 'time' is the time index within each series (can be datetime or int)\n# 'value' is the measurement\ndf = pd.DataFrame({\n    'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],\n    'time': [1, 2, 3, 1, 2, 3, 1, 2, 3],\n    'value': [10, 12, 11, 5, 6, 7, 8, 8, 9]\n})\n\n# Define feature extraction settings (e.g., Minimal for speed)\nsettings = MinimalFCParameters()\n\n# Extract features\n# impute_function replaces NaN and infinite values in the extracted feature matrix\nfeatures = extract_features(df,\n                            column_id='id',\n                            column_sort='time',\n                            impute_function=impute,\n                            default_fc_parameters=settings,\n                            n_jobs=0)  # 0 disables multiprocessing; pass a positive integer to use that many worker processes\n\nprint(\"Extracted Features:\")\nprint(features.head())","lang":"python","description":"This quickstart demonstrates how to extract features from a simple pandas DataFrame using `tsfresh`. It creates three short sample time series, defines minimal feature calculation settings, and then extracts features in a single process (`n_jobs=0`). The `impute_function` replaces NaN and infinite values in the extracted feature matrix."},"warnings":[{"fix":"Upgrade your Python interpreter to version 3.9 or higher.","message":"tsfresh v0.21.0 dropped support for Python 3.7 and 3.8. v0.19.0 dropped Python 3.6. Ensure your Python environment is 3.9 or newer.","severity":"breaking","affected_versions":">=0.19.0"},{"fix":"Install the optional dependency: `pip install tsfresh[matrix_profile]` or `pip install matrixprofile`.","message":"The `matrixprofile` package became an optional dependency in v0.20.0. 
If you use features relying on matrix profile without installing the package, you will encounter a `ModuleNotFoundError`.","severity":"breaking","affected_versions":">=0.20.0"},{"fix":"Install Dask and Distributed: `pip install tsfresh[dask]` or `pip install dask distributed`. Also, be aware of potential multiprocessing issues in certain environments (e.g., Jupyter notebooks on Windows).","message":"Parallelization with `n_jobs > 1` requires Dask and Distributed. Without them, you'll receive a `RuntimeError` if parallelization is attempted. Note that `n_jobs=0` disables parallelization rather than using all cores.","severity":"gotcha","affected_versions":"all"},{"fix":"Upgrade to `tsfresh >= 0.21.0` and ensure `pywavelets` is installed (`pip install pywavelets` or `pip install tsfresh[pywavelets]` for newer versions).","message":"Compatibility issues with `scipy` versions 1.15 and higher were fixed in `tsfresh v0.21.0` by relying on the `pywavelets` package for CWT. Older `tsfresh` versions or environments without `pywavelets` might fail.","severity":"gotcha","affected_versions":"<0.21.0 (with scipy >= 1.15)"},{"fix":"Upgrade `tsfresh` to at least `0.20.1` if you are using recent versions of NumPy or Pandas.","message":"`tsfresh v0.20.1` added compatibility with NumPy 1.24 and Pandas 2.0. 
Using older `tsfresh` versions with newer NumPy/Pandas might lead to unexpected errors or warnings related to API changes.","severity":"gotcha","affected_versions":"<0.20.1 (with numpy >= 1.24 or pandas >= 2.0)"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the optional `matrixprofile` dependency: `pip install tsfresh[matrix_profile]` or `pip install matrixprofile`.","cause":"Attempting to extract features that depend on the `matrixprofile` library (e.g., `matrix_profile`) without having it installed.","error":"ModuleNotFoundError: No module named 'matrixprofile'"},{"fix":"Install the optional Dask/Distributed dependencies: `pip install tsfresh[dask]` or `pip install dask distributed`.","cause":"You are trying to use parallel feature extraction (`n_jobs > 1`) but the Dask and Distributed libraries are not installed.","error":"RuntimeError: Please install dask and distributed for parallel processing."},{"fix":"Ensure your DataFrame has a column named 'id' (or whatever you pass to `column_id`) and that it correctly identifies individual time series.","cause":"The DataFrame passed to `extract_features` does not contain a column with the name specified by `column_id`.","error":"ValueError: column_id not found in dataframe"},{"fix":"Ensure `impute_function=impute` is passed to `extract_features`. Also, consider preprocessing your data to handle NaNs explicitly before passing it to `tsfresh` if the issue persists.","cause":"This often occurs when feature calculators expect integer inputs but encounter NaN values in the time series data. While `tsfresh` tries to handle NaNs, some specific cases or older versions might not handle them.","error":"TypeError: Cannot convert float NaN to integer"}]}