{"id":8988,"library":"featuretools","title":"Featuretools","description":"Featuretools is an open-source Python library for automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices suitable for machine learning. The library, currently at version 1.31.0, is actively maintained by Alteryx and follows a frequent release cadence, often introducing new features and improvements.","status":"active","version":"1.31.0","language":"en","source_language":"en","source_url":"https://github.com/alteryx/featuretools","tags":["feature engineering","machine learning","data science","etl","automated ml"],"install":[{"cmd":"pip install featuretools","lang":"bash","label":"PyPI"},{"cmd":"conda install -c conda-forge featuretools","lang":"bash","label":"Conda-forge"}],"dependencies":[{"reason":"Required for parallel computation when `n_jobs` > 1 in `calculate_feature_matrix`.","package":"dask","optional":true},{"reason":"Required for plotting EntitySets or feature lineage graphs (`EntitySet.plot` or `featuretools.graph_feature`).","package":"graphviz","optional":true}],"imports":[{"symbol":"featuretools","correct":"import featuretools as ft"},{"note":"The EntitySet class is a top-level import since version 1.0.0.","wrong":"from featuretools.entityset import EntitySet","symbol":"EntitySet","correct":"from featuretools import EntitySet"}],"quickstart":{"code":"import featuretools as ft\nimport pandas as pd\n\n# Load mock customer data into an EntitySet\nes = ft.demo.load_mock_customer(return_entityset=True)\n\n# Define target dataframe for feature engineering\ntarget_dataframe_name = \"customers\"\n\n# Run Deep Feature Synthesis (DFS)\nfeature_matrix, feature_defs = ft.dfs(\n    entityset=es,\n    target_dataframe_name=target_dataframe_name,\n    agg_primitives=[\"count\", \"sum\", \"mean\"],\n    trans_primitives=[\"day\", \"month\", \"weekday\"]\n)\n\nprint(feature_matrix.head())","lang":"python","description":"This quickstart 
demonstrates how to load a multi-table dataset into an EntitySet, define a target dataframe, and then use Deep Feature Synthesis (DFS) to automatically generate a rich set of features. It uses built-in aggregation and transform primitives to create meaningful new features for a machine learning task."},"warnings":[{"fix":"Convert Dask or PySpark DataFrames to pandas DataFrames before creating an EntitySet. For Dask, use `.compute()` to get a pandas DataFrame.","message":"As of Featuretools v1.31.0, EntitySets can no longer be created directly from Dask or PySpark DataFrames. This functionality has been removed. Users must convert their Dask/PySpark DataFrames to pandas DataFrames first.","severity":"breaking","affected_versions":">=1.31.0"},{"fix":"Perform all former CLI operations programmatically within Python scripts.","message":"The `featuretools` command-line interface (CLI) has been completely removed in version 1.31.0.","severity":"breaking","affected_versions":">=1.31.0"},{"fix":"Refer to the 'Transitioning to Featuretools Version 1.0' guide for detailed migration steps. Key changes include using Woodwork DataFrames and accessing column metadata via the `.ww` accessor on DataFrames within an EntitySet.","message":"Featuretools v1.0.0 introduced significant breaking changes by replacing its legacy custom typing system with Woodwork. The `Entity` and `Variable` classes were removed, and `EntitySet` creation and primitive definitions changed. Columns now use Woodwork `LogicalType` and `semantic_tags` for type information.","severity":"breaking","affected_versions":">=1.31.0"},{"fix":"Install Dask with `pip install \"featuretools[dask]\"` or `pip install \"dask[dataframe]\"` (quoted so the brackets are not globbed by shells such as zsh) before running parallel computations.","message":"Dask is now an optional dependency. 
If you use `calculate_feature_matrix` with `n_jobs` set to anything other than 1 (to enable parallel processing), you must explicitly install Dask.","severity":"gotcha","affected_versions":">=1.31.0"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Access Woodwork typing information through the `.ww` accessor on the DataFrame stored in the EntitySet, not on a plain pandas Series. Example: `entityset['dataframe_name'].ww.logical_types['column_name']`.","cause":"Attempting to access Woodwork attributes (e.g., `logical_type` or `semantic_tags`) on a pandas Series directly, or using old syntax from pre-1.0.0 versions after migrating to Featuretools 1.0+.","error":"AttributeError: 'Series' object has no attribute 'ww'"},{"fix":"Ensure the time column in your `cutoff_time` DataFrame is explicitly named either 'time' or matches the time index variable name of your target DataFrame in the EntitySet. This was a breaking change in v0.16.0.","cause":"In `featuretools.dfs` or `featuretools.calculate_feature_matrix`, the 'time' column in a `cutoff_time` DataFrame is not correctly named or identified.","error":"AttributeError: Can't get 'time' column from cutoff_time. The column must be labeled either as the target entity's time index variable name or as 'time'."},{"fix":"This was a bug in certain minor versions: the function was moved within Featuretools' internal structure. Users experiencing this on older `1.x` versions should update to the latest patch release (e.g., `1.31.0`) to resolve it.","cause":"The utility function `flatten_list` was moved within the Featuretools internal structure.","error":"ImportError: cannot import name 'flatten_list' from 'featuretools.utils'"},{"fix":"Ensure that the column designated as the index for a DataFrame within the EntitySet contains only unique values. 
Duplicate index values can lead to incorrect or unexpected feature calculations. Review and pre-process your data to ensure index uniqueness before adding the DataFrame.","cause":"When creating an EntitySet or adding a DataFrame, the specified index column contains duplicate values.","error":"UserWarning: Index is not unique on dataframe"}]}