Sweetviz

2.3.3 verified Mon Apr 27 auth: no python

Sweetviz is a pandas-based library that visualizes and compares datasets and generates an automated exploratory data analysis (EDA) report in HTML. The current version is 2.3.3, with releases roughly every quarter.

pip install sweetviz
error ModuleNotFoundError: No module named 'sweetviz'
cause Sweetviz is not installed in the current Python environment.
fix
Run pip install sweetviz.
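This error often means pip installed the package into a different environment than the one running the script. The sketch below prints the active interpreter and checks whether it can locate the module. The helper name is_installed is illustrative and not part of sweetviz.

```python
import importlib.util
import sys


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter path shows which environment pip has to target.
print("Interpreter:", sys.executable)
print("sweetviz importable:", is_installed("sweetviz"))
```

If the second line reports False, running python -m pip install sweetviz with that same interpreter installs the package into the matching environment.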
error ImportError: cannot import name 'is_categorical_dtype' from 'pandas.api.types'
cause Sweetviz versions older than 2.3.0 import is_categorical_dtype in a way that is incompatible with newer pandas versions.
fix
Upgrade sweetviz to >=2.3.0.
error AttributeError: module 'numpy' has no attribute 'bool'
cause Sweetviz versions <2.3.2 used deprecated numpy aliases such as np.bool, which were removed in numpy 1.24.
fix
Upgrade sweetviz to >=2.3.2.
error ValueError: The truth value of a DataFrame is ambiguous
cause Passing a pandas DataFrame where a Series is expected, e.g., using analyze() with a multi-column target feature.
fix
Ensure target_feat is a single column name (string), not a list or DataFrame.
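A small guard can catch this before the report is built. The sketch below is an assumption-labelled helper (check_target_feat is a hypothetical name, not a sweetviz function). It rejects lists and other non-string targets, and also rejects duplicated column names, since selecting a duplicated name in pandas returns a DataFrame rather than a Series.

```python
def check_target_feat(columns, target_feat):
    """Raise early unless target_feat names exactly one existing column.

    `columns` is any iterable of column names, e.g. df.columns.
    """
    if not isinstance(target_feat, str):
        raise TypeError(
            f"target_feat must be a column name (str), got {type(target_feat).__name__}"
        )
    names = list(columns)
    occurrences = names.count(target_feat)
    if occurrences == 0:
        raise KeyError(f"column {target_feat!r} not found")
    if occurrences > 1:
        # df[target_feat] would be a DataFrame here, not a Series.
        raise ValueError(f"column {target_feat!r} appears {occurrences} times")
    return target_feat


# Example call with a plain list standing in for df.columns.
print(check_target_feat(["sepal_length", "species"], "species"))
```

With a real DataFrame, the call would be check_target_feat(df.columns, 'species') before passing the name to analyze().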
breaking Version 2.2.1 fixed incompatibilities with pandas 2.0+ and numpy 1.24+. Versions older than 2.2.1 may break when used with those newer releases.
fix Upgrade to sweetviz >=2.2.1.
deprecated Sweetviz dropped its 'distutils' import in v2.3.2. Earlier versions may raise ImportError on Python 3.12+, where distutils has been removed from the standard library.
fix Update sweetviz to >=2.3.2.
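Since several of the issues above depend on which versions are installed, it can help to print them and compare against a minimum. This is a minimal sketch using only the standard library. The helpers version_tuple and meets_minimum are illustrative names, and the parser only handles plain dotted numeric versions.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Convert a dotted version string into a tuple of ints.

    Parsing stops at the first non-numeric component, so a
    pre-release suffix is dropped rather than guessed at.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


# Report what is installed in the current environment.
for package in ("sweetviz", "pandas", "numpy"):
    try:
        print(package, metadata.version(package))
    except metadata.PackageNotFoundError:
        print(package, "not installed")
```

For example, meets_minimum(metadata.version("sweetviz"), "2.3.2") indicates whether the installed release is at or above the version named in the fixes above.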
gotcha When using compare() or compare_intra(), the target feature must exist in both datasets being compared. Missing columns can silently produce incorrect reports.
fix Ensure all required columns are present in both datasets before comparison.
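One way to make that check explicit is to compute, for each dataset, which required columns are absent. The sketch below works on any iterables of column names. The function name missing_columns and the column names in the demo are hypothetical.

```python
def missing_columns(required, *column_sets):
    """For each dataset's columns, list the required columns it lacks."""
    return [sorted(set(required) - set(cols)) for cols in column_sets]


# Plain lists stand in for train.columns and test.columns.
train_cols = ["age", "fare", "survived"]
test_cols = ["age", "fare"]
gaps = missing_columns(["survived"], train_cols, test_cols)
print(gaps)

# Only proceed when every dataset has all required columns, e.g.:
# if not any(gaps):
#     report = sv.compare([train, "Train"], [test, "Test"], target_feat="survived")
```

A non-empty inner list identifies the dataset that needs fixing before the comparison is run.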
gotcha Large datasets (>100k rows) may cause high memory usage; the library loads the entire dataframe into memory and creates a large HTML report.
fix Sample your data or use the 'verbosity' parameter (v2.3.0+) to limit detail. For massive datasets, consider alternatives like ydata-profiling.
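Sampling can be done with pandas before the data reaches Sweetviz. The following is a minimal sketch. The helper name downsample, the 100,000-row cap, and the fixed seed are illustrative choices rather than Sweetviz defaults.

```python
import pandas as pd


def downsample(df: pd.DataFrame, max_rows: int = 100_000, seed: int = 0) -> pd.DataFrame:
    """Return df unchanged if it is small enough, else a reproducible random sample."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)


# Tiny demonstration with a synthetic frame.
demo = pd.DataFrame({"x": range(1_000)})
small = downsample(demo, max_rows=100)
print(len(small))
```

The sampled frame can then be passed to analyze() in place of the full dataset. Passing pairwise_analysis='off' to analyze() may also help, since it skips the pairwise association step.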

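Generate an EDA report for the Iris dataset, targeting a boolean column derived from 'species' (Sweetviz accepts only boolean or numerical target features).

import pandas as pd
import sweetviz as sv

# Load the Iris dataset into a DataFrame.
df = pd.read_csv('https://raw.githubusercontent.com/fbdesignpro/sweetviz/master/datasets/iris.csv')
# 'species' is categorical, so derive a boolean target; the label 'setosa' is assumed here.
df['is_setosa'] = df['species'] == 'setosa'
# Analyze the DataFrame and write the report to an HTML file.
report = sv.analyze(df, target_feat='is_setosa')
report.show_html('sweetviz_report.html')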