Yellowbrick
Yellowbrick is a Python library that extends the scikit-learn API with a suite of visual analysis and diagnostic tools for machine learning. It lets users visualize model performance and feature relationships and evaluate model selection directly within their existing scikit-learn workflows. The current version is 1.5, released in August 2022; before that, releases historically arrived several times a year.
Warnings
- breaking The `poof()` method for rendering visualizers was deprecated in v1.0 in favor of `show()` and was subsequently removed. Call `show()` instead.
- breaking Yellowbrick dropped support for Python 2.x with the release of v1.0.
- gotcha Yellowbrick releases have often been driven by compatibility with rapidly evolving upstream libraries such as scikit-learn, NumPy, SciPy, and Matplotlib, so specific versions of these dependencies may be required.
- breaking The internal `set_params` and `get_params` API for `ModelVisualizers` changed to align with `scikit-learn` v1.0+.
- gotcha The `nltk` library and its associated data are required for text-based visualizers (e.g., `FreqDistVisualizer`, `WordCorrelationPlot`) but are not installed by default with `pip install yellowbrick`.
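For code that has to run against both older and newer Yellowbrick releases, the `poof()` to `show()` rename noted above can be absorbed by a small wrapper. The following is only a sketch: the helper name `render` is hypothetical and is not part of Yellowbrick.

```python
import warnings


def render(visualizer, **kwargs):
    """Call show() when the visualizer has it, else fall back to legacy poof().

    `render` is a hypothetical helper, not a Yellowbrick function.
    Keyword arguments (for example outpath) are passed through unchanged.
    """
    method = getattr(visualizer, "show", None)
    if method is None:
        # Only very old releases lack show(); warn so the fallback is visible.
        warnings.warn(
            "show() not found; falling back to legacy poof()",
            DeprecationWarning,
            stacklevel=2,
        )
        method = visualizer.poof
    return method(**kwargs)
```

With a fitted visualizer this would be called as `render(visualizer)` or `render(visualizer, outpath="figure.png")`. Projects that only target current releases can simply call `visualizer.show()` directly.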
Install
- pip install yellowbrick
- pip install yellowbrick[text]
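Because specific dependency versions can matter (see Warnings), it may help to record what is actually installed before debugging a visualizer. The sketch below uses only the standard library; the function name `installed_versions` and the package list are assumptions chosen for illustration.

```python
from importlib import metadata

# Distribution names as published on PyPI (an illustrative selection).
PACKAGES = ("yellowbrick", "scikit-learn", "numpy", "scipy", "matplotlib")


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Including this output in a bug report makes version mismatches between Yellowbrick and its dependencies easier to spot.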
Imports
- KElbowVisualizer
from yellowbrick.cluster import KElbowVisualizer
- ROCAUC
from yellowbrick.classifier import ROCAUC
- ResidualsPlot
from yellowbrick.regressor import ResidualsPlot
- FeatureImportances
from yellowbrick.model_selection import FeatureImportances
- show
visualizer.show()
Quickstart
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from yellowbrick.cluster import KElbowVisualizer

# Generate a synthetic dataset (the labels are not needed for clustering)
X, _ = make_blobs(n_samples=1000, n_features=12, centers=8, random_state=42)

# Instantiate the clustering model and visualizer.
# n_init is set explicitly because its default changes from 10 to "auto" in scikit-learn 1.4.
model = KMeans(random_state=42, n_init=10)
visualizer = KElbowVisualizer(model, k=(2, 12))  # a 2-tuple is treated like range(2, 12)

visualizer.fit(X)   # Fit the data to the visualizer
visualizer.show()   # Finalize and render the figure