Yellowbrick

1.5 · active · verified Mon Apr 13

Yellowbrick is a Python library that extends the scikit-learn API with a suite of visual analysis and diagnostic tools for machine learning. It lets users visualize model performance and feature relationships, and evaluate model selection, directly within their existing scikit-learn workflows. The current version is 1.5, released in August 2022; before that, the library had a release cadence of several major updates per year.

Warnings

Install

Imports

Quickstart

This example demonstrates how to use the `KElbowVisualizer` to determine the optimal number of clusters for a KMeans model using synthetic data. It fits the visualizer to the data and then displays the resulting elbow plot.

from yellowbrick.cluster import KElbowVisualizer
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate synthetic dataset
X, y = make_blobs(n_samples=1000, n_features=12, centers=8, random_state=42)

# Instantiate the clustering model and visualizer
model = KMeans(random_state=42, n_init=10)  # n_init set explicitly: its default changed from 10 to 'auto' in scikit-learn 1.4
visualizer = KElbowVisualizer(model, k=(2, 12))

visualizer.fit(X)        # Fit the data to the visualizer
visualizer.show()        # Finalize and render the figure
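
For intuition about what an elbow plot summarizes, the sketch below computes a per-k score by hand using only scikit-learn. It assumes the score of interest is comparable to KMeans's `inertia_` (the sum of squared distances from each sample to its nearest centroid). The `inertias` dictionary is an illustrative name, and this is a rough stand-in rather than Yellowbrick's own implementation.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic dataset as in the quickstart
X, _ = make_blobs(n_samples=1000, n_features=12, centers=8, random_state=42)

# Fit one KMeans model per candidate k and record its inertia
# (assumption: inertia approximates the score an elbow plot displays)
inertias = {}
for k in range(2, 12):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = km.inertia_

# Print the scores; the "elbow" is where the decrease flattens out
for k, score in inertias.items():
    print(f"k={k:2d}  inertia={score:.1f}")
```

Reading the printed values top to bottom, the score should drop steeply while k is below the true number of clusters and level off afterwards; that bend is what the elbow plot makes visible.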
