kmodes Clustering Library

0.12.2 · active · verified Mon Apr 13

Python implementations of the k-modes and k-prototypes clustering algorithms. k-modes clusters purely categorical data, and k-prototypes handles data that mixes categorical and numerical features. The library is currently at version 0.12.2 and is under active development, with several releases per year.

Warnings

Install

Imports

Quickstart

Demonstrates basic usage of the KModes algorithm for clustering purely categorical data. Initialize the KModes estimator, fit it to your data, and retrieve the cluster assignments and centroids.

import numpy as np
from kmodes.kmodes import KModes

# Generate random categorical data (e.g., 100 samples, 10 features, 20 unique categories per feature)
data = np.random.choice(20, (100, 10))

# Initialize KModes with 4 clusters, Huang initialization, 5 initialization runs
km = KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)

# Fit the model and predict clusters
clusters = km.fit_predict(data)

# Print the cluster centroids
print("Cluster Centroids:\n", km.cluster_centroids_)
print("Assigned Clusters:\n", clusters[:10]) # Display first 10 assigned clusters
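
To make the centroids and assignments printed above easier to interpret, here is a minimal pure-Python sketch of the two ideas behind k-modes: a simple matching dissimilarity (the number of features on which two records differ) and centroids built from per-feature modes. This is an illustration only, not the library's implementation. The helper names (matching_dissimilarity, column_modes, assign) and the toy data are hypothetical, and ties between equally frequent categories are broken arbitrarily.

```python
from collections import Counter

# Hypothetical helpers for illustration; these are not part of the kmodes API.

def matching_dissimilarity(a, b):
    """Count the features on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Return the most frequent category of each feature (ties broken arbitrarily)."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*rows)]

def assign(rows, centroids):
    """Label each row with the index of its least-dissimilar centroid."""
    return [
        min(range(len(centroids)),
            key=lambda k: matching_dissimilarity(row, centroids[k]))
        for row in rows
    ]

# Toy categorical data: (color, size, shape)
rows = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]

# Start from two records as initial centroids, then run one assign/update step.
centroids = [rows[0], rows[2]]
labels = assign(rows, centroids)
new_centroids = [
    column_modes([r for r, lab in zip(rows, labels) if lab == k])
    for k in range(len(centroids))
]

print("Labels:", labels)
print("Updated centroids:", new_centroids)
```

A full k-modes run repeats the assign and update steps until the labels stop changing, and the n_init parameter in the quickstart restarts that process from several initializations and keeps the best result.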
