KMeans-PyTorch

0.3 · active · verified Fri Apr 17

KMeans-PyTorch is a K-means clustering implementation built on PyTorch, so clustering can run on a GPU as well as on the CPU. The current version is 0.3. Releases are infrequent and tend to follow new features or argument clarifications rather than a fixed schedule.

Quickstart

This example generates sample data, runs `kmeans`, and retrieves the cluster assignments and the final cluster centers. Set `device` to 'cpu' or 'cuda:0' to match your hardware.

import torch
from kmeans_pytorch import kmeans

# 0. Generate some random data
num_samples = 1000
num_features = 2
X = torch.randn(num_samples, num_features, device='cpu', dtype=torch.float)

# Add some clusters
X[:300] += 5
X[300:600] -= 5
X[600:] += torch.tensor([0, 10], dtype=torch.float)

num_clusters = 3
tolerance = 1e-4
distance_metric = 'euclidean'
device = 'cpu' # Change to 'cuda:0' if a GPU is available

# 1. Run K-means
# No iteration cap is passed here: `max_iter` does not appear to be a keyword
# accepted by kmeans() in release 0.3, so the call relies on `tol` to stop.
cluster_ids_x, cluster_centers = kmeans(
    X=X,
    num_clusters=num_clusters,
    distance=distance_metric,
    tol=tolerance,
    device=device
)

print(f"Cluster IDs shape: {cluster_ids_x.shape}")
print(f"Cluster Centers shape: {cluster_centers.shape}")
print(f"First 5 cluster IDs: {cluster_ids_x[:5]}")
print(f"Cluster centers:\n{cluster_centers}")
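
To make the roles of `tol`, `device`, and the two return values more concrete, the following is a minimal sketch of Lloyd's algorithm in plain PyTorch. It is not the library's implementation: `lloyd_kmeans`, `max_steps`, and `seed` are hypothetical names introduced for illustration, and the sketch assumes Euclidean distance, random initialization from the data points, and a stopping rule based on the summed center shift. The library's internals may differ on any of these.

```python
import torch


def lloyd_kmeans(X, num_clusters, tol=1e-4, max_steps=100, seed=0):
    """Illustrative Lloyd's algorithm (hypothetical helper, not kmeans_pytorch code)."""
    gen = torch.Generator().manual_seed(seed)
    # Initialize centers from distinct, randomly chosen data points.
    idx = torch.randperm(X.shape[0], generator=gen)[:num_clusters]
    centers = X[idx.to(X.device)].clone()

    for _ in range(max_steps):
        # Assignment step: each point goes to its nearest center.
        ids = torch.cdist(X, centers).argmin(dim=1)

        # Update step: each center moves to the mean of its assigned points.
        new_centers = centers.clone()
        for k in range(num_clusters):
            members = X[ids == k]
            if members.shape[0] > 0:  # keep the old center if a cluster is empty
                new_centers[k] = members.mean(dim=0)

        # Stop once the centers have (almost) stopped moving.
        shift = (new_centers - centers).norm(dim=1).sum()
        centers = new_centers
        if shift < tol:
            break

    # Recompute assignments against the final centers.
    ids = torch.cdist(X, centers).argmin(dim=1)
    return ids, centers


# Pick the GPU when one is available, and create the data on that device
# so every tensor involved lives in the same place.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
torch.manual_seed(0)
X = torch.randn(300, 2, device=device)
X[:100] += 5
X[100:200] -= 5
X[200:] += torch.tensor([0.0, 10.0], device=device)

ids, centers = lloyd_kmeans(X, num_clusters=3)
print(f"Cluster IDs shape: {tuple(ids.shape)}")
print(f"Cluster centers shape: {tuple(centers.shape)}")
```

As in the quickstart, the first return value holds one cluster index per sample and the second holds one row per cluster center. Because initialization is random, which index ends up labeling which group can change between seeds.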
