Optimal 1D k-means clustering

0.5.0 · active · verified Wed Apr 15

kmeans1d is a Python package for optimal k-means clustering of one-dimensional data. It uses an O(kn + n log n) dynamic programming algorithm, based on Wu (1991) and Grønlund et al. (2017), that finds a globally optimal partition into k clusters rather than the local optimum that iterative k-means heuristics can return. The core logic is written in C++ for performance and wrapped for Python. The library is actively maintained; the current version is 0.5.0.
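
To make the dynamic programming idea concrete, here is a minimal pure-Python sketch. It is not the library's implementation: it uses the plain O(kn²) recurrence rather than the faster O(kn + n log n) method, and the function name `optimal_kmeans_1d` and its variables are hypothetical. It relies on the fact that in one dimension an optimal clustering consists of contiguous runs of the sorted values.

```python
def optimal_kmeans_1d(values, k):
    """Exact 1D k-means via a simple O(k * n^2) dynamic program (illustrative only)."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Sort once, remembering where each value came from.
    order = sorted(range(n), key=lambda i: values[i])
    xs = [values[i] for i in order]

    # Prefix sums of x and x^2 give the cost of any sorted run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(i, j):
        """Sum of squared deviations from the mean for xs[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting xs[:j] into m clusters.
    # start[m][j]: index where the last of those m clusters begins.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    start = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = best[m - 1][i] + cost(i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    start[m][j] = i

    # Walk the stored boundaries backwards to recover labels and centroids.
    labels = [0] * n
    centroids = [0.0] * k
    j = n
    for m in range(k, 0, -1):
        i = start[m][j]
        centroids[m - 1] = (s1[j] - s1[i]) / (j - i)
        for pos in range(i, j):
            labels[order[pos]] = m - 1
        j = i
    return labels, centroids


sample = [4.0, 4.1, 4.2, -50.0, 200.2, 200.4, 200.9, 80.0, 100.0, 102.0]
labels, centroids = optimal_kmeans_1d(sample, 4)
print(labels)
print(centroids)
```

The sketch numbers clusters in increasing order of centroid, and the labels are reported in the original input order. The triple loop is what the faster algorithm avoids, so this version is only suitable for small inputs.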

Quickstart

This example clusters a sample list `x` into `k=4` groups. The call to `kmeans1d.cluster` returns the cluster assignment of each data point and the centroid of each cluster.

import kmeans1d

x = [4.0, 4.1, 4.2, -50.0, 200.2, 200.4, 200.9, 80.0, 100.0, 102.0]
k = 4

clusters, centroids = kmeans1d.cluster(x, k)

print(f"Clusters: {clusters}")
print(f"Centroids: {centroids}")
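
Once the assignments are available, a common next step is to collect the points belonging to each cluster. The helpers below are a small sketch, not part of kmeans1d: `group_by_cluster` and `within_cluster_sse` are hypothetical names, and the code assumes that the assignments are integer indexes aligned with the input and that each index selects the matching entry of the centroid list.

```python
from collections import defaultdict


def group_by_cluster(values, labels):
    """Map each cluster index to the list of values assigned to it."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return dict(groups)


def within_cluster_sse(values, labels, centroids):
    """Total squared distance of each value to its assigned centroid."""
    return sum((v - centroids[label]) ** 2 for v, label in zip(values, labels))


# Tiny hand-written input, used only to show the shape of the arguments.
demo_values = [1.0, 2.0, 10.0]
demo_labels = [0, 0, 1]
demo_centroids = [1.5, 10.0]
print(group_by_cluster(demo_values, demo_labels))
print(within_cluster_sse(demo_values, demo_labels, demo_centroids))
```

With the quickstart variables, the same helpers would be called as `group_by_cluster(x, clusters)` and `within_cluster_sse(x, clusters, centroids)`.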
