{"id":9862,"library":"k-means-constrained","title":"K-Means Constrained","description":"K-Means Constrained is a Python library that implements K-Means clustering with user-defined minimum and maximum cluster sizes. It is based on the constrained k-means algorithm of Bradley, Bennett, and Demiriz (2000). The current version is 0.9.0; the project is actively maintained and typically releases updates a few times a year.","status":"active","version":"0.9.0","language":"en","source_language":"en","source_url":"https://github.com/joshlk/k-means-constrained","tags":["machine-learning","clustering","k-means","constrained-clustering","unsupervised-learning"],"install":[{"cmd":"pip install k-means-constrained","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Numerical operations and array manipulation.","package":"numpy"},{"reason":"Core machine learning utilities and base K-Means functionality.","package":"scikit-learn"},{"reason":"Scientific computing routines, potentially for optimization.","package":"scipy"}],"imports":[{"symbol":"KMeansConstrained","correct":"from k_means_constrained import KMeansConstrained"}],"quickstart":{"code":"import numpy as np\nfrom k_means_constrained import KMeansConstrained\n\n# Sample data: 11 two-dimensional points\nX = np.array([\n    [1, 2], [1.1, 2.1], [0.9, 1.9],\n    [10, 11], [10.1, 11.1], [9.9, 10.9],\n    [5, 5], [5.1, 5.1], [4.9, 4.9],\n    [20, 21], [20.1, 21.1]\n])\n\n# Initialize and fit the constrained K-Means model\n# n_clusters=3, size_min=2, size_max=4 is feasible here: 3*2 <= 11 <= 3*4\nclf = KMeansConstrained(\n    n_clusters=3,\n    size_min=2,\n    size_max=4,\n    random_state=0\n)\nclf.fit(X)\n\n# Print cluster assignments and cluster centers\nprint(\"Labels:\", clf.labels_)\nprint(\"Cluster Centers:\\n\", clf.cluster_centers_)\n","lang":"python","description":"Demonstrates how to import `KMeansConstrained`, initialize it with cluster and size constraints, fit it to data, and access the resulting cluster labels and 
centers."},"warnings":[{"fix":"For very large datasets, consider pre-processing steps such as dimensionality reduction or subsampling. Choose `n_clusters`, `size_min`, and `size_max` so that the constraints you need remain computationally feasible.","message":"Runtime grows sharply with large datasets, many clusters, or very tight cluster size constraints. The constrained K-Means problem is NP-hard.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always set the `random_state` parameter in the `KMeansConstrained` constructor (e.g., `random_state=42`) to get reproducible output across runs.","message":"Results are not reproducible unless `random_state` is set. The initialization of cluster centers and subsequent iterative steps can involve randomness.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Before fitting, check that the number of samples (`n_samples`) satisfies `n_clusters * size_min <= n_samples <= n_clusters * size_max`. For example, 11 samples with `n_clusters=3` require `size_min <= 3` and `size_max >= 4`. Adjust the parameters if an error indicates an impossible configuration.","message":"Incompatible cluster constraints (`n_clusters`, `size_min`, `size_max`) can raise a `ValueError` or leave the problem unsolvable, because no partition of the data satisfies them.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Install the package using `pip install k-means-constrained`. Ensure your import statement is `from k_means_constrained import KMeansConstrained`; the install name uses hyphens, while the module name uses underscores.","cause":"The `k-means-constrained` package is not installed in your Python environment, or the import does not use the module name `k_means_constrained`.","error":"ModuleNotFoundError: No module named 'k_means_constrained'"},{"fix":"Verify that `n_clusters * size_min <= n_samples <= n_clusters * size_max`. 
Adjust `n_clusters`, `size_min`, or `size_max`, or change the number of data points, so that both inequalities hold.","cause":"The total number of samples (`n_samples`) provided to `fit` is incompatible with the specified `n_clusters`, `size_min`, and `size_max` parameters.","error":"ValueError: Not enough points to satisfy cluster constraints."},{"fix":"Ensure that both `size_min` and `size_max` are integer values when initializing `KMeansConstrained`, for example by wrapping a computed value in `int(...)`.","cause":"The `size_min` or `size_max` parameter was provided as a float or other non-integer type.","error":"TypeError: size_min must be an integer"}]}