jenkspy
Jenkspy is a Python library providing a fast implementation of the Fisher-Jenks algorithm for computing 'natural breaks'. It performs 1-dimensional clustering on lists, tuples, arrays, or NumPy ndarrays of integers/floats to determine optimal class boundaries. Natural breaks are widely used in cartography and data analysis. The library is currently at version 0.4.1 and is actively maintained.
Common errors
- TypeError, KeyError, or other errors when passing a pandas Series directly.
cause The `jenkspy.jenks_breaks` function does not natively support pandas Series objects; it expects standard Python lists, tuples, or NumPy arrays.
fix Convert your pandas Series to a NumPy array before passing it to `jenkspy`: `jenkspy.jenks_breaks(my_series.to_numpy(), n_classes=...)`.
- error: 'x86_64-linux-gnu-gcc': No such file or directory during installation.
cause Jenkspy uses C extensions for performance. This error means that no C compiler (such as GCC) was found on your system; one is required to compile the extensions when pre-built wheels are not available for your platform or Python version.
fix Install a C compiler. On Debian/Ubuntu: `sudo apt-get update && sudo apt-get install build-essential`. On macOS, install Xcode Command Line Tools: `xcode-select --install`. For Windows, consider installing the Microsoft Visual C++ Build Tools or using `conda install -c conda-forge jenkspy`.
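For the pandas Series error above, the conversion can be centralized in a small helper. The following is a minimal sketch using only the standard library: `to_jenks_input` is a hypothetical name, not part of jenkspy, and it relies on duck typing (the `to_numpy()` method of pandas objects and the `tolist()` method of NumPy arrays) rather than importing either package. It assumes flat, 1-dimensional input.

```python
def to_jenks_input(values):
    """Return a plain list of floats from a list, tuple, array, or Series-like object.

    Hypothetical helper, not part of jenkspy.
    """
    # pandas Series (and Index) objects expose to_numpy(); use it when present.
    if hasattr(values, "to_numpy"):
        values = values.to_numpy()
    # NumPy arrays expose tolist(); plain sequences are handled by the loop below.
    if hasattr(values, "tolist"):
        values = values.tolist()
    # float() raises TypeError on nested (non 1-dimensional) input.
    return [float(v) for v in values]
```

The result can then be passed on, for example `jenkspy.jenks_breaks(to_jenks_input(my_series), n_classes=5)`.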
Warnings
- breaking The `nb_class` parameter for `jenks_breaks` was renamed to `n_classes` in version 0.3.0 to align with scikit-learn conventions. Using `nb_class` in newer versions will raise an error.
- breaking NumPy became a mandatory dependency starting from version 0.3.0. Installations without NumPy will fail or raise import errors.
- breaking Attempting to compute breaks on data containing non-finite values (NaN, Inf) or a non-one-dimensional NumPy array will now raise an error instead of a warning (since 0.2.3).
- gotcha If the requested `n_classes` is greater than the number of unique values in the input data, `jenkspy` will raise an exception (since 0.4.1).
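The last two warnings can be checked before calling the library, which gives clearer error messages in your own code. This is a sketch under stated assumptions: `validate_jenks_input` is a hypothetical helper, not a jenkspy function, it only handles flat sequences of numbers, and the exact checks and messages jenkspy uses internally may differ.

```python
import math


def validate_jenks_input(values, n_classes):
    """Return values as a list, or raise ValueError if they look unusable.

    Hypothetical pre-check, not part of jenkspy.
    """
    values = list(values)
    # Reject NaN and +/-Inf up front.
    if any(not math.isfinite(v) for v in values):
        raise ValueError("input contains NaN or infinite values")
    # The number of classes cannot exceed the number of distinct values.
    n_unique = len(set(values))
    if n_classes > n_unique:
        raise ValueError(
            f"n_classes={n_classes} exceeds the {n_unique} unique values in the input"
        )
    return values
```

A call such as `jenkspy.jenks_breaks(validate_jenks_input(data, 5), n_classes=5)` then fails early with a message that names the problem.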
Install
- pip install jenkspy
Imports
- jenks_breaks
from jenkspy import jenks_breaks  # Works, but is less common in examples and can lead to naming conflicts.
import jenkspy
breaks = jenkspy.jenks_breaks(data, n_classes=5)
- JenksNaturalBreaks
from jenkspy import JenksNaturalBreaks
classifier = JenksNaturalBreaks(n_classes=5)
Quickstart
import jenkspy
import random
# Generate some sample data
data = [random.uniform(0, 100) for _ in range(100)]
# Compute natural breaks with 5 classes using the function API
breaks_func = jenkspy.jenks_breaks(data, n_classes=5)
print(f"Jenks breaks (function API): {breaks_func}")
# Alternatively, use the scikit-learn inspired class API
from jenkspy import JenksNaturalBreaks
classifier = JenksNaturalBreaks(n_classes=5)
classifier.fit(data)
# Get the breaks, the per-value class labels, and the values grouped by class
breaks_class = classifier.breaks_
labels = classifier.labels_  # Class index assigned to each input value
groups = classifier.groups_  # One array of values per class
print(f"Jenks breaks (class API): {breaks_class}")
print(f"First 10 class labels: {labels[:10]}")
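Once a list of breaks is available, a new value can be assigned to a class with the standard library alone. The sketch below assumes the breaks list starts with the data minimum and ends with the data maximum (so `n_classes` classes give `n_classes + 1` values) and that each class includes its upper bound. `classify` is a hypothetical helper and the break values are made up for illustration, not real jenkspy output.

```python
from bisect import bisect_left


def classify(value, breaks):
    """Return the 0-based class index of value, given a full list of breaks."""
    inner = breaks[1:-1]  # drop the minimum and maximum bounds
    # bisect_left puts a value equal to a break into the lower class.
    return bisect_left(inner, value)


# Illustrative break values only, not real jenkspy output.
example_breaks = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
labels = [classify(v, example_breaks) for v in (5.0, 20.0, 20.5, 99.0)]
print(labels)  # [0, 0, 1, 4]
```

Values below the minimum or above the maximum fall into the first or last class respectively, which may or may not be what you want for out-of-range data.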