Beta Calibration
BetaCal provides a Python implementation of Beta Calibration, a method for calibrating predicted probabilities from binary classifiers. It offers a well-founded, easily implemented improvement over traditional logistic (Platt) calibration, and is particularly effective when a classifier's scores cluster too heavily at the extremes, or when logistic calibration would distort a model that is already well calibrated. The current version is 1.1.0, released in April 2021; the package is functional and widely cited, but its slow release cadence suggests it is in a maintenance rather than active development phase.
Warnings
- gotcha The `betacal` library has not seen a new release since April 2021. While the existing version is stable and functional, users should be aware that active development and new features are not frequently added.
- gotcha Choosing the correct `parameters` for `BetaCalibration` ('abm', 'ab', 'am') is crucial. 'abm' is a three-parameter model, 'ab' is a two-parameter model, and 'am' is a one-parameter model. An incorrect choice without understanding the underlying Beta distribution models can lead to suboptimal calibration, especially if the data distribution doesn't align with the chosen model complexity.
- gotcha Beta Calibration is designed to address specific shortcomings of logistic calibration, which can distort already well-calibrated models and performs poorly when scores concentrate at the extremes. It is an alternative to logistic and isotonic calibration, not a drop-in replacement: check that these specific failure modes apply to your classifier before switching.
Install
pip install betacal
Imports
- BetaCalibration
from betacal import BetaCalibration
Quickstart
import numpy as np
from betacal import BetaCalibration
# Generate some dummy data
np.random.seed(42)
scores = np.random.rand(100) # Raw classifier scores (probabilities)
labels = np.random.randint(0, 2, 100) # True binary labels
# Initialize BetaCalibration with the default 'abm' model,
# which fits all three parameters (a, b, m)
bc = BetaCalibration(parameters='abm')
# Fit the calibrator to the scores and true labels
bc.fit(scores, labels)
# Predict calibrated probabilities
calibrated_scores = bc.predict(scores)
print(f"Original scores (first 5): {scores[:5].round(3)}")
print(f"Calibrated scores (first 5): {calibrated_scores[:5].round(3)}")