forestci: confidence intervals for scikit-learn forest algorithms
0.7 verified Fri May 01 auth: no python
forestci provides confidence intervals for random forest predictions using the infinitesimal jackknife method. It supports scikit-learn's RandomForestClassifier, RandomForestRegressor, ExtraTreesClassifier, and ExtraTreesRegressor. Version 0.7 includes bug fixes and improved documentation, with no breaking changes from 0.6. The library is stable but released infrequently.
pip install forestci
Common errors
error AttributeError: 'RandomForestRegressor' object has no attribute 'estimators_samples_' ↓
cause scikit-learn versions below 0.24 do not expose the estimators_samples_ attribute that forestci uses for the in-bag calculation.
fix
Upgrade scikit-learn to >= 0.24, or compute the in-bag matrix yourself and pass it via the inbag parameter.
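To check whether the installed scikit-learn meets the 0.24 threshold named in this entry, you can compare version tuples. This is a minimal sketch: version_tuple and meets_minimum are hypothetical helper names (not part of forestci or scikit-learn), and the parser only handles plain numeric versions such as 1.4.2.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '1.4.2' into (1, 4, 2); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(package, minimum):
    """Return True/False, or None if the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return version_tuple(installed) >= version_tuple(minimum)


print("scikit-learn >= 0.24:", meets_minimum("scikit-learn", "0.24"))
```

A result of None means the package is not installed in the interpreter running the check.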
error ModuleNotFoundError: No module named 'forestci' ↓
cause forestci is not installed, or installed in a different environment.
fix
pip install forestci
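When the package seems installed but the import still fails, it helps to confirm which interpreter is running and whether it can locate the module. The sketch below uses only the standard library; is_importable is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


print("interpreter:", sys.executable)
print("forestci importable:", is_importable("forestci"))
# If this prints False, install into this exact interpreter with:
#   <path printed above> -m pip install forestci
```

Running pip through the interpreter path (python -m pip) avoids installing into a different environment than the one that executes your code.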
error TypeError: random_forest_error() missing 1 required positional argument: 'X_test' ↓
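cause random_forest_error takes the forest, the training data, and the test data as positional arguments, in that order. In version 0.7 the call is random_forest_error(forest, X_train_shape, X_test), where X_train_shape is X_train.shape. Omitting the test data raises this error.
fix
Pass both the training-data argument and X_test (the data for which to compute errors), for example random_forest_error(rf, X_train.shape, X_test).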
Warnings
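gotcha random_forest_error needs information about the training set to build the in-bag matrix: its first data argument describes the training data (X_train.shape in version 0.7) and the second is the data to predict on. Using the same data in both roles is only valid when that data is the training set. ↓
fix Pass the training data's shape as the training argument. The inbag parameter expects a precomputed in-bag count matrix, not X_train.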
gotcha calc_inference (formerly calc_corr) has been deprecated. Use random_forest_error for variance estimation and then compute confidence intervals manually. ↓
fix Replace calc_inference with random_forest_error(rf, X_train.shape, X_test), then compute the confidence interval manually.
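The manual confidence interval step can be sketched as follows, using a normal approximation around each prediction. Here variance stands for the array returned by random_forest_error; normal_interval is a hypothetical helper name, and clipping negative variance estimates to zero is an assumption made so the square root stays defined.

```python
from statistics import NormalDist

import numpy as np


def normal_interval(pred, variance, level=0.95):
    """Return (lower, upper) bounds of a normal-approximation interval."""
    pred = np.asarray(pred, dtype=float)
    # Clip negative variance estimates to zero so sqrt stays defined.
    std = np.sqrt(np.clip(np.asarray(variance, dtype=float), 0.0, None))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    return pred - z * std, pred + z * std


# Made-up predictions and variances, for illustration only.
lower, upper = normal_interval([10.0, 12.5], [4.0, 0.25])
print(lower, upper)
```

Changing level (for example to 0.9) adjusts the multiplier without hard-coding 1.96.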
gotcha In version 0.7, the random_forest_error function may raise AttributeError if the forest estimator doesn't have estimators_samples_ attribute (e.g., older sklearn versions). ↓
fix Upgrade scikit-learn to 0.24+ or install a forestci release that is compatible with your scikit-learn version. Also confirm that the estimator is one of the supported forest types.
Imports
- calc_inference
  from forestci import calc_inference
- random_forest_error
  from forestci import random_forest_error
- forestci
  wrong: from forestci import forestci
  correct: import forestci
Quickstart
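import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from forestci import random_forest_error

# Small synthetic regression problem
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=42)
rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X, y)
pred = rf.predict(X)

# Variance estimates: pass the training data's shape, then the data to predict on
# (here the training set itself is reused as the prediction set)
error_var = random_forest_error(rf, X.shape, X)

# Half-width of an approximate 95% confidence interval
ci = 1.96 * np.sqrt(error_var)
print(pred[:5] - ci[:5])  # lower bounds
print(pred[:5] + ci[:5])  # upper bounds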
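For readers who want to see what the infinitesimal jackknife estimate involves, the following NumPy sketch follows the published estimator: the squared covariance, taken across trees, between each training row's in-bag count and the per-tree prediction, minus a Monte Carlo bias term. It is an illustration under stated assumptions, not forestci's source code: ij_variance is a hypothetical name, the inputs are random toy data, and no calibration step is included.

```python
import numpy as np


def ij_variance(inbag, tree_preds):
    """Infinitesimal jackknife variance with Monte Carlo bias correction.

    inbag:      (n_train, n_trees) counts of how often each training row
                appears in each tree's bootstrap sample.
    tree_preds: (n_test, n_trees) per-tree predictions for the test rows.
    """
    inbag = np.asarray(inbag, dtype=float)
    tree_preds = np.asarray(tree_preds, dtype=float)
    n_train, n_trees = inbag.shape

    # Center both quantities across trees.
    centered_preds = tree_preds - tree_preds.mean(axis=1, keepdims=True)
    centered_inbag = inbag - inbag.mean(axis=1, keepdims=True)

    # Covariance over trees between in-bag counts and predictions:
    # one value per (training row, test row) pair.
    cov = centered_inbag @ centered_preds.T / n_trees
    v_ij = (cov ** 2).sum(axis=0)

    # Subtract the Monte Carlo bias from using a finite number of trees.
    bias = n_train * (centered_preds ** 2).sum(axis=1) / n_trees ** 2
    return v_ij - bias


# Toy inputs: random bootstrap counts and random per-tree predictions.
rng = np.random.default_rng(0)
n_train, n_trees, n_test = 20, 50, 3
# Each column is one bootstrap sample of size n_train.
inbag = rng.multinomial(n_train, np.full(n_train, 1 / n_train), size=n_trees).T
tree_preds = rng.normal(size=(n_test, n_trees))
print(ij_variance(inbag, tree_preds).shape)  # one estimate per test row
```

Because the bias term is subtracted, individual estimates can come out negative when the number of trees is small relative to the training set size.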