miceforest
6.0.5, verified Sat May 09, auth: no, python
Fast multiple imputation using Random Forests and LightGBM, implementing Multiple Imputation by Chained Equations (MICE). Current version: 6.0.5. Releases are irregular, and major versions introduce breaking changes.
pip install miceforest
Common errors
error ImportError: cannot import name 'ImputedDataSet' from 'miceforest'
cause Class renamed in v5.0.0.
fix Use 'ImputedData' instead: from miceforest import ImputedData
error TypeError: ImputationKernel.__init__() got an unexpected keyword argument 'mean_match_subset'
cause Parameter removed in v5.0.0, where 'data_subset' replaced it; it is not accepted in v6.x either.
fix Use the 'data_subset' parameter instead, and update any other arguments to the v6.x API.
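While migrating, the rename described in the fix above can be applied to an existing keyword dictionary before it reaches the constructor. The sketch below is plain Python and does not call miceforest; `RENAMED_KWARGS` and `migrate_kwargs` are hypothetical names, and the mapping lists only the rename named in this entry.

```python
# Hypothetical helper: rename stale keyword arguments before building a kernel.
# Only the rename named in this entry is listed; extend the mapping as needed.
RENAMED_KWARGS = {"mean_match_subset": "data_subset"}


def migrate_kwargs(kwargs):
    """Return a copy of kwargs with old parameter names replaced by new ones."""
    migrated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in migrated:
            # Both the old and the new spelling were supplied; refuse to guess.
            raise ValueError(f"both old and new spellings given for {new_name!r}")
        migrated[new_name] = value
    return migrated


old_style = {"mean_match_subset": 500, "random_state": 1}
new_style = migrate_kwargs(old_style)
print(new_style)  # {'data_subset': 500, 'random_state': 1}
```

The two parameters may not behave identically, so check the migrated values against the documentation of the installed version before relying on them.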
Warnings
breaking In v6.0.0, support for NumPy array input was removed; only pandas DataFrames are accepted.
fix Pass a pandas DataFrame instead of a numpy array.
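A minimal sketch of that conversion, assuming the array is two-dimensional and has no meaningful column names. `array_to_frame` and the `x0`, `x1`, ... naming scheme are hypothetical choices for illustration; only NumPy and pandas are used, and the resulting DataFrame is what would then be handed to the kernel.

```python
import numpy as np
import pandas as pd


def array_to_frame(arr, prefix="x"):
    """Wrap a 2-D NumPy array in a DataFrame with generated string column names."""
    arr = np.asarray(arr)
    if arr.ndim != 2:
        raise ValueError("expected a 2-D array of shape (rows, columns)")
    columns = [f"{prefix}{i}" for i in range(arr.shape[1])]
    return pd.DataFrame(arr, columns=columns)


# Small array with one missing value in each column
arr = np.array([[1.0, np.nan], [2.0, 3.0], [np.nan, 4.0]])
df = array_to_frame(arr)
print(list(df.columns))        # ['x0', 'x1']
print(int(df.isnull().sum().sum()))  # 2
```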
deprecated MeanMatchScheme classes from v5.x are replaced in v6.0.0 by the 'mean_match_strategy' (string) and 'mean_match_candidates' (integer) parameters.
fix For example: kernel = ImputationKernel(..., mean_match_strategy='shap', mean_match_candidates=10)
breaking In v5.0.0, class names were changed: 'ImputationKernel' is now the main class, and 'ImputedData' replaces 'ImputedDataSet'. Importing the old names raises ImportError.
fix Use 'ImputationKernel' and 'ImputedData' from miceforest.
gotcha Kernels can be saved and loaded with standard pickle or joblib, which relies on the class's '__getstate__' and '__setstate__' overrides. Saving was modernized in v6.0.0, so pickles created with older versions may fail to load.
fix If a kernel saved with v5.x fails to load, re-run the imputation on the original data instead of loading it.
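The fallback in the last fix can be sketched with the standard library alone. `load_or_rebuild` and its `rebuild` callback are hypothetical names: `rebuild` stands in for whatever code re-runs the imputation on the original data, and plain dictionaries stand in for kernels so the example runs without miceforest.

```python
import os
import pickle
import tempfile


def load_or_rebuild(path, rebuild):
    """Try to unpickle the object at `path`; call `rebuild()` if that fails.

    `rebuild` is a hypothetical zero-argument callable that re-runs the
    imputation on the original data and returns a fresh kernel.
    """
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    except Exception as exc:  # unpickling can fail in many different ways
        print(f"could not load {path}: {type(exc).__name__}; rebuilding")
        return rebuild()


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.pkl")
    bad = os.path.join(tmp, "bad.pkl")
    with open(good, "wb") as handle:
        pickle.dump({"kernel": "saved"}, handle)  # a readable pickle
    with open(bad, "wb") as handle:
        handle.write(b"not a pickle")  # stands in for an incompatible file
    loaded = load_or_rebuild(good, rebuild=lambda: {"kernel": "rebuilt"})
    rebuilt = load_or_rebuild(bad, rebuild=lambda: {"kernel": "rebuilt"})

print(loaded)   # {'kernel': 'saved'}
print(rebuilt)  # {'kernel': 'rebuilt'}
```

Catching every exception is a deliberate simplification here; only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.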
Imports
- ImputationKernel
  wrong: from miceforest import MultipleImputedKernel
  correct: from miceforest import ImputationKernel
- ImputedData
  wrong: from miceforest import ImputedDataSet
  correct: from miceforest import ImputedData
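Because the valid import names depend on the major version, it can help to check which version is installed before choosing an import. The sketch below uses only `importlib.metadata` from the standard library; `major_version` and `installed_major` are hypothetical helper names, and the parsing assumes a plain dotted version string such as '6.0.5'.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a dotted version string such as '6.0.5'."""
    return int(version_string.split(".")[0])


def installed_major(package="miceforest"):
    """Return the installed major version of `package`, or None if it is absent."""
    try:
        return major_version(version(package))
    except PackageNotFoundError:
        return None


print(major_version("6.0.5"))  # 6
print(installed_major())       # depends on the environment; None if not installed
```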
Quickstart
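import miceforest as mf
from sklearn.datasets import load_iris
import pandas as pd
import numpy as np

# Build a DataFrame and blank out the first 20 values of the first column
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df.iloc[:20, 0] = np.nan

# v6.x keyword names: data, num_datasets, save_all_iterations_data
kernel = mf.ImputationKernel(data=df, num_datasets=1, save_all_iterations_data=True, random_state=1)
kernel.mice(1)  # run one MICE iteration
completed = kernel.complete_data()  # DataFrame with the imputed values filled in
print(completed.isnull().sum())  # no missing values should remain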