powershap
0.1.0.1 · verified Fri May 01 · auth: no · python
Powerful feature selection using statistical significance of SHAP values. Current version: 0.1.0.1. Active development, major releases every few months.
pip install powershap
Common errors
error ModuleNotFoundError: No module named 'powershap'
cause The package is not installed, or it is installed in a different environment from the one running the code.
fix Run 'pip install powershap' in the correct Python environment.
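A quick way to confirm which interpreter is running, and whether it can see the package, is sketched below. It uses only the standard library; the helper name `is_importable` is illustrative and is not part of powershap.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None


# This is the interpreter that pip has to install into.
print("Interpreter:", sys.executable)
print("powershap importable:", is_importable("powershap"))
```

If the second line prints False, running `python -m pip install powershap` with that same interpreter installs the package into the environment that is actually in use.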
error ImportError: cannot import name 'PowerShap' from 'powershap'
cause The import path is wrong or the installed version is very old (pre-0.0.2).
fix Ensure you have version >=0.0.2 and use 'from powershap import PowerShap'.
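The installed version can be compared against the 0.0.2 minimum mentioned above with the standard library alone. This is a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helpers written for this example, and the parser only handles plain dotted numeric versions.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn '0.1.0.1' into (0, 1, 0, 1); non-numeric parts are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def meets_minimum(installed: str, minimum: str = "0.0.2") -> bool:
    """Compare two dotted versions component by component."""
    return parse_version(installed) >= parse_version(minimum)


try:
    installed = version("powershap")
    status = "OK" if meets_minimum(installed) else "too old, upgrade with pip"
    print(f"powershap {installed}: {status}")
except PackageNotFoundError:
    print("powershap is not installed in this environment")
```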
error ValueError: The model must be a fitted estimator or a classifier/regressor.
cause The object passed as 'model' is not a usable estimator instance.
fix Pass an estimator instance (e.g., RandomForestClassifier()) to PowerShap. It does not need to be fitted beforehand; PowerShap fits it internally.
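A common way to end up with a non-estimator is to pass the class itself instead of an instance (missing parentheses). The duck-typing check below illustrates the distinction. It is an assumption-laden sketch, not powershap's own validation: `looks_like_estimator` and `DummyModel` are hypothetical names used only here.

```python
import inspect


def looks_like_estimator(obj) -> bool:
    """Check for an instance (not a class) exposing fit and a predict method."""
    if inspect.isclass(obj):
        return False  # e.g. RandomForestClassifier without the parentheses
    has_fit = callable(getattr(obj, "fit", None))
    has_predict = callable(getattr(obj, "predict", None)) or callable(
        getattr(obj, "predict_proba", None)
    )
    return has_fit and has_predict


class DummyModel:
    """Stand-in for a scikit-learn style estimator, for illustration only."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0 for _ in X]


print("instance:", looks_like_estimator(DummyModel()))
print("class:   ", looks_like_estimator(DummyModel))
```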
Warnings
gotcha The 'model' parameter can be a scikit-learn estimator or a pipeline. When using a pipeline, ensure the final step is an estimator.
fix Use an estimator with .fit() and .predict() or .predict_proba() methods.
gotcha PowerShap can be memory-intensive for high-dimensional data because it computes SHAP values for all features.
fix Reduce the number of features or use a smaller sample size via the 'sample' parameter.
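To get a feel for how memory grows with dimensionality, the arithmetic below estimates the size of a single dense matrix holding one SHAP value per sample and feature. It is a rough sketch that assumes float64 storage (8 bytes per value). Actual usage depends on the explainer and on how many iterations are kept in memory, so treat the numbers as a lower bound. `shap_matrix_bytes` is a hypothetical helper, not a powershap function.

```python
def shap_matrix_bytes(n_samples: int, n_features: int, bytes_per_value: int = 8) -> int:
    """Size in bytes of one dense (n_samples x n_features) matrix."""
    return n_samples * n_features * bytes_per_value


# Same number of rows, increasing number of features.
for n_features in (20, 2_000, 20_000):
    size_mb = shap_matrix_bytes(100_000, n_features) / 1e6
    print(f"{n_features:>6} features: {size_mb:,.0f} MB per SHAP matrix")
```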
deprecated The 'power_analysis' parameter previously accepted 'power' or 'min_power'; 'auto' is now recommended because it estimates the sample size automatically.
fix Use power_analysis='auto' to avoid manual tuning.
Imports
- PowerShap
from powershap import PowerShap
Quickstart
from powershap import PowerShap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
import pandas as pd

X, y = make_classification(n_samples=100, n_features=20, random_state=42)
X = pd.DataFrame(X, columns=[f'feature_{i}' for i in range(X.shape[1])])

selector = PowerShap(
    model=RandomForestClassifier(),
    power_analysis='auto',  # automatic estimation
    cv=5,
    random_state=42
)
selector.fit(X, y)
print('Selected features:', selector.transform(X).columns.tolist())