CausalML
CausalML is a Python package for uplift modeling and causal inference with machine learning algorithms. It provides a variety of methods for causal inference in both experimental and observational settings, including meta-learners (S-Learner, T-Learner, X-Learner, R-Learner), tree-based methods (Causal Forest, Uplift Random Forest), and deep learning models. Current version: 0.16.0. Release cadence: irregular, with major updates approximately annually.
pip install causalml

Common errors
error ImportError: cannot import name 'SLearner' from 'causalml.meta' ↓
cause Wrong import path after module restructuring in v0.12.
fix
Use from causalml.inference.meta import SLearner

error AttributeError: module 'causalml' has no attribute 'dataset' ↓
cause CausalML's dataset module may not be installed if using a slim install (without optional dependencies).
fix
Install with all extras: pip install causalml[all], or ensure pandas and scikit-learn are present.

error ValueError: The 'treatment' column must be binary. Got 3 unique values. ↓
cause Treatment column passed to a meta-learner contains more than two distinct values (e.g., 0,1,2). Uplift models generally support binary treatment.
fix
Encode treatment as 0 (control) and 1 (treatment). Drop or merge other groups.
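As an illustration (plain pandas, no CausalML API assumed; column names are made up), a multi-valued treatment column can be reduced to the required 0/1 encoding either by dropping the extra arms or by merging them into one treated group:

```python
import pandas as pd

# Example data with three treatment arms: 0 = control, 1 and 2 = variants
df = pd.DataFrame({"treatment": [0, 1, 2, 1, 0, 2], "y": [0, 1, 1, 0, 0, 1]})

# Option A: keep only control (0) and one variant (1); the column is then binary
binary_df = df[df["treatment"].isin([0, 1])].copy()

# Option B: merge all non-control arms into a single treated group
merged_df = df.copy()
merged_df["treatment"] = (merged_df["treatment"] != 0).astype(int)

print(sorted(binary_df["treatment"].unique()))  # [0, 1]
print(sorted(merged_df["treatment"].unique()))  # [0, 1]
```

Option A preserves a clean two-arm comparison; Option B keeps all rows but blurs together arms that may have different effects, so prefer A when sample size allows.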
error TypeError: __init__() got an unexpected keyword argument 'learner' ↓
cause Using a string for the learner parameter, which was removed in v0.14.
fix
Pass a model instance, e.g. from sklearn.ensemble import RandomForestClassifier; SLearner(learner=RandomForestClassifier())

Warnings
breaking In v0.14, the causalml.inference.meta module was restructured. BaseLearner subclasses like SLearner, TLearner, XLearner no longer accept string params for model constructors; pass model objects directly. ↓
fix Use `from sklearn.linear_model import LogisticRegression; SLearner(learner=LogisticRegression())` instead of `SLearner(learner='lr')`
breaking The `causalml.inference.tree` module was removed in v0.15. Uplift tree models should now be imported from `causalml.inference.forest`. ↓
fix Use `from causalml.inference.forest import UpliftRandomForestClassifier` instead of any import from `causalml.inference.tree`
deprecated The function `causalml.dataset.synthetic_data` is deprecated in favor of `causalml.dataset.make_uplift_classification`. ↓
fix Replace `from causalml.dataset import synthetic_data` with `from causalml.dataset import make_uplift_classification`
gotcha CausalML's feature importance methods (e.g., `plot_importance()`) may produce different scales than sklearn's built-in importance. Interpretation is relative, not absolute. ↓
fix Use importance values for ranking features only, not for hypothesis testing.
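For instance, importance scores can be reduced to a rank order with plain numpy, treating the values themselves as opaque (feature names and scores below are hypothetical):

```python
import numpy as np

feature_names = ["age", "income", "tenure", "visits"]
# Hypothetical importance scores on an arbitrary, model-specific scale
importances = np.array([0.02, 0.55, 0.10, 0.33])

# Sort features from most to least important; only the ordering is meaningful
order = np.argsort(importances)[::-1]
ranked = [feature_names[i] for i in order]
print(ranked)  # ['income', 'visits', 'tenure', 'age']
```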
gotcha When using S-Learner or T-Learner with categorical features, ensure you one-hot encode them first. The underlying models may not handle cat codes correctly. ↓
fix Use `pandas.get_dummies()` or `sklearn.preprocessing.OneHotEncoder` before fitting.
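A minimal sketch of that preprocessing step with `pandas.get_dummies()` (column names here are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "channel": ["email", "push", "email", "sms"],  # categorical feature
    "spend": [10.0, 3.5, 7.2, 1.1],                # numeric feature
})

# One-hot encode the categorical column before fitting a meta-learner;
# the original "channel" column is replaced by one indicator column per level
X = pd.get_dummies(df, columns=["channel"])
print(list(X.columns))
```

`sklearn.preprocessing.OneHotEncoder` is the better choice inside a pipeline, since it can be fitted on training data and reused on new data with consistent columns.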
deprecated The `causalml.inference.meta.USM` (Uplift S-Model) class is deprecated as of v0.16. Use `SClassifier` or `SRegressor` instead. ↓
fix Replace `from causalml.inference.meta import USM` with `from causalml.inference.meta import SClassifier`
Install
pip install causalml[all]

Imports
- SLearner
  wrong: from causalml.meta import SLearner
  correct: from causalml.inference.meta import SLearner
- CausalForest
  wrong: from causalml.forest import CausalForest
  correct: from causalml.inference.forest import CausalForest
- set_rfub
  wrong: from causalml.ufb import set_rfub
  correct: from causalml.propensity import set_rfub
Quickstart
import numpy as np
from causalml.dataset import make_uplift_classification
from causalml.inference.meta import LRSRegressor

# Generate synthetic uplift data: one control group and one treatment arm
df, x_names = make_uplift_classification(
    n_samples=1000,
    treatment_name=['control', 'treatment1'],
    y_name='y',
    random_seed=42,
)

# Meta-learners expect a binary treatment indicator: 1 = treated, 0 = control
treatment = (df['treatment_group_key'] == 'treatment1').astype(int).values

# Train an S-learner with a linear-regression base model (LRSRegressor)
learner = LRSRegressor()
learner.fit(df[x_names].values, treatment, df['y'].values)

# Predict the uplift (estimated treatment effect) for each sample
uplift = learner.predict(df[x_names].values)
uplift[:5]