CausalML

Version 0.16.0 · verified Fri May 01

CausalML is a Python package for uplift modeling and causal inference with machine learning algorithms. It provides a variety of methods for causal inference in both experimental and observational settings, including meta-learners (S-Learner, T-Learner, X-Learner, R-Learner), tree-based methods (Causal Forest, Uplift Random Forest), and deep learning models. Current version: 0.16.0. Release cadence: irregular, with major updates approximately annually.
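The meta-learner idea can be illustrated without CausalML itself: a T-Learner fits separate outcome models on the treated and control groups and takes the difference of their predictions as the uplift (CATE) estimate. A minimal NumPy sketch, using per-stratum outcome means as a stand-in base model (all names here are illustrative, not CausalML's API):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.integers(0, 2, n)   # one binary feature
t = rng.integers(0, 2, n)   # random (experimental) treatment assignment

# True outcome probability: baseline depends on x; treatment adds +0.2 only when x == 1
p = 0.3 + 0.1 * x + 0.2 * t * x
y = rng.binomial(1, p)

# "Base model": per-stratum outcome mean, fitted separately per arm (T-Learner)
def fit_mean_model(xs, ys):
    return {v: ys[xs == v].mean() for v in (0, 1)}

m_treat = fit_mean_model(x[t == 1], y[t == 1])
m_ctrl = fit_mean_model(x[t == 0], y[t == 0])

# Estimated uplift per feature value: difference of the two models' predictions
uplift = {v: m_treat[v] - m_ctrl[v] for v in (0, 1)}
uplift[1]  # close to 0.2, the true effect for x == 1
```

CausalML's meta-learners follow the same pattern but accept any scikit-learn-style estimator as the base model.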

pip install causalml
error ImportError: cannot import name 'SLearner' from 'causalml.meta'
cause Wrong import path after module restructuring in v0.12.
fix
Use `from causalml.inference.meta import SLearner`.
error AttributeError: module 'causalml' has no attribute 'dataset'
cause CausalML's dataset module may not be installed if using a slim install (without optional dependencies).
fix
Install with all extras: `pip install causalml[all]`, or ensure pandas and scikit-learn are present.
error ValueError: The 'treatment' column must be binary. Got 3 unique values.
cause Treatment column passed to a meta-learner contains more than two distinct values (e.g., 0,1,2). Uplift models generally support binary treatment.
fix
Encode treatment as 0 (control) and 1 (treatment). Drop or merge other groups.
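For example, collapsing a multi-valued assignment column into the binary encoding the meta-learners expect (the column and group names here are hypothetical):

```python
import pandas as pd

# Hypothetical data with three assignment groups
df = pd.DataFrame({"group": ["control", "treatment_a", "treatment_b", "control"]})

# Option 1: drop extra groups, keeping a clean control-vs-one-treatment comparison
binary_df = df[df["group"].isin(["control", "treatment_a"])].copy()
binary_df["treatment"] = (binary_df["group"] == "treatment_a").astype(int)

# Option 2: merge all non-control groups into a single treated arm
df["treatment"] = (df["group"] != "control").astype(int)
```

Option 1 preserves a well-defined treatment effect for one intervention; option 2 estimates an average effect across pooled interventions, which is only meaningful if the treatments are comparable.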
error TypeError: __init__() got an unexpected keyword argument 'learner'
cause Passing a string for the `learner` parameter; string shortcuts were removed in v0.14, and the parameter now requires a model instance.
fix
Pass a model instance, e.g. `from sklearn.ensemble import RandomForestClassifier; SLearner(learner=RandomForestClassifier())`.
breaking In v0.14, the causalml.inference.meta module was restructured. BaseLearner subclasses like SLearner, TLearner, XLearner no longer accept string params for model constructors; pass model objects directly.
fix Use `from sklearn.linear_model import LogisticRegression; SLearner(learner=LogisticRegression())` instead of `SLearner(learner='lr')`
breaking The `causalml.inference.tree` module was removed in v0.15. Uplift tree models now live in `causalml.inference.forest`.
fix Use `from causalml.inference.forest import UpliftRandomForestClassifier` instead of any import from `causalml.inference.tree`
deprecated The function `causalml.dataset.synthetic_data` is deprecated in favor of `causalml.dataset.make_uplift_classification`.
fix Replace `from causalml.dataset import synthetic_data` with `from causalml.dataset import make_uplift_classification`
gotcha CausalML's feature-importance methods (e.g., `plot_importance()`) may report values on a different scale than sklearn's built-in importances. Interpret them relatively, not absolutely.
fix Use importance values for ranking features only, not for hypothesis testing.
gotcha When using S-Learner or T-Learner with categorical features, one-hot encode them first. The underlying models may treat raw integer category codes as ordinal.
fix Use `pandas.get_dummies()` or `sklearn.preprocessing.OneHotEncoder` before fitting.
deprecated The `causalml.inference.meta.USM` (Uplift S-Model) class is deprecated as of v0.16. Use `SClassifier` or `SRegressor` instead.
fix Replace `from causalml.inference.meta import USM` with `from causalml.inference.meta import SClassifier`
pip install causalml[all]

Generate synthetic data and train a simple meta-learner to estimate uplift.

from sklearn.linear_model import LogisticRegression
from causalml.dataset import make_uplift_classification
from causalml.inference.meta import SLearner

# Generate synthetic uplift data with one control and one treatment group
df, x_names = make_uplift_classification(
    n_samples=1000,
    treatment_name=['control', 'treatment'],
    y_name='y',
    random_seed=42,
)

# Encode treatment as binary: 0 = control, 1 = treatment
treatment = (df['treatment_group_key'] != 'control').astype(int).values

# Train an S-learner with a logistic-regression base model
learner = SLearner(learner=LogisticRegression())
learner.fit(df[x_names].values, treatment, df['y'].values)

# Predict uplift (CATE) for each sample
uplift = learner.predict(df[x_names].values)
uplift[:5]