DoWhy

0.14 · active · verified Thu Apr 16

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions, following a four-step framework: Model, Identify, Estimate, and Refute. It aims to bridge econometric and machine learning approaches to causality. The current version is 0.14, and it sees regular releases, typically every few months, incorporating new features, estimators, and compatibility updates.

Quickstart

This quickstart demonstrates the core DoWhy workflow: defining a causal graph, identifying the effect using an estimand, estimating the effect with a specified method (e.g., linear regression), and finally refuting the estimate to check its robustness. It uses synthetic data to illustrate a simple causal model.

import dowhy
from dowhy import CausalModel
import pandas as pd
import numpy as np

# 1. Generate some sample data
np.random.seed(1)
n_samples = 100
treatment = np.random.randint(0, 2, n_samples)
confounder = np.random.normal(0, 1, n_samples)
outcome = 2 * treatment + 3 * confounder + np.random.normal(0, 1, n_samples)
data = pd.DataFrame({'treatment': treatment, 'confounder': confounder, 'outcome': outcome})

# 2. Model the causal problem
# Using a simple string-based DOT representation of the graph
model = CausalModel(data=data,
                    graph="digraph { confounder -> treatment; confounder -> outcome; treatment -> outcome;}",
                    treatment=['treatment'],
                    outcome=['outcome'])

# 3. Identify a causal effect
identified_estimand = model.identify_effect(estimand_type="nonparametric-ate")

# 4. Estimate the causal effect using a statistical method
causal_estimate = model.estimate_effect(identified_estimand,
                                        method_name="backdoor.linear_regression",
                                        control_value=0,
                                        treatment_value=1)

print(f"Causal Estimate: {causal_estimate.value}")

# 5. Refute the obtained estimate
# Using a refutation method to check robustness
refutation = model.refute_estimate(identified_estimand, causal_estimate,
                                    method_name="random_common_cause")
print(f"Refutation (random common cause): {refutation.refutation_result}")
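
To build intuition for what a backdoor adjustment with linear regression does, the following is a minimal NumPy-only sketch. It is not DoWhy's internal implementation. It assumes a hypothetical data-generating process in which the confounder drives both treatment and outcome, with a true treatment effect of 2, and compares a naive difference in means against an ordinary least squares fit that includes the confounder as a covariate. The variable names (`naive_estimate`, `adjusted_estimate`) and the sample size are illustrative choices.

```python
import numpy as np

# Hypothetical data: the confounder influences both treatment and outcome.
rng = np.random.default_rng(1)
n_samples = 5000
confounder = rng.normal(0, 1, n_samples)
# Units with a higher confounder value are more likely to be treated.
treatment = (confounder + rng.normal(0, 1, n_samples) > 0).astype(float)
# The true causal effect of treatment on outcome is 2.
outcome = 2 * treatment + 3 * confounder + rng.normal(0, 1, n_samples)

# Naive estimate: difference in mean outcome between treated and untreated.
# It absorbs the confounder's influence and is biased upward here.
naive_estimate = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Adjusted estimate: regress outcome on an intercept, the treatment, and the
# confounder. The treatment coefficient is the backdoor-adjusted effect
# under the assumption that the linear model is correctly specified.
design = np.column_stack([np.ones(n_samples), treatment, confounder])
coefficients, *_ = np.linalg.lstsq(design, outcome, rcond=None)
adjusted_estimate = coefficients[1]

print(f"Naive difference in means: {naive_estimate:.2f}")
print(f"Regression-adjusted estimate: {adjusted_estimate:.2f}")
```

In this setup the adjusted estimate should land close to the true effect of 2, while the naive comparison overstates it because treated units also tend to have higher confounder values.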
