Causal-Learn (Causal Inference Library)
Causal-learn is a Python library for causal discovery, that is, identifying causal relationships from observational data. It provides constraint-based and score-based search algorithms, along with conditional independence tests and graph utilities. It is part of the PyWhy project. The current version is 0.1.4.5, with frequent patch releases for bug fixes and minor enhancements.
Common errors
-
ModuleNotFoundError: No module named 'causal_learn'
cause Incorrect package name used in the import statement. The PyPI package name `causal-learn` differs from the Python import name `causallearn`.
fix Change your import statements from `from causal_learn import ...` to `from causallearn import ...`.
-
graphviz.backend.ExecutableNotFound: failed to execute dot; make sure the Graphviz executables are on your systems' path
cause The Graphviz command-line tools are not installed or not accessible on your system's PATH. The Python `graphviz` package is only a wrapper; it requires the underlying Graphviz system tools.
fix Install Graphviz on your operating system (e.g., `sudo apt-get install graphviz` on Debian/Ubuntu, `brew install graphviz` on macOS, or download from graphviz.org for Windows) and ensure the `dot` executable is on your system's PATH.
-
ValueError: Input data must be a numpy array.
cause A causal-learn algorithm received input data that was not a `numpy.ndarray` or was not in the expected 2D `(n_samples, n_features)` format.
fix Convert your data to a `numpy.ndarray`. If starting from a pandas DataFrame `df`, use `data = df.to_numpy()` (or the older `df.values`) and check that `data.ndim == 2`, with samples in rows and variables in columns.
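The conversion described in the last fix can be sketched as a small guard that runs before any algorithm is called. The helper name `to_sample_matrix` is hypothetical and not part of causal-learn; it only assumes NumPy, and it accepts pandas objects through their `to_numpy()` method without importing pandas.

```python
import numpy as np

def to_sample_matrix(data):
    """Hypothetical helper (not a causal-learn API): coerce input to a 2D array."""
    # pandas DataFrames and Series expose to_numpy(); everything else goes through np.asarray
    arr = data.to_numpy() if hasattr(data, "to_numpy") else np.asarray(data)
    if arr.ndim == 1:
        # A single variable: make it one column of shape (n_samples, 1)
        arr = arr.reshape(-1, 1)
    if arr.ndim != 2:
        raise ValueError(f"expected 2D (n_samples, n_features) data, got ndim={arr.ndim}")
    return arr

# Four samples of three variables, given as a plain list of rows
rows = [[0.1, 1.2, 3.0], [0.4, 0.9, 2.5], [0.3, 1.1, 2.8], [0.2, 1.0, 2.9]]
data = to_sample_matrix(rows)
print(data.shape)  # (4, 3)
```

Running the check up front gives a clear error message at the call site instead of a failure deep inside a search routine.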
Warnings
- gotcha The PyPI package name `causal-learn` uses a hyphen, but the Python import name is `causallearn` (no hyphen). This often leads to `ModuleNotFoundError` if `from causal_learn import ...` is used.
- gotcha Visualizing causal graphs (e.g., with `GraphUtils.to_pydot` or the `draw_pydot_graph` method of the graph returned by `pc`) requires a system-wide installation of Graphviz in addition to the Python packages that wrap it. Without the `dot` executable on the PATH, visualization functions will fail with "executable not found" errors.
- gotcha Input data for causal discovery algorithms (e.g., `pc`, `ges`) must typically be a 2D `numpy.ndarray` with shape `(n_samples, n_features)`. Providing other formats like pandas DataFrames directly or incorrectly shaped arrays will lead to errors.
- gotcha Causal-learn includes a wide array of algorithms, each with specific assumptions and parameter requirements. Misconfiguring an algorithm or choosing a conditional independence test that does not match the data (e.g., `chisq` or `gsq`, which are designed for discrete data, applied to continuous data, or `fisherz`, which assumes linear-Gaussian relationships, applied to strongly non-linear data) can lead to incorrect results or runtime errors.
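As a rough illustration of the last warning, the sketch below picks a test name from the shape of the data. The function `suggest_ci_test` and its `max_levels` threshold are assumptions made for this example, not part of causal-learn; the returned strings `"fisherz"` and `"chisq"` are the test names causal-learn uses. The heuristic cannot detect non-linearity, so treat its answer as a starting point only.

```python
import numpy as np

def suggest_ci_test(data, max_levels=10):
    """Hypothetical heuristic: 'chisq' for few-level integer data, else 'fisherz'."""
    data = np.asarray(data)
    for column in data.T:
        levels = np.unique(column)
        integer_valued = np.all(np.mod(levels, 1) == 0)
        # Any column that looks continuous rules out the discrete tests
        if not integer_valued or len(levels) > max_levels:
            return "fisherz"
    return "chisq"

rng = np.random.default_rng(0)
continuous = rng.normal(size=(200, 3))        # real-valued columns
discrete = rng.integers(0, 3, size=(200, 3))  # three categories per column

print(suggest_ci_test(continuous))  # fisherz
print(suggest_ci_test(discrete))    # chisq
```

For continuous data with suspected non-linear dependencies, a kernel-based test such as `kci` is the usual alternative, at a much higher computational cost.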
Install
-
pip install causal-learn
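After installing, a quick environment check can catch the two most common setup problems listed above. This is a minimal sketch using only the standard library; `environment_report` is a hypothetical helper name, and the check only tests whether the module and the `dot` executable can be found, not whether they work.

```python
import importlib.util
import shutil

def environment_report():
    """Hypothetical helper: report whether the import name and Graphviz are discoverable."""
    return {
        # The correct import name has no hyphen and no underscore
        "causallearn importable": importlib.util.find_spec("causallearn") is not None,
        # Graphviz ships the `dot` command-line tool used for rendering
        "dot on PATH": shutil.which("dot") is not None,
    }

report = environment_report()
for check, ok in report.items():
    print(f"{check}: {'yes' if ok else 'no'}")
```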
Imports
- pc
from causal_learn.search.ConstraintBased.PC import pc  # wrong: raises ModuleNotFoundError
from causallearn.search.ConstraintBased.PC import pc  # correct
- ges
from causal_learn.search.ScoreBased.GES import ges  # wrong: raises ModuleNotFoundError
from causallearn.search.ScoreBased.GES import ges  # correct
- chisq
from causallearn.utils.cit import chisq
Quickstart
import numpy as np
from causallearn.search.ConstraintBased.PC import pc
from causallearn.utils.cit import fisherz
# 1. Generate synthetic data from the DAG X0 -> X2, X0 -> X1, X2 -> X1
# X0 influences X2, and both X0 and X2 influence X1
np.random.seed(42)
N = 1000
X0 = np.random.normal(0, 1, N)
X2 = X0 * 0.5 + np.random.normal(0, 1, N)
X1 = X0 * 0.3 + X2 * 0.7 + np.random.normal(0, 1, N)
data = np.array([X0, X1, X2]).T  # shape (N samples, 3 variables); columns are X0, X1, X2
# 2. Run PC algorithm for causal discovery
# data: input data matrix (N_samples, N_variables)
# alpha: significance level for the conditional independence tests (e.g., 0.05)
# indep_test: conditional independence test (fisherz suits linear-Gaussian continuous data; chisq and gsq are for discrete data)
# verbose: set to True for detailed progress output
# show_progress: set to False to hide the progress bar
causal_graph = pc(data, alpha=0.05, indep_test=fisherz, verbose=False, show_progress=False)
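# 3. Print the adjacency matrix of the discovered causal graph
# Encoding: G.graph[j, i] == 1 and G.graph[i, j] == -1 means i -> j;
# G.graph[i, j] == G.graph[j, i] == -1 means an undirected edge i - j;
# G.graph[i, j] == G.graph[j, i] == 1 means a bidirected edge i <-> j; 0 means no edge.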
print("\nDiscovered Causal Graph Adjacency Matrix:")
print(causal_graph.G.graph)
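The printed matrix is easier to read once it is decoded into an edge list. The sketch below assumes causal-learn's documented encoding for PC output (`graph[j, i] == 1` with `graph[i, j] == -1` means i -> j, both entries `-1` means an undirected edge, both `1` means a bidirected edge). The function `describe_edges` is a hypothetical helper, and the matrix is a hand-written illustration, not the output of the quickstart.

```python
import numpy as np

def describe_edges(adj, names=None):
    """Hypothetical helper: decode a causal-learn style adjacency matrix into strings."""
    adj = np.asarray(adj)
    n = adj.shape[0]
    names = names or [f"X{i}" for i in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            a, b = adj[i, j], adj[j, i]
            if a == -1 and b == 1:
                edges.append(f"{names[i]} --> {names[j]}")
            elif a == 1 and b == -1:
                edges.append(f"{names[j]} --> {names[i]}")
            elif a == -1 and b == -1:
                edges.append(f"{names[i]} --- {names[j]}")
            elif a == 1 and b == 1:
                edges.append(f"{names[i]} <-> {names[j]}")
    return edges

# Hand-written example encoding the collider X0 --> X1 <-- X2
example = np.array([
    [0, -1, 0],
    [1,  0, 1],
    [0, -1, 0],
])
edges = describe_edges(example)
print(edges)  # ['X0 --> X1', 'X2 --> X1']
```

With the quickstart's output, the call would be `describe_edges(causal_graph.G.graph)`. Because the quickstart's true graph connects all three variables, PC may return only undirected edges for it: the orientations are not identifiable from conditional independences alone.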