{"id":9571,"library":"causalmodels","title":"Causalmodels","description":"Causalmodels is a Python library for defining, analyzing, and inferring causal relationships from data, drawing inspiration from Judea Pearl's do-calculus. It provides tools for building Bayesian causal models, performing matching, and conducting regression-based causal inference. The current version is 0.4.0, with an irregular release cadence.","status":"active","version":"0.4.0","language":"en","source_language":"en","source_url":"https://github.com/roronya/causalmodels","tags":["causal inference","bayesian models","treatment effect","econometrics","statistics"],"install":[{"cmd":"pip install causalmodels","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Numerical operations","package":"numpy"},{"reason":"Scientific computing","package":"scipy"},{"reason":"Data manipulation","package":"pandas"},{"reason":"Statistical modeling and estimation","package":"statsmodels"},{"reason":"Machine learning utilities and models","package":"scikit-learn"}],"imports":[{"symbol":"BayesianModel","correct":"from causalmodels.bayesian_model import BayesianModel"},{"symbol":"Matching","correct":"from causalmodels.matching import Matching"},{"symbol":"Regression","correct":"from causalmodels.regression import Regression"},{"symbol":"CausalDAG","correct":"from causalmodels.causal_dag import CausalDAG"},{"symbol":"CausalInference","correct":"from causalmodels.inference import CausalInference"}],"quickstart":{"code":"import pandas as pd\nimport numpy as np\nfrom causalmodels.regression import Regression\n\n# Simulate some data with a known causal effect\nnp.random.seed(42)\nn_samples = 1000\n\n# Confounder Z affects both Treatment X and Outcome Y\nZ = np.random.normal(0, 1, n_samples)\n# Treatment X is affected by Z\nX = 0.5 * Z + np.random.normal(0, 1, n_samples)\n# Outcome Y is affected by X and Z\nY = 2.0 * X + 1.0 * Z + np.random.normal(0, 1, n_samples)\n\ndata = pd.DataFrame({'Z': Z, 
'X': X, 'Y': Y})\n\n# Initialize the Regression model\n# X: treatment variable, Y: outcome variable, control_variables: confounders\nmodel = Regression(data, treatment='X', outcome='Y', control_variables=['Z'])\n\n# Estimate the Average Treatment Effect (ATE)\nate_estimate = model.estimate_ate()\n\nprint(f\"Observed data with N={n_samples} samples.\")\nprint(f\"Estimated Average Treatment Effect (ATE) of X on Y, controlling for Z: {ate_estimate:.4f}\")","lang":"python","description":"This quickstart demonstrates how to use the `Regression` module to estimate the Average Treatment Effect (ATE) of a treatment variable 'X' on an outcome 'Y', while controlling for a confounder 'Z'. It simulates data in which the true effect of X on Y is 2.0 and Z confounds both variables, then applies the regression model so the estimate can be compared against that known value."},"warnings":[{"fix":"Carefully define your causal graph, identify all potential confounders, and consider domain knowledge before applying methods. Validate assumptions where possible through sensitivity analysis or alternative approaches.","message":"Causal inference methods in `causalmodels` (and generally) rely on strong assumptions (e.g., no unmeasured confounders, correct specification of the causal graph). Failing to meet these assumptions can lead to biased estimates.","severity":"gotcha","affected_versions":"All"},{"fix":"Clean the DataFrame before passing it in: handle missing values (e.g., by imputation or removal) and confirm that every column used in the causal model has the expected numeric type.","message":"Input data to `causalmodels` methods must be clean and appropriately preprocessed. 
Missing values, incorrect data types, or inconsistent column names can lead to errors or silently biased results during estimation.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Compare the names passed for `treatment`, `outcome`, and `control_variables` against `df.columns`; they must match the DataFrame columns exactly, including case.","cause":"The column name provided for `treatment`, `outcome`, or `control_variables` does not exist in the input DataFrame.","error":"KeyError: \"['NonExistentColumn']\" or similar error indicating a missing column."},{"fix":"Convert all relevant columns to appropriate numeric types (e.g., `float`, `int`) using `df['column'] = pd.to_numeric(df['column'], errors='coerce')` before passing the DataFrame to `causalmodels` methods. Note that `errors='coerce'` replaces unparseable values with `NaN`, so handle the resulting missing values (e.g., by imputation or removal) afterwards.","cause":"One or more columns intended for numerical calculation (treatment, outcome, control) contain non-numeric data (e.g., strings, objects) that cannot be directly processed by the underlying statistical models.","error":"TypeError: unsupported operand type(s) for +: 'str' and 'int' or similar numerical calculation error."}]}