{"id":7149,"library":"decaf-synthetic-data","title":"DECAF Synthetic Data","description":"DECAF (DEbiasing CAusal Fairness) is a Python library providing tools for generating synthetic data and debiasing causal effects. It implements methods to create synthetic datasets that capture complex causal relationships while mitigating various forms of bias, enabling researchers and practitioners to evaluate and develop fair causal inference models. Currently at version 0.1.7, the library is under active development with a focus on research-driven advancements.","status":"active","version":"0.1.7","language":"en","source_language":"en","source_url":"https://github.com/trentkyono/DECAF","tags":["synthetic data","causal inference","fairness","debiasing","machine learning","research"],"install":[{"cmd":"pip install decaf-synthetic-data","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"symbol":"DECAF","correct":"from decaf import DECAF"},{"symbol":"SyntheticData","correct":"from decaf.synthetic_data import SyntheticData"}],"quickstart":{"code":"import numpy as np\nfrom decaf import DECAF\nfrom decaf.synthetic_data import SyntheticData\n\n# 1. Generate initial synthetic data with a known structure\nn = 1000  # Number of samples\np = 10    # Number of features\nseed = 42\nsd = SyntheticData(n=n, p=p, seed=seed)\ndata = sd.generate_data() # Returns a dictionary with 'x', 'a', 'y'\n\nX_orig = data['x'] # Features\nA_orig = data['a'] # Treatment\nY_orig = data['y'] # Outcome\n\nprint(f\"Original X shape: {X_orig.shape}, A shape: {A_orig.shape}, Y shape: {Y_orig.shape}\")\n\n# 2. Initialize and train the DECAF model\n# (using a small number of epochs for quick demonstration)\nmodel = DECAF(X_orig, A_orig, Y_orig, epochs=10, verbose=False, seed=seed)\nmodel.train()\n\n# 3. 
Generate new synthetic data using the trained DECAF model\nn_synthetic = 500\nsynthetic_X, synthetic_A = model.generate_synthetic_data(n_samples=n_synthetic)\n\nprint(f\"Synthetic X shape: {synthetic_X.shape}, Synthetic A shape: {synthetic_A.shape}\")\n# Further steps would involve evaluating fairness or causal effects on this synthetic data","lang":"python","description":"This quickstart uses `SyntheticData` to generate a base dataset, initializes and trains a `DECAF` model on it, and then generates new synthetic samples from the trained model. This workflow is typical for evaluating debiasing strategies."},"warnings":[{"fix":"Check the latest GitHub README and any release notes before upgrading, and adapt your code as needed.","message":"The library is in early development (version 0.1.x), so minor or patch releases may change the API, introduce breaking changes, or add features.","severity":"breaking","affected_versions":"<1.0.0"},{"fix":"Ensure `X`, `A`, and `Y` are NumPy arrays with appropriate dimensions (e.g., `X` as 2D, `A` and `Y` as 1D or as 2D with one column). Review the quickstart for expected input formats.","message":"The `DECAF` model expects NumPy arrays for features (`X`), treatment (`A`), and outcome (`Y`). Mismatched shapes or types can cause errors during model initialization or training.","severity":"gotcha","affected_versions":"All"},{"fix":"Start with smaller datasets and fewer epochs to test your setup. Monitor resource usage (CPU/GPU, RAM) and consider tuning hyperparameters or using more powerful hardware for production-scale tasks.","message":"Training `DECAF` models, especially on larger datasets or with many epochs, can be computationally intensive and require significant memory. 
Default parameters may not suit every environment.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the correct package: `pip install decaf-synthetic-data`.","cause":"The package is published on PyPI as `decaf-synthetic-data`, but it is imported as `decaf`.","error":"ModuleNotFoundError: No module named 'decaf'"},{"fix":"Verify that your input arrays have compatible dimensions. For example, `X` should typically be `(n_samples, n_features)`, while `A` and `Y` could be `(n_samples,)` or `(n_samples, 1)`.","cause":"The input arrays (`X`, `A`, `Y`) passed to the `DECAF` model have incompatible shapes, often due to incorrect reshaping or concatenation.","error":"ValueError: operands could not be broadcast together with shapes (X,) (Y,)"},{"fix":"Check the official documentation or the `decaf/__init__.py` source code of your installed version to confirm which methods it exposes. The quickstart assumes that synthetic data is generated from a trained model with `model.generate_synthetic_data()`; if that exact call raises this error, the installed version does not provide the method under that name.","cause":"The method being called does not exist on the `DECAF` instance, possibly due to a typo, a version mismatch, or a misunderstanding of the API.","error":"AttributeError: 'DECAF' object has no attribute 'generate_synthetic_data'"}]}