Empirical Calibration

0.12 · active · verified Thu Apr 16

Empirical Calibration (EC) is a Python library (version 0.12) for correcting bias in data samples through generic weighting methods. It formulates calibration as a convex optimization problem, solves it efficiently in dual form, and targets bias reduction in fields such as survey sampling and causal studies with observational data. The library is actively maintained, with the latest release in May 2024 and ongoing development on GitHub.
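To make the "convex optimization solved in dual form" idea concrete, here is a minimal NumPy-only sketch of entropy balancing. It is not the library's implementation: the function name `entropy_balance`, the damped Newton solver, and the synthetic data are all assumptions made for illustration. The weights take the form w_i ∝ exp(λ·(x_i − m)), and the dual variables λ are found by minimizing a log-sum-exp objective whose gradient is the remaining imbalance in the weighted means.

```python
import numpy as np


def _log_sum_exp(v):
    """Numerically stable log(sum(exp(v)))."""
    m = v.max()
    return m + np.log(np.exp(v - m).sum())


def entropy_balance(x, target_mean, max_iter=50, tol=1e-8):
    """Illustrative entropy-balancing weights (hypothetical helper, not EC's API).

    Minimizes the dual objective log(sum_i exp(lam . z_i)), z_i = x_i - target_mean,
    using Newton steps with simple step halving.
    """
    z = np.asarray(x, dtype=float) - np.asarray(target_mean, dtype=float)
    lam = np.zeros(z.shape[1])  # one dual variable per covariate
    for _ in range(max_iter):
        logits = z @ lam
        w = np.exp(logits - logits.max())
        w /= w.sum()
        grad = z.T @ w  # weighted mean imbalance
        if np.linalg.norm(grad) < tol:
            break
        centered = z - grad
        hess = (centered * w[:, None]).T @ centered  # weighted covariance
        step = np.linalg.solve(hess, grad)
        t = 1.0
        current = _log_sum_exp(logits)
        # Halve the step until the dual objective no longer increases.
        while _log_sum_exp(z @ (lam - t * step)) > current + 1e-12 and t > 1e-8:
            t *= 0.5
        lam = lam - t * step
    logits = z @ lam
    w = np.exp(logits - logits.max())
    return w / w.sum()


# Synthetic example: reweight 200 draws so their means hit a chosen target.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
target_mean = np.array([0.3, -0.2])
w = entropy_balance(x, target_mean)
print("weighted means:", w @ x)
```

The dual has only as many unknowns as there are covariates, rather than one per sample row, which is why solving it is cheap even for large samples.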

Quickstart

This quickstart shows how to use `empirical_calibration` to compute sample weights. It simulates two sets of covariates: `covariates_sample`, representing your biased data, and `target_covariates`, representing the desired distribution (e.g., from a population). The `maybe_exact_calibrate` function then computes weights for the sample so that its weighted covariate means match those of the target as closely as possible, under the specified optimization objective (here, `ENTROPY`).

import numpy as np
import pandas as pd
import empirical_calibration as ec

# Create dummy covariate dataframes for demonstration
# In a real scenario, these would come from your biased sample and target population
covariates_sample = pd.DataFrame({
    'sex': np.random.choice([0, 1], size=100),
    'age': np.random.randint(18, 65, size=100)
})
target_covariates = pd.DataFrame({
    'sex': np.random.choice([0, 1], size=1000),
    'age': np.random.randint(18, 65, size=1000)
})

# Apply empirical calibration to compute weights
# Using ENTROPY objective as a common choice
try:
    weights, _ = ec.maybe_exact_calibrate(
        covariates=covariates_sample,
        target_covariates=target_covariates,
        objective=ec.Objective.ENTROPY
    )
    print(f"Successfully computed weights. First 5 weights: {weights[:5]}")
    print(f"Sum of weights: {np.sum(weights):.2f}")
except Exception as e:
    # Broad catch on purpose: the library is not known to export a dedicated
    # convergence exception, so no library-specific exception class is referenced here.
    print(f"Calibration failed: {e}")
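After computing weights, it is worth checking how much balance actually improved and how much the weights reduced the effective sample size. The sketch below is a NumPy-only illustration; `balance_report` is a hypothetical helper, not part of `empirical_calibration`, and the uniform weights are a placeholder for the weights returned by the quickstart.

```python
import numpy as np


def balance_report(covariates, target_covariates, weights):
    """Hypothetical helper: compare sample and target covariate means.

    Accepts arrays or DataFrames with matching numeric columns.
    """
    x = np.asarray(covariates, dtype=float)
    t = np.asarray(target_covariates, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    target_mean = t.mean(axis=0)
    return {
        "raw_gap": x.mean(axis=0) - target_mean,  # imbalance before weighting
        "weighted_gap": w @ x - target_mean,      # imbalance after weighting
        "effective_sample_size": 1.0 / np.sum(w ** 2),  # Kish's formula
    }


# Placeholder data and uniform weights, standing in for the quickstart output.
rng = np.random.default_rng(1)
sample = rng.normal(size=(100, 2))
target = rng.normal(loc=0.2, size=(1000, 2))
uniform_weights = np.full(100, 1 / 100)

report = balance_report(sample, target, uniform_weights)
print(report)
```

With uniform weights the weighted gap equals the raw gap and the effective sample size equals the number of rows. Calibrated weights should shrink the gap toward zero, at the cost of a smaller effective sample size.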
