emcee
emcee is an MIT-licensed, pure-Python implementation of Goodman & Weare's affine-invariant Markov chain Monte Carlo (MCMC) ensemble sampler. It is widely used for Bayesian parameter estimation in the sciences, particularly astronomy, and is actively maintained, with releases that mostly deliver minor updates and bug fixes. [3, 6, 9]
Warnings
- breaking When upgrading from `emcee` v2.x to v3.x, several arguments to `EnsembleSampler` related to proposal control (`a`, `live_dangerously`) and parallelization (`threads`) were deprecated. These functionalities are now managed via the `moves` interface and the `pool` argument, respectively. [8]
- gotcha The `log_prob_fn` passed to `EnsembleSampler` must return the natural logarithm of the *posterior probability* (log-prior plus log-likelihood, up to an additive constant), not just the log-likelihood. It should return `-np.inf` when the parameters are unphysical or otherwise have zero probability. [2, 11]
- gotcha Poor initialization of walkers can lead to slow convergence, biased results, or errors (e.g., 'Too few points to create valid contours' or math warnings). Walkers should be initialized in a region of non-zero probability. [15, 17, 18]
- gotcha Specific versions of `emcee` have included compatibility fixes for `numpy` and `scipy`. For example, v3.1.6 fixed compatibility with older NumPy versions, and v3.1.4 addressed the updated `kstest` interface in SciPy 1.10. [12]
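The two gotchas above about `log_prob_fn` and walker initialization can be sketched as follows. This is a minimal illustration, not emcee's own code: the flat prior and its bounds (`LOWER`, `UPPER`) and the helper names `log_prior` and `log_likelihood` are assumptions made for the example, and the Gaussian likelihood mirrors the one used in the Quickstart.

```python
import numpy as np

# Hypothetical bounds for a flat (uniform) prior; chosen only for illustration.
LOWER = np.array([-5.0, -5.0])
UPPER = np.array([5.0, 5.0])


def log_prior(theta):
    """Flat prior: 0 inside the box, -inf outside (zero probability)."""
    if np.all(theta > LOWER) and np.all(theta < UPPER):
        return 0.0
    return -np.inf


def log_likelihood(theta, mu, cov):
    """Unnormalized Gaussian log-likelihood."""
    diff = theta - mu
    return -0.5 * np.dot(diff, np.linalg.solve(cov, diff))


def log_prob(theta, mu, cov):
    """Log-posterior = log-prior + log-likelihood, up to a constant."""
    lp = log_prior(theta)
    if not np.isfinite(lp):
        # Skip the likelihood entirely for unphysical parameters.
        return -np.inf
    return lp + log_likelihood(theta, mu, cov)


mu = np.array([0.5, -0.2])
cov = np.array([[1.0, 0.5], [0.5, 1.5]])

# Start the walkers in a small ball well inside the prior support,
# then confirm every walker has a finite log-probability before sampling.
rng = np.random.default_rng(0)
nwalkers, ndim = 32, 2
p0 = mu + 1e-3 * rng.standard_normal((nwalkers, ndim))
initial_log_probs = np.array([log_prob(p, mu, cov) for p in p0])
assert np.all(np.isfinite(initial_log_probs)), "some walkers start at zero probability"
```

Checking the initial log-probabilities up front is a cheap way to catch a bad starting region before it shows up later as slow convergence or plotting errors.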
Install
pip install emcee
Imports
- emcee
import emcee
- EnsembleSampler
from emcee import EnsembleSampler
Quickstart
import numpy as np
import emcee
# Define the logarithm of the posterior probability density function
def log_prob(x, mu, cov):
    diff = x - mu
    return -0.5 * np.dot(diff, np.linalg.solve(cov, diff))
# Set up the problem dimensions and parameters
ndim = 2 # Number of dimensions
nwalkers = 32 # Number of MCMC walkers
# True mean and covariance for the Gaussian
np.random.seed(42)
mu_true = np.array([0.5, -0.2])
cov_true = np.array([[1.0, 0.5], [0.5, 1.5]])
# Initialize walkers in a small ball around the true mean
p0 = mu_true + 1e-3 * np.random.randn(nwalkers, ndim)
# Instantiate the sampler
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob, args=(mu_true, cov_true))
# Run a short burn-in phase
state = sampler.run_mcmc(p0, 100)
# After burn-in, reset and run for more steps
sampler.reset()
state = sampler.run_mcmc(state, 1000)
# Get the chain of samples
samples = sampler.get_chain(flat=True)
print(f"Mean acceptance fraction: {np.mean(sampler.acceptance_fraction):.3f}")
print(f"First 5 samples:\n{samples[:5]}")