PyMC
PyMC (formerly PyMC3) is a Python package for Bayesian statistical modeling and probabilistic machine learning. It focuses on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms, offering flexibility and extensibility for a wide range of problems. It is currently at version 5.28.4 and maintains a frequent release cadence of minor updates and bug fixes.
Warnings
- breaking PyMC v5.26.0 dropped support for Python 3.10 and NumPy versions older than 2.0. Ensure your Python environment is 3.11 or newer and NumPy is 2.0 or newer.
- breaking PyMC was renamed from `PyMC3` to `PyMC` with the release of version 4.0. All `import pymc3 as pm` statements must be updated to `import pymc as pm`.
- breaking With PyMC v5.27.0, the `VarName` class was replaced by Python's built-in `str`. Code relying on `VarName` objects for variable introspection or manipulation may need adjustment.
- breaking From PyMC v5.26.0, `Model.compile_logp` now expects *all* model variables as input, not just a subset of the logp terms. This changes the API for functions that infer graph inputs.
- gotcha PyMC relies on PyTensor's ability to compile C implementations for performance. If `g++` (or an equivalent C++ compiler) is not detected in your environment, PyTensor will fall back to slower Python implementations, severely degrading performance.
- gotcha Since PyMC3 version 3.9 (and all PyMC v4+), `pm.sample()` defaults to returning an `arviz.InferenceData` object instead of a `MultiTrace` object. Ensure your analysis code is adapted for `InferenceData`.
Install
-
pip install pymc
-
conda install -c conda-forge pymc
Imports
- pm
import pymc as pm
- Model
from pymc import Model
- Normal
from pymc import Normal
Quickstart
import numpy as np
import pymc as pm
import arviz as az
# 1. Simulate some data
np.random.seed(42)
size = 100
alpha_true, beta_true = 1, [1, 2.5]
sigma_true = 1
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2
Y = alpha_true + beta_true[0] * X1 + beta_true[1] * X2 + np.random.normal(size=size) * sigma_true
# 2. Define the PyMC model
with pm.Model() as linear_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10, shape=2)
    sigma = pm.HalfNormal('sigma', sigma=1)
    # Expected value of outcome
    mu = alpha + beta[0] * X1 + beta[1] * X2
    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sigma=sigma, observed=Y)
    # 3. Perform MCMC sampling
    idata = pm.sample(draws=1000, tune=1000, cores=2, random_seed=42)
# 4. Analyze results (e.g., print summary)
print(az.summary(idata, var_names=['alpha', 'beta', 'sigma']))
# To visualize, you would typically use:
# import matplotlib.pyplot as plt
# az.plot_trace(idata)
# plt.show()