Statistical computation and diagnostics for ArviZ
arviz-stats is a Python package that provides statistical functions and diagnostics for the exploratory analysis of Bayesian models. It is a subpackage of the broader ArviZ library (along with arviz-base and arviz-plots) and focuses specifically on computational and numerical features like statistical summaries, diagnostics, and model comparison. The current version is 1.0.0, released on March 2, 2026. The ArviZ ecosystem, including arviz-stats, has a regular release cadence with several releases in the past year.
Common errors
- AttributeError: 'DataTree' object has no attribute 'to_dataset' (or similar error when treating DataTree as Dataset)
  cause: In ArviZ 1.0+, the `InferenceData` container is replaced by `xarray.DataTree`, so indexing a group returns a `DataTree` node, not an `xarray.Dataset`.
  fix: If you need `Dataset`-specific functionality, explicitly convert the node or access its dataset view: `my_datatree_group.to_dataset()` or `my_datatree_group.dataset`.
- Found several log likelihood arrays var_name cannot be None
  cause: The `log_likelihood` group contains more than one variable, and the function (typically model comparison or cross-validation) was called without `var_name`, so it cannot tell which pointwise log likelihood to use.
  fix: Pass `var_name` explicitly, for example `azs.loo(data, var_name="y")`, where `"y"` stands for the name of the observed variable in your model. If the error persists, check that each `log_likelihood` variable has well-defined dimension names (e.g., `chain`, `draw`, `obs_id`) and review how the group is created during inference.
- RuntimeWarning: invalid value encountered in [...] (e.g., in diagnostics.py or related statistical computations)
  cause: Numerical problems in the sampled data, such as `NaN` values, extreme magnitudes, or poorly converging chains, make a statistical calculation undefined.
  fix: Inspect your model's sampling diagnostics (`rhat`, `ess`). Consider re-parameterizing the model, increasing `target_accept` in your PPL's sampler, or filtering out divergent transitions. Check that the input to `arviz-stats` functions does not contain unexpected `NaN`s.
- ModuleNotFoundError: No module named 'xarray' (or similar error when using InferenceData features after minimal install)
  cause: `arviz-stats` was installed with the minimal dependency set, which does not include `xarray`, but you are calling functionality that relies on `xarray` objects.
  fix: Uninstall the minimal `arviz-stats` and reinstall it with the recommended `xarray` optional dependency: `pip uninstall arviz-stats && pip install "arviz-stats[xarray]"`.
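Two of the problems above can be caught before calling any `arviz-stats` function. The following is a minimal sketch, not part of the library: `has_module` and `count_nonfinite` are hypothetical helper names, and the draws are simulated with two values corrupted on purpose.

```python
import importlib.util

import numpy as np


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


def count_nonfinite(draws: np.ndarray) -> int:
    """Count NaN and +/-inf entries in an array of draws."""
    return int(np.size(draws) - np.count_nonzero(np.isfinite(draws)))


# Simulated draws with shape (chain, draw); two entries are corrupted on purpose.
rng = np.random.default_rng(0)
draws = rng.normal(size=(4, 500))
draws[0, 10] = np.nan
draws[2, 42] = np.inf

# False here suggests a minimal install without the xarray extra.
print("xarray importable:", has_module("xarray"))

# A non-zero count is a likely source of "invalid value encountered" warnings.
print("non-finite draws:", count_nonfinite(draws))
```

A non-zero count does not say why the sampler produced those values; it only indicates that the diagnostics are being fed input they cannot handle.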
Warnings
- breaking With ArviZ 1.0 (and thus arviz-stats 1.0), the `arviz.InferenceData` object has been replaced by `xarray.DataTree` in `arviz-base`. Direct access like `dt["group"]` will now return a `DataTree` instead of an `xarray.Dataset`.
- breaking The default credible interval probability (`ci_prob`) for functions like `summary` and `hdi` has changed from 0.94 to 0.89. Additionally, a new `ci_kind` setting (defaulting to "eti" for equal-tailed interval) has been introduced.
- gotcha Installing `arviz-stats` without the `[xarray]` extra (`pip install arviz-stats`) limits it to a low-level array-only interface, primarily for developers. Many features, especially those that process `InferenceData` objects, will be unavailable and raise errors.
- gotcha Using `rounding="auto"` (the default) in functions like `azs.summary()` is intended for display purposes. The output values are converted to strings, which can lead to issues if you intend to perform further numerical computations on the results.
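To make the change in the default interval concrete, here is a small NumPy sketch of an equal-tailed interval at the new and old probabilities. The helper `eti` is hypothetical and is not the `arviz-stats` implementation; it only illustrates the definition (cut the same probability mass from each tail) on simulated draws.

```python
import numpy as np


def eti(draws, ci_prob):
    """Equal-tailed interval: remove (1 - ci_prob) / 2 of the mass from each tail."""
    tail = (1.0 - ci_prob) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(lower), float(upper)


# Simulated draws from a standard normal, standing in for a posterior sample.
rng = np.random.default_rng(42)
draws = rng.normal(loc=0.0, scale=1.0, size=10_000)

new_default = eti(draws, 0.89)  # probability used by the new default
old_default = eti(draws, 0.94)  # probability used by the old default

print(f"ci_prob=0.89: ({new_default[0]:.3f}, {new_default[1]:.3f})")
print(f"ci_prob=0.94: ({old_default[0]:.3f}, {old_default[1]:.3f})")
```

The 0.89 interval is always nested inside the 0.94 one, so results reported with the new defaults will look narrower than those from older code unless `ci_prob` is set explicitly.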
Install
- pip install "arviz-stats[xarray]"
- pip install arviz-stats
Imports
- azs
import arviz_stats as azs
- azb
import arviz_base as azb
- az
import arviz as az
Quickstart
import arviz_stats as azs
import arviz_base as azb
import numpy as np
# Load example data (a DataTree in ArviZ 1.0+)
data = azb.load_arviz_data("centered_eight")

# Compute summary statistics for selected variables
print("\n--- Summary Statistics ---")
summary_df = azs.summary(data, var_names=["mu", "tau"])
print(summary_df)

# Predictive metrics such as RMSE need `posterior_predictive` and
# `observed_data` groups in `data`, for example:
# azs.metrics(data, kind="rmse")

# Functions can also take plain NumPy arrays: estimate the mode of simulated draws
print("\n--- Mode ---")
rand_data = np.random.normal(loc=5, scale=2, size=1000)
mode_val = azs.mode(rand_data)
print(f"Mode of simulated data: {mode_val}")
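The quickstart mentions RMSE without computing it. As a rough illustration of what such a metric measures, the sketch below computes a plain RMSE with NumPy between observed values and the mean of simulated posterior predictive draws. The helper `rmse` and all data are made up for the example, and `azs.metrics` may aggregate over draws differently.

```python
import numpy as np


def rmse(predicted, observed):
    """Root mean square error between point predictions and observations."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


# Made-up observations and simulated predictive draws, shape (n_draws, n_obs).
rng = np.random.default_rng(1)
observed = np.array([1.0, 2.0, 3.0, 4.0])
pp_draws = observed + rng.normal(scale=0.5, size=(2000, observed.size))

# Use the per-observation mean over draws as the point prediction.
point_pred = pp_draws.mean(axis=0)
print(f"RMSE of predictive mean: {rmse(point_pred, observed):.4f}")
```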