Pingouin

0.6.1 · active · verified Tue Apr 14

Pingouin is an open-source statistical package written in Python 3 and based mostly on pandas and NumPy. It provides a comprehensive yet user-friendly set of functions for statistical tests, including ANOVAs, correlations, regressions, Bayes factors, effect sizes, and reliability analysis. The current stable version is 0.6.1, and the library is under active development with frequent releases.

Warnings

Install

Imports

Quickstart

This quickstart runs an independent-samples t-test and a one-way ANOVA with Pingouin. pg.ttest takes raw numerical arrays and pg.anova takes a pandas DataFrame in long format; each returns its results as a DataFrame. For the t-test, the output includes the T-value, p-value, degrees of freedom, an effect size (Cohen's d), and power.

import pingouin as pg
import numpy as np
import pandas as pd

# Simulate two independent groups of data
np.random.seed(123)
data_group1 = np.random.normal(loc=10, scale=2, size=30)
data_group2 = np.random.normal(loc=12, scale=2.5, size=30)

# Perform an independent samples t-test
# correction='auto' lets Pingouin decide whether to apply Welch's correction
result = pg.ttest(data_group1, data_group2, correction='auto')

print(result)

# Example with a DataFrame for ANOVA
df_anova = pd.DataFrame({
    'dv': [10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 20, 22],
    'group': ['A']*4 + ['B']*4 + ['C']*4
})
aov_result = pg.anova(data=df_anova, dv='dv', between='group')
print("\nANOVA Result:")
print(aov_result)
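
To make the quantities in the output tables more concrete, the following is a minimal NumPy-only sketch of how a Welch t-statistic, Cohen's d, and a one-way ANOVA F-value can be computed by hand. The helper names `welch_t_and_cohen_d` and `one_way_anova_f` are hypothetical and are not part of Pingouin. The formulas are the standard textbook ones; Pingouin's own conventions (for example, when `correction='auto'` actually applies the Welch correction, or how the sign of the effect size is reported) may differ, so treat this as an illustration rather than a reimplementation.

```python
import numpy as np


def welch_t_and_cohen_d(x, y):
    """Return (t, dof, d) for two independent samples (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = x.size, y.size
    vx, vy = x.var(ddof=1), y.var(ddof=1)  # unbiased sample variances

    # Welch's t: difference in means over the unpooled standard error
    se2 = vx / nx + vy / ny
    t = (x.mean() - y.mean()) / np.sqrt(se2)

    # Welch-Satterthwaite approximation of the degrees of freedom
    dof = se2**2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))

    # Cohen's d: difference in means over the pooled standard deviation
    pooled_sd = np.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    d = (x.mean() - y.mean()) / pooled_sd
    return t, dof, d


def one_way_anova_f(groups):
    """Return (F, df_between, df_within, eta_squared) (hypothetical helper)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()

    # Partition the total variability into between-group and within-group parts
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_between = len(groups) - 1
    df_within = all_values.size - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_sq


# Small illustrative samples (not the simulated data from the quickstart)
t, dof, d = welch_t_and_cohen_d([1, 2, 3, 4], [2, 3, 4, 5])
print(f"t = {t:.3f}, dof = {dof:.2f}, Cohen's d = {d:.3f}")

# Same three groups as df_anova in the quickstart
f_stat, df_b, df_w, eta_sq = one_way_anova_f(
    [[10, 12, 11, 13], [15, 14, 16, 18], [17, 19, 20, 22]]
)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}, eta-squared = {eta_sq:.3f}")
```

Comparing these hand-computed values with the DataFrames printed by the quickstart is a quick way to check which column corresponds to which statistic in the installed Pingouin version.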
