Phi_K correlation analyzer library

0.12.5 · active · verified Sun Apr 12

Phi_K is a practical correlation coefficient that works consistently between categorical, ordinal, and interval variables. It is based on refinements to Pearson's hypothesis test of independence of two variables, captures non-linear dependencies, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. The current version, 0.12.5, was released in July 2025. The library is updated every few months to a year, with releases that add support for new Python versions and fix bugs.
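
As a rough illustration of the independence test that Phi_K builds on, the sketch below computes Pearson's chi-squared statistic for a small contingency table in plain Python. It is not the phik implementation: the function name `chi2_statistic` and the example counts are hypothetical, and Phi_K applies further refinements to turn a statistic like this into a correlation coefficient.

```python
def chi2_statistic(table):
    """Pearson's chi-squared statistic for a 2-D table of observed counts.

    Illustrative helper only (not part of phik). Assumes every row and
    every column contains at least one observation, so that no expected
    count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up counts: rows and columns are two categorical variables.
independent = [[10, 20], [20, 40]]   # cell counts proportional to the margins
dependent = [[30, 0], [0, 30]]       # one variable fully determines the other

chi2_independent = chi2_statistic(independent)
chi2_dependent = chi2_statistic(dependent)
print(chi2_independent, chi2_dependent)
```

A statistic near zero is what independence predicts, while a large value signals dependence regardless of whether the relationship is linear, which is why a test of this kind is a natural starting point for mixed variable types.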

Warnings

Install

Imports

Quickstart

This quickstart loads a sample dataset and calculates the Phi_K correlation matrix and the corresponding significance matrix. It also shows how to generate a full correlation report; that call is commented out because it needs matplotlib and writes a PDF to the local directory.

import pandas as pd
import phik  # importing phik adds phik_matrix() and significance_matrix() to DataFrame
from phik import resources, report

# Load example data
df = pd.read_csv(resources.fixture('fake_insurance_data.csv.gz'))

# Calculate the phi_k correlation matrix
phik_corr = df.phik_matrix()
print(phik_corr.head())

# Calculate the significance matrix
significance_matrix = df.significance_matrix()
print(significance_matrix.head())

# Generate and save a correlation report (requires matplotlib)
# report.correlation_report(df, pdf_file_name='phik_report.pdf')
