conKurrence

AI evaluation toolkit: measure inter-rater agreement (Fleiss' κ, Kendall's W) across multiple LLM providers
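For readers unfamiliar with the two statistics named above, here is a minimal Python sketch of how they are conventionally computed. It is not conKurrence's implementation: the function names `fleiss_kappa` and `kendalls_w` and the input layouts are assumptions chosen for illustration, and the Kendall's W version assumes complete rankings with no ties.

```python
import math
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical labels.

    ratings: one list per item, holding one label per rater.
    Every item must be labelled by the same number of raters (>= 2).
    Hypothetical helper, not part of conKurrence.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(labels) != n_raters for labels in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    per_item_agreement = []
    for labels in ratings:
        counts = Counter(labels)
        category_totals.update(counts)
        # Fraction of rater pairs that agree on this item.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_item_agreement.append(agreeing_pairs / (n_raters * (n_raters - 1)))

    observed = sum(per_item_agreement) / n_items
    total = n_items * n_raters
    # Agreement expected by chance from the overall category proportions.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        return math.nan  # only one category was ever used; kappa is undefined
    return (observed - expected) / (1 - expected)


def kendalls_w(rankings):
    """Kendall's W for complete rankings without ties.

    rankings: one list per rater, giving ranks 1..n for the same n items.
    Hypothetical helper, not part of conKurrence.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((x - mean_rank_sum) ** 2 for x in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Three raters (for example, three LLM providers) labelling two items.
    print(fleiss_kappa([["a", "a", "a"], ["b", "b", "b"]]))  # 1.0
    # Two raters ranking three items in exactly opposite order.
    print(kendalls_w([[1, 2, 3], [3, 2, 1]]))  # 0.0
```

Both functions return 1.0 for perfect agreement. Fleiss' κ can go negative when raters agree less often than chance would predict, while Kendall's W stays between 0 and 1.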

Install

npx conkurrence

Tools · 7

conkurrence_run: Execute an evaluation across multiple AI raters
conkurrence_report: Generate a detailed markdown report
conkurrence_compare: Side-by-side comparison of two runs
conkurrence_trend: Track agreement over multiple runs
conkurrence_suggest: AI-powered schema suggestion from your data
conkurrence_validate_schema: Validate a schema before running
conkurrence_estimate: Estimate cost and token usage

Links

GitHub: github.com/AlligatorC0der/conkurrence