LM Evaluation Harness
LM Evaluation Harness (lm-eval) is a framework for evaluating language models on a wide range of benchmarks and tasks. It supports multiple model backends, including HuggingFace, vLLM, and SGLang, and provides a standardized way to compare model performance. The current version is 0.4.11. Releases are frequent: minor updates ship often, and some include breaking changes.
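As a rough illustration of what a standardized evaluation run looks like, the sketch below assembles a command line for the `lm_eval` CLI. The helper `build_lm_eval_command` is hypothetical, and the model and task names are only examples. The flags shown (`--model`, `--model_args`, `--tasks`, `--batch_size`, `--output_path`) are the ones documented for the 0.4.x series; given the breaking changes noted above, check them against the installed version before relying on them.

```python
import shlex


def build_lm_eval_command(pretrained, tasks, backend="hf", batch_size=8, output_path=None):
    """Assemble an lm_eval CLI invocation as an argument list (hypothetical helper).

    Flag names are assumed from the 0.4.x CLI and may differ in other releases.
    """
    if not tasks:
        raise ValueError("at least one task name is required")
    cmd = [
        "lm_eval",
        "--model", backend,                           # backend identifier, e.g. "hf" or "vllm"
        "--model_args", f"pretrained={pretrained}",   # model checkpoint to load
        "--tasks", ",".join(tasks),                   # comma-separated task names
        "--batch_size", str(batch_size),
    ]
    if output_path is not None:
        cmd += ["--output_path", output_path]         # where results are written
    return cmd


# Example model and tasks; substitute your own.
cmd = build_lm_eval_command("EleutherAI/pythia-160m", ["hellaswag", "arc_easy"])
print(shlex.join(cmd))
```

Building the command as a list keeps it safe to pass to `subprocess.run(cmd, check=True)` without shell quoting issues. Switching backends then only means changing the `backend` argument.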
API endpoints
full doc /v1/registry/lm-eval
install /v1/registry/lm-eval/install
compatibility /v1/registry/lm-eval/compatibility
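The endpoint paths above are relative, so a client needs to resolve them against the registry's host. The sketch below shows one way to do that. `BASE_URL` is a placeholder, since the page does not state the host, and the `ENDPOINTS` dictionary and `endpoint_url` helper are hypothetical names introduced for illustration.

```python
from urllib.parse import urljoin

# Placeholder host: the actual registry host is not given on this page.
BASE_URL = "https://registry.example.com"

# Paths copied from the endpoint list above.
ENDPOINTS = {
    "full_doc": "/v1/registry/lm-eval",
    "install": "/v1/registry/lm-eval/install",
    "compatibility": "/v1/registry/lm-eval/compatibility",
}


def endpoint_url(name, base_url=BASE_URL):
    """Resolve a named endpoint path against the base URL."""
    return urljoin(base_url, ENDPOINTS[name])


for name in ENDPOINTS:
    print(name, endpoint_url(name))
```

The response format of these endpoints is not specified here, so the sketch stops at building the URL rather than parsing a response.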