Berkeley Function Calling Leaderboard Evaluation

2026.3.23 · active · verified Thu Apr 16

bfcl-eval is the Python library for the Berkeley Function Calling Leaderboard (BFCL), a benchmark that evaluates Large Language Models (LLMs) on their ability to call functions (use tools) correctly. It provides the evaluation pipeline and datasets, including support for multi-step and multi-turn function calls as of the V3 release. The library is actively maintained with frequent updates; the current PyPI release is 2026.3.23.
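Conceptually, BFCL's non-executable categories score a model by comparing the function call the model emits against a ground-truth call, matching the function name and argument values rather than the raw string. The snippet below is a simplified sketch of that idea using Python's `ast` module; it is not the library's internal checker, and the function name and signature are illustrative.

```python
import ast

def call_matches(model_output: str, expected_name: str, expected_args: dict) -> bool:
    """Sketch of AST-style call matching: parse the model's emitted call and
    compare the function name and keyword-argument values to ground truth."""
    try:
        tree = ast.parse(model_output.strip(), mode="eval")
    except SyntaxError:
        return False  # not a parseable call at all
    call = tree.body
    if not isinstance(call, ast.Call):
        return False
    if not (isinstance(call.func, ast.Name) and call.func.id == expected_name):
        return False
    # Evaluate keyword arguments as Python literals so 'celsius' == "celsius", etc.
    try:
        got = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    except ValueError:
        return False  # non-literal argument values
    return got == expected_args

# A matching call passes; a wrong function name or argument value fails.
call_matches("get_weather(city='Berlin', unit='celsius')",
             "get_weather", {"city": "Berlin", "unit": "celsius"})
```

This kind of structural comparison is why the benchmark is robust to formatting differences (quote style, argument order) that a plain string comparison would penalize.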

Install

pip install bfcl-eval

Imports

from bfcl_eval.eval_pipeline import eval_handler

Quickstart

This quickstart demonstrates how to run an evaluation programmatically with `bfcl-eval`. It constructs an `argparse.Namespace` that mimics the command-line arguments expected by `eval_handler.run_eval`, specifying the dataset, model, and other evaluation parameters. Note that for commercial models such as 'gpt-4o', an API key (e.g., `OPENAI_API_KEY`) must be set as an environment variable.

import argparse
import os
from bfcl_eval.eval_pipeline import eval_handler

# Set your OpenAI API key as an environment variable
# For testing, you might use a placeholder, but for actual runs, it's required.
os.environ['OPENAI_API_KEY'] = os.environ.get('OPENAI_API_KEY', 'YOUR_OPENAI_API_KEY_HERE')

# Create a Namespace object to simulate command-line arguments
# These are common arguments required by the `run_eval` method.
args = argparse.Namespace(
    dataset_name='multi_step_9-8-0', # Example dataset, check docs for available ones
    model_name='gpt-4o',          # Model to evaluate, e.g., 'gpt-4o', 'gemini-1.5-pro'
    num_gpus=0,                   # Set to 0 for CPU execution
    batch_size=1,
    num_eval_prompts=1,           # Number of prompts to evaluate (for quick test)
    output_dir='./bfcl_results',  # Directory to save results
    api_key=os.environ['OPENAI_API_KEY'], # Passed via args or env var
    temp=0.7,
    top_p=1.0,
    max_tokens=2000,
    system_prompt_path=None,
    eval_mode='full',
    eval_version='v3',            # Refers to the benchmark version (V1, V2, V3)
    enable_tool_code_execution=False, # Set to True to enable code execution (requires sandboxing)
    enable_parallel=False,
    num_threads=1,
    live_data=False
)

print(f"Starting BFCL evaluation for dataset '{args.dataset_name}' with model '{args.model_name}'...")

try:
    # Run the evaluation pipeline
    results = eval_handler.run_eval(args)
    print("\nEvaluation Complete!")
    print("Results:")
    print(results)
except Exception as e:
    print(f"\nAn error occurred during evaluation: {e}")
    # The env var is always set above, so check for the placeholder value instead.
    if os.environ.get('OPENAI_API_KEY', '') in ('', 'YOUR_OPENAI_API_KEY_HERE'):
        print("Please ensure your OPENAI_API_KEY environment variable is set to a real key.")
    print("Check the dataset name, model name, and API key configurations.")
