CLIP Benchmark
CLIP-benchmark is a Python library designed to evaluate CLIP-like models on a standard set of datasets for various tasks, including zero-shot classification, zero-shot retrieval, linear probing, and captioning. It supports models like OpenCLIP, Japanese CLIP, and NLLB CLIP, and integrates with datasets from torchvision, TensorFlow datasets, and VTAB. The library is currently active, with version 1.6.2, and focuses on reproducible evaluation results.
Common errors
- command 'clip_benchmark' not found
  - cause: The `clip_benchmark` command-line tool is not on your shell's PATH. This usually happens when pip's script directory is not included in PATH, or the installation was incomplete.
  - fix: Ensure the package is installed correctly (`pip install clip-benchmark`) and that your shell's PATH includes the directory where pip installs executables (e.g., `~/.local/bin` on Linux/macOS, or the `Scripts` folder of your Python installation on Windows).
- ModuleNotFoundError: No module named 'open_clip_torch'
  - cause: You are attempting to benchmark an OpenCLIP model, but the `open_clip_torch` package, which provides the OpenCLIP implementation, is not installed.
  - fix: Install the dependency: `pip install open_clip_torch` (note that the package installs its module under the name `open_clip`).
- Failed to load dataset 'tfds/imagenet_v2': module 'tensorflow_datasets' not found
  - cause: You are trying to use a TensorFlow dataset (e.g., `tfds/imagenet_v2`) but the `tensorflow-datasets` library is not installed.
  - fix: Install the required packages: `pip install tensorflow-datasets tfds-nightly timm`.
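If the `clip_benchmark` command is installed but not found, the usual culprit is pip's script directory missing from PATH. A quick way to locate and add it on Linux/macOS (the `~/.local` location is typical for `pip install --user`, but verify on your system):

```shell
# Print the base of the per-user install tree; executables live under its bin/ directory.
python3 -m site --user-base        # e.g. /home/user/.local
# Add that bin/ directory to PATH for the current session.
export PATH="$HOME/.local/bin:$PATH"
# To make this permanent, append the export line to ~/.bashrc or ~/.zshrc.
```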
Warnings
- gotcha The `--dataset_root` and `--output` arguments support templating (e.g., `wds_{dataset_cleaned}`). Ensure you understand the templating syntax when specifying paths to avoid unexpected file locations or dataset loading issues.
- gotcha When working with WebDatasets, the conversion process may require specific tools (e.g., `webdataset` utilities) and manual uploading to platforms like Hugging Face Hub, which is not fully automated by the library itself.
- gotcha Some dataset types (e.g., TensorFlow Datasets, VTAB) require additional installations beyond `clip-benchmark` itself. For instance, TensorFlow Datasets may need `tfds-nightly` and `timm`, while VTAB requires its dedicated package.
- gotcha When adding support for new custom CLIP models, you must define a specific model loading function and integrate it into `clip_benchmark/models/__init__.py`'s `TYPE2FUNC` mapping. This requires understanding the internal model loading mechanism.
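The registration pattern for custom models can be sketched as follows. This is an illustrative shape only: the loader name is hypothetical, and the exact signature expected by `TYPE2FUNC` may differ, so check `clip_benchmark/models/__init__.py` for the real contract.

```python
# Sketch of the model-registration pattern: a loader function returns a
# (model, transform, tokenizer) triple and is registered under a type name.
# The signature and return shape here are assumptions for illustration.

def load_my_clip(model_name: str, pretrained: str, device: str = "cpu"):
    # Hypothetical loader for a custom CLIP variant.
    model = f"<{model_name} weights: {pretrained}>"  # stand-in for the real model object
    transform = lambda image: image                  # stand-in image preprocessing
    tokenizer = lambda texts: texts                  # stand-in text tokenizer
    return model, transform, tokenizer

# Mirrors the TYPE2FUNC mapping: model type name -> loading function.
TYPE2FUNC = {
    "my_clip": load_my_clip,
}

model, transform, tokenizer = TYPE2FUNC["my_clip"]("MyViT-B-32", "my_checkpoint")
```

Once registered, the type name would be selectable when running evaluations, with the library dispatching through the mapping to your loader.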
Install
```bash
pip install clip-benchmark
```
Imports
- `run_benchmark`

```python
from clip_benchmark.cli import run_benchmark
```
Quickstart
```bash
clip_benchmark eval --dataset=cifar10 --task=zeroshot_classification --pretrained=laion400m_e32 --model=ViT-B-32-quickgelu --output=result.json --batch_size=64
```
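The run above writes its metrics to `result.json`, which can then be inspected programmatically. The layout shown here (top-level `dataset`/`model`/`pretrained`/`task` keys plus a `metrics` dict) is an assumption based on typical CLIP-benchmark output; verify it against your own result file.

```python
import json

# Hypothetical result file mimicking what `clip_benchmark eval` writes;
# the keys and metric names below are assumptions for illustration.
sample = {
    "dataset": "cifar10",
    "model": "ViT-B-32-quickgelu",
    "pretrained": "laion400m_e32",
    "task": "zeroshot_classification",
    "metrics": {"acc1": 0.903, "acc5": 0.992},
}
with open("result.json", "w") as f:
    json.dump(sample, f)

# Load the results back and report the headline metric.
with open("result.json") as f:
    result = json.load(f)
print(f"{result['model']} on {result['dataset']}: top-1 {result['metrics']['acc1']:.1%}")
# -> ViT-B-32-quickgelu on cifar10: top-1 90.3%
```

Collecting many such files (e.g., one per dataset/model pair) into a table is a common way to compare checkpoints across benchmarks.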