ColPali Engine
Version 0.3.15 · verified Mon Apr 27
ColPali Engine is a library for training and running inference with the ColPali architecture, a multimodal retrieval model based on vision-language models. It supports document indexing and retrieval using late interaction over image embeddings. Current version is 0.3.15, with active development and periodic releases.
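The late-interaction scoring mentioned above can be illustrated without any library API: each query token embedding is matched against its best document token embedding, and the per-token maxima are summed (the MaxSim rule). A minimal pure-Python sketch with made-up toy vectors, not real model outputs:

```python
# Late-interaction (MaxSim) scoring sketch: score(query, doc) is the sum,
# over query token embeddings, of the best dot product against any
# document token embedding. Toy 2-d vectors for illustration only.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_tokens, doc_tokens):
    # For each query token, take the maximum similarity over all
    # document tokens, then sum those maxima.
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query token embeddings
doc = [[1.0, 0.0], [0.5, 0.5]]     # two document token embeddings
print(maxsim_score(query, doc))    # -> 1.5 (best matches: 1.0 + 0.5)
```

In the real model the embeddings come from image patches and query tokens, but the scoring reduction is the same.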
pip install colpali-engine
Common errors
error ModuleNotFoundError: No module named 'colpali_engine' ↓
cause The package has not been installed or is installed under the wrong name (e.g., 'colpali' instead of 'colpali-engine').
fix
Install via 'pip install colpali-engine' and not 'pip install colpali'.
error AttributeError: module 'colpali_engine' has no attribute 'ColPaliModel' ↓
cause Importing from the wrong top-level module; ColPaliModel is in 'colpali_engine.models'.
fix
Use 'from colpali_engine.models import ColPaliModel'.
error ValueError: The model 'vidore/colpali-v1.2' does not have a processor class ↓
cause The model name is outdated or the processor is incompatible; ensure the model and processor come from the same Hugging Face repo.
fix
Use 'processor = ColPaliProcessor.from_pretrained(model_name)' where model_name is from Hugging Face.
error RuntimeError: CUDA out of memory ↓
cause The model requires more GPU memory than available; batch size too large or using full precision.
fix
Reduce batch size, use 'torch_dtype=torch.float16' or 'torch.bfloat16', or offload to CPU with 'device_map="auto"'.
Warnings
breaking Version 0.3.0 removed the 'ColPali' class and renamed the model class to 'ColPaliModel'. Code using 'from colpali_engine import ColPali' will break. ↓
fix Use 'from colpali_engine.models import ColPaliModel' instead.
deprecated The method 'ColPaliModel.forward()' has been deprecated in favor of directly calling the model object (__call__) or using 'model.generate()' for generation tasks. ↓
fix Replace 'model.forward(inputs)' with 'model(inputs)' or 'model.generate(inputs)' for text generation.
gotcha GPU vs CPU: ColPali models require significant GPU memory. Running on CPU may be extremely slow. Always check device availability. ↓
fix Use 'device_map="cuda"' if GPU available, or set 'device_map="auto"' for automatic mapping.
gotcha The processor expects images in PIL format or file paths. Passing raw numpy arrays may cause errors. ↓
fix Ensure images are loaded via PIL.Image.open() or processor.image_processor.convert_to_rgb() before processing.
Imports
- ColPaliModel
  wrong: import colpali_engine
  correct: from colpali_engine.models import ColPaliModel
- ColPaliProcessor
  wrong: from colpali_engine.processor import ColPaliProcessor
  correct: from colpali_engine.models import ColPaliProcessor
- ColPaliRetriever
  wrong: from colpali_engine import ColPaliRetriever
  correct: from colpali_engine.retrieval import ColPaliRetriever
Quickstart
import torch
from PIL import Image
from colpali_engine.models import ColPaliModel, ColPaliProcessor
from colpali_engine.retrieval import ColPaliRetriever

model_name = "vidore/colpali-v1.2"
model = ColPaliModel.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
)
processor = ColPaliProcessor.from_pretrained(model_name)

# Example documents: page images loaded as PIL images (the processor
# expects PIL images or file paths, not raw strings -- see the gotcha above)
docs = [
    Image.open("page_with_diagram.png"),
    Image.open("page_with_text.png"),
]
query = "ColPali architecture"

# Embed each document page
doc_embeddings = []
for doc in docs:
    processed = processor.process_images([doc])
    with torch.no_grad():
        embeddings = model(**processed.to(model.device))
    doc_embeddings.append(embeddings)
# Index with retriever
retriever = ColPaliRetriever(model, processor)
retriever.index(doc_embeddings)
# Search
results = retriever.search(query, k=2)
print(results)
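Independently of the retriever class, the search step reduces to scoring the query embeddings against every indexed page and keeping the top k. A pure-Python sketch of that ranking, using the MaxSim rule on toy embeddings (no library API assumed):

```python
# Ranking sketch: score a query against every indexed page with MaxSim,
# then return the indices of the k best pages. Toy data only.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_tokens, doc_tokens):
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

def top_k(query_tokens, indexed_docs, k=2):
    # indexed_docs: one list of token embeddings per page.
    scored = [(maxsim_score(query_tokens, d), i)
              for i, d in enumerate(indexed_docs)]
    scored.sort(reverse=True)          # highest score first
    return [i for _, i in scored[:k]]

query = [[1.0, 0.0]]
pages = [[[0.0, 1.0]], [[1.0, 0.0]], [[0.5, 0.5]]]
print(top_k(query, pages, k=2))        # -> [1, 2]
```

Real deployments batch this on GPU, but the reduction (score every page, sort, truncate to k) is the same.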