FlashRank
FlashRank is a lightweight, fast Python library that adds re-ranking to existing search and retrieval pipelines. It supports both pairwise re-ranking (based on cross-encoders) and listwise re-ranking (based on LLMs). As of version 0.2.10, its main strengths are speed and efficiency, particularly on CPU, which make it suitable for cost-effective serverless deployments. The library is actively developed, with frequent updates and bug fixes.
Common errors
- [ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/ms-marco-MultiBERT-L-12/flashrank-MultiBERT-L12_Q.onnx failed:Load model /tmp/ms-marco-MultiBERT-L-12/flashrank-MultiBERT-L12_Q.onnx failed. File doesn't exist.
cause The specified model file could not be found in the expected cache directory. This often happens if the model was not downloaded successfully, the cache was cleared, or the model name or path has changed or is incorrect.
fix Ensure your `model_name` is correct and supported. If using a custom `cache_dir`, verify its path and permissions. Updating to a newer, explicitly named model (e.g., `model_name='ms-marco-MiniLM-L-12-v2'`) can often resolve this by forcing a fresh download.
- PydanticUndefinedAnnotation: name 'Ranker' is not defined
cause This error occurs when using `FlashrankRerank` with LangChain and the `Ranker` class from `flashrank` is either not imported or imported too late (after the LangChain component that references it). Pydantic attempts to validate the `FlashrankRerank` class before `Ranker` is available in the global scope.
fix Move `from flashrank.Ranker import Ranker` so that it appears *before* any import or instantiation of `FlashrankRerank` (e.g., `from langchain.retrievers.document_compressors import FlashrankRerank`).
- [WinError 2] The system cannot find the file specified: '\opt'
cause This error typically occurs on Windows when FlashRank attempts to use a cache directory path that normally exists only on Unix-like systems (such as '/opt'). Windows does not have a '/opt' directory by default.
fix Specify a valid, existing directory on your Windows system for caching models using the `cache_dir` parameter during `Ranker` initialization. For example: `ranker = Ranker(model_name="ms-marco-MiniLM-L-12-v2", cache_dir='C:/Users/YourUser/FlashRankModels')`.
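The Windows fix above comes down to handing `Ranker` a directory that exists and is writable. A minimal sketch of one way to pick such a directory with the standard library alone follows. `resolve_cache_dir` and the `flashrank-models` folder name are hypothetical, chosen for illustration, and are not part of FlashRank.

```python
import os
import tempfile
from pathlib import Path


def resolve_cache_dir(preferred=None, folder_name="flashrank-models"):
    """Return the first candidate directory that can be created and written to.

    Hypothetical helper: neither this function nor `folder_name` comes from FlashRank.
    """
    candidates = []
    if preferred is not None:
        candidates.append(Path(preferred))
    # Fall back to the platform's temp directory, which is valid on Windows and Unix.
    candidates.append(Path(tempfile.gettempdir()) / folder_name)

    for candidate in candidates:
        try:
            candidate.mkdir(parents=True, exist_ok=True)
        except OSError:
            continue  # e.g. missing drive or no permission; try the next candidate
        if os.access(candidate, os.W_OK):
            return candidate
    raise RuntimeError("no writable cache directory found")


if __name__ == "__main__":
    print(resolve_cache_dir())
```

The returned path can then be passed as `cache_dir`, for example `Ranker(model_name="ms-marco-MiniLM-L-12-v2", cache_dir=str(resolve_cache_dir()))`.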
Warnings
- gotcha FlashRank downloads models on first use and caches them. If the default cache directory is not writable or accessible, or if a specific model path is incorrectly configured, this can lead to 'NO_SUCHFILE' errors. Ensure appropriate permissions for the cache directory, or explicitly set `cache_dir` in the Ranker initialization.
- breaking When integrating with LangChain's `FlashrankRerank` compressor, a `PydanticUndefinedAnnotation: name 'Ranker' is not defined` error can occur if `flashrank.Ranker` is not imported before `langchain.retrievers.document_compressors.FlashrankRerank`. This is due to Pydantic's validation order.
- gotcha Default models or older specified models may become unavailable or get superseded, leading to model loading failures (e.g., `ONNXRuntimeError: NO_SUCHFILE`). This often happens without explicit code changes on the user's part.
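When a `NO_SUCHFILE` error appears, it can help to check what is actually in the cache before re-initializing. The sketch below assumes the layout implied by the error path shown under Common errors, that is `<cache_dir>/<model_name>/*.onnx`. `cached_onnx_files` is a hypothetical helper, and the on-disk layout may differ between FlashRank versions.

```python
from pathlib import Path


def cached_onnx_files(cache_dir, model_name):
    """List the names of .onnx files under <cache_dir>/<model_name>.

    Assumed layout, inferred from the NO_SUCHFILE error path; returns []
    when the model directory is missing or holds no .onnx file.
    """
    model_dir = Path(cache_dir) / model_name
    if not model_dir.is_dir():
        return []
    return sorted(p.name for p in model_dir.glob("*.onnx"))


if __name__ == "__main__":
    files = cached_onnx_files("/tmp", "ms-marco-MiniLM-L-12-v2")
    print(files if files else "no cached .onnx file found")
```

An empty result points to an incomplete download, a cleared cache, or a model name that no longer matches the cached directory.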
Install
- pip install flashrank
- pip install flashrank[listwise]
Imports
- Ranker
from flashrank.Ranker import Ranker
- RerankRequest
from flashrank import RerankRequest
Quickstart
from flashrank import Ranker, RerankRequest
# Initialize the ranker with the default model or a named one.
# 'ms-marco-TinyBERT-L-2-v2' (~4MB) is the default and the fastest.
# 'ms-marco-MiniLM-L-12-v2' (~34MB) is slower but gives better ranking quality.
# For LLM-based listwise re-ranking, use 'rank_zephyr_7b_v1_full' (requires flashrank[listwise]).
ranker = Ranker(model_name="ms-marco-MiniLM-L-12-v2")
query = "What is the capital of France?"
passages = [
    {"id": "1", "text": "Paris is the capital and most populous city of France."},
    {"id": "2", "text": "The Eiffel Tower is in Paris."},
    {"id": "3", "text": "Berlin is the capital of Germany."},
]
# Create a RerankRequest object
rerank_request = RerankRequest(query=query, passages=passages)
# Perform reranking
results = ranker.rerank(rerank_request)
# Print the re-ranked results (sorted by score in descending order)
for result in results:
    print(f"ID: {result['id']}, Text: {result['text']}, Score: {result['score']:.4f}")
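Because each result is a dictionary carrying `id`, `text` and `score`, downstream filtering needs no FlashRank-specific code. The sketch below keeps the top `k` passages above a score cutoff. `top_passages`, `k` and `min_score` are hypothetical names, and the scores in `sample` are placeholders rather than real model output.

```python
def top_passages(results, k=3, min_score=0.0):
    """Return up to k results with score >= min_score, best first."""
    # Sort defensively so the helper also works on unsorted input.
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [r for r in ranked if r["score"] >= min_score][:k]


# Placeholder scores for illustration only; not actual model output.
sample = [
    {"id": "2", "text": "The Eiffel Tower is in Paris.", "score": 0.40},
    {"id": "1", "text": "Paris is the capital and most populous city of France.", "score": 0.95},
    {"id": "3", "text": "Berlin is the capital of Germany.", "score": 0.05},
]

for item in top_passages(sample, k=2, min_score=0.1):
    print(item["id"], item["score"])
```

A sensible cutoff depends on the model, so `min_score` is best tuned on your own queries rather than copied from this example.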