{"id":8289,"library":"llama-index-retrievers-bm25","title":"LlamaIndex BM25 Retriever","description":"This library provides the BM25Retriever integration for LlamaIndex, enabling efficient keyword-based retrieval of documents. It is part of the modular LlamaIndex ecosystem (v0.10.0+) and is released as a separate package. The current version is 0.7.1, with updates typically aligning with LlamaIndex core releases.","status":"active","version":"0.7.1","language":"en","source_language":"en","source_url":"https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/retrievers/llama-index-retrievers-bm25","tags":["LlamaIndex","retriever","BM25","keyword search","information retrieval","NLP"],"install":[{"cmd":"pip install llama-index-retrievers-bm25","lang":"bash","label":"Install package"}],"dependencies":[{"reason":"Provides the core LlamaIndex abstractions this integration builds on, such as `Document`, nodes, and indexes. It is a required dependency of this package.","package":"llama-index-core","optional":false}],"imports":[{"note":"The import path for `BM25Retriever` changed with LlamaIndex v0.10.0 and the move to modular integration packages. The old path applies only to LlamaIndex < v0.10.0.","wrong":"from llama_index.indices.query.retrievers.bm25_retriever import BM25Retriever","symbol":"BM25Retriever","correct":"from llama_index.retrievers.bm25 import BM25Retriever"}],"quickstart":{"code":"from llama_index.core import SimpleDirectoryReader\nfrom llama_index.core.node_parser import SentenceSplitter\nfrom llama_index.retrievers.bm25 import BM25Retriever\nimport os\n\n# Create a dummy data directory with two small files for demonstration\nos.makedirs('data', exist_ok=True)\nwith open('data/dogs.txt', 'w') as f:\n    f.write('The quick brown fox jumps over the lazy dog. Dogs are often lazy.')\nwith open('data/cats.txt', 'w') as f:\n    f.write('Cats are also animals, but they are not mentioned here.')\n\n# Load documents (one Document per file)\ndocuments = SimpleDirectoryReader(\n    input_files=[\"data/dogs.txt\", \"data/cats.txt\"]\n).load_data()\n\n# from_defaults has no `documents` parameter, so parse the documents into nodes first\nnodes = SentenceSplitter().get_nodes_from_documents(documents)\n\n# Initialize the BM25 retriever from the nodes\nretriever = BM25Retriever.from_defaults(\n    nodes=nodes,\n    similarity_top_k=2\n)\n\n# Retrieve nodes based on a query\nresults = retriever.retrieve(\"What animal is lazy?\")\n\nfor result in results:\n    print(f\"Content: {result.get_content()}\\nScore: {result.get_score()}\\n---\")\n\n# Clean up dummy files\nos.remove('data/dogs.txt')\nos.remove('data/cats.txt')\nos.rmdir('data')\n","lang":"python","description":"This example loads two small text files with `SimpleDirectoryReader`, splits them into nodes with `SentenceSplitter`, builds a `BM25Retriever` from those nodes, and runs a retrieval query. Two files are used so that the corpus holds at least as many nodes as `similarity_top_k`."},"warnings":[{"fix":"Ensure `llama-index-core` (or `llama-index`) is v0.10.0 or newer. If on an older version, use the monolithic `llama-index` package and its internal BM25 implementation (if available).","message":"LlamaIndex core underwent a major refactor in v0.10.0, moving integrations like BM25 into separate packages and changing core APIs. This `llama-index-retrievers-bm25` package is designed for LlamaIndex v0.10.0 and newer.","severity":"breaking","affected_versions":"<0.10.0 of llama-index core"},{"fix":"Pass exactly one of `nodes=my_nodes`, `docstore=my_index.docstore`, or `index=my_index` to `from_defaults`. To start from `Document` objects, split them into nodes first, for example with `SentenceSplitter().get_nodes_from_documents(documents)`. An index backed by an external vector store may have an empty docstore; in that case pass `nodes` explicitly.","message":"`BM25Retriever.from_defaults` builds its corpus from `nodes`, a `docstore`, or an `index` (whose `docstore` it reads). It has no `documents` parameter, so a list of `Document` objects cannot be passed directly.","severity":"gotcha","affected_versions":"All versions of `llama-index-retrievers-bm25`"},{"fix":"Install or upgrade both packages together in a clean environment, for example `pip install -U llama-index-core llama-index-retrievers-bm25`, and remove any leftover pre-0.10 `llama-index` install.","message":"`llama-index-core` is declared as a required dependency, so pip normally installs it together with this package. A `ModuleNotFoundError` for core LlamaIndex components usually points to a broken or mixed environment rather than a missing install step.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Run `pip install llama-index-retrievers-bm25`. If already installed, ensure your `llama-index-core` is v0.10.0+.","cause":"The `llama-index-retrievers-bm25` package is not installed, or you are trying to use it with an older `llama-index` core version that had a different module structure.","error":"ModuleNotFoundError: No module named 'llama_index.retrievers.bm25'"},{"fix":"Pass `nodes=my_nodes` or `docstore=my_docstore` explicitly, or pass an index whose `docstore` is populated. There is no `documents` parameter.","cause":"`BM25Retriever.from_defaults(index=my_index)` reads `my_index.docstore`. This error means the object passed as `index` does not expose a `docstore`, which may happen when it is not a standard LlamaIndex index or when the installed `llama-index-*` packages are mismatched.","error":"AttributeError: 'VectorStoreIndex' object has no attribute 'docstore'"},{"fix":"Remove the `service_context` argument. `BM25Retriever` needs neither an LLM nor an embedding model. For components that do, configure them through `Settings` from `llama_index.core` or pass them directly where the component accepts them.","cause":"Attempting to pass `service_context` to `from_defaults`, which does not accept it. LlamaIndex v0.10.0 deprecated `ServiceContext` in favor of the global `Settings` object.","error":"TypeError: from_defaults() got an unexpected keyword argument 'service_context'"}]}