Lunr.py
Lunr.py is a Python implementation of Lunr.js, a lightweight full-text search library designed for client-side search. It enables developers to create search indexes from Python data structures, often for serialization and consumption by a JavaScript frontend. The library is actively maintained, with its current version being 0.8.0, and targets close compatibility with the original Lunr.js implementation.
Common errors
- NameError: name 'lunr' is not defined
  cause: The `lunr` function, the main entry point for creating an index, was not imported from the `lunr` package.
  fix: Add `from lunr import lunr` at the top of your script.
- TypeError: 'builtin_function_or_method' object is not subscriptable
  cause: A function or method was subscripted instead of being called, or the return value of `idx.search()` was treated as a single dictionary rather than a list of result dictionaries.
  fix: Call `lunr` with arguments to create the index (`idx = lunr(...)`), then call `idx.search(...)`. The search method returns a list of dictionaries, so iterate with `for result in results:` before accessing `result['ref']` and similar keys.
- Unexpected search results (e.g., too many results or missing specific terms)
  cause: Misunderstanding the default OR logic of Lunr search queries, or incorrect use of the term presence modifiers (`+` for required, `-` for prohibited).
  fix: By default, Lunr combines query terms with logical OR. To require a term, prefix it with `+` (e.g., `'+term1 term2'`). To prohibit a term, prefix it with `-` (e.g., `'term1 -term2'`).
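The presence rules above can be sketched with a toy model in plain Python. This is not Lunr's parser or scorer: `matches`, `toy_search`, and the `DOCS` term sets are hypothetical names made up for illustration, and the model ignores stemming, field boosts, and ranking. It only shows which documents a query would keep or drop under OR, `+`, and `-`.

```python
from typing import Dict, List, Set


def matches(query: str, doc_terms: Set[str]) -> bool:
    """Toy presence check: '+' marks required terms, '-' prohibited, bare terms optional."""
    required, prohibited, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+"):
            required.append(token[1:])
        elif token.startswith("-"):
            prohibited.append(token[1:])
        else:
            optional.append(token)

    # Any prohibited term present rules the document out.
    if any(term in doc_terms for term in prohibited):
        return False
    # Every required term must be present.
    if not all(term in doc_terms for term in required):
        return False
    # Once required terms are satisfied, optional terms no longer decide inclusion.
    # A query with no positive terms keeps every non-excluded document in this toy model.
    if required or not optional:
        return True
    # Only optional terms: logical OR, so one hit is enough.
    return any(term in doc_terms for term in optional)


def toy_search(query: str, docs: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted refs of all documents that pass the presence check."""
    return sorted(ref for ref, terms in docs.items() if matches(query, terms))


# Hand-picked term sets standing in for three indexed documents.
DOCS = {
    "1": {"alice", "sister", "book"},
    "2": {"kitten", "white", "black"},
    "3": {"snark", "bellman", "crew"},
}

either = toy_search("alice kitten", DOCS)     # OR: either term is enough
required = toy_search("+alice kitten", DOCS)  # 'alice' must be present
excluded = toy_search("snark -alice", DOCS)   # 'alice' must be absent
print(either, required, excluded)
```

In this model, adding `+` narrows the result set while a bare second term only widens it, which is the usual source of "too many results" surprises.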
Warnings
- breaking: Version 0.7.0 dropped support for Python 3.6. Users on older Python versions must upgrade to Python 3.7+ or use an earlier Lunr.py version.
- gotcha: The maintainers describe the Lunr.py API as being in 'alpha stage' and 'likely to change'.
- gotcha: The optional `lunr[languages]` feature for non-English stemming relies on NLTK and currently does not guarantee full compatibility with the JavaScript Lunr.js index format or search results.
- gotcha: Lunr stores its inverted index entirely in memory. For very large document corpora this can consume significant RAM, and the index may need to be rebuilt or re-read at each application startup, which affects startup performance.
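One way to soften the startup cost mentioned in the last item is to cache the serialized index on disk and rebuild only when the cache is missing. The sketch below is a generic, standard-library helper and not part of Lunr.py: `load_or_build`, `fake_build`, and the cache file name are hypothetical, and the stand-in builder returns a small dictionary instead of a real index. It does not reduce memory use, since the loaded index still lives in RAM.

```python
import json
import os
import tempfile


def load_or_build(path, build):
    """Return JSON-serializable index data from `path`, building and saving it on a cache miss."""
    if os.path.exists(path):
        # Cache hit: re-read the serialized data instead of re-indexing.
        with open(path, encoding="utf-8") as fd:
            return json.load(fd)
    # Cache miss: build once, then persist for the next startup.
    data = build()
    with open(path, "w", encoding="utf-8") as fd:
        json.dump(data, fd)
    return data


calls = []


def fake_build():
    """Stand-in for an expensive indexing step (assumption: returns a JSON-friendly dict)."""
    calls.append(1)
    return {"version": "demo", "fields": ["title", "body"]}


with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "index.json")
    first = load_or_build(cache_path, fake_build)   # builds and writes the cache
    second = load_or_build(cache_path, fake_build)  # served from the cache file

print(len(calls), first == second)
```

With Lunr.py, `build` could return `lunr(...).serialize()`, and the loaded dictionary could be passed to `Index.load` from `lunr.index` to restore a searchable index. Since the API is described as likely to change, check those calls against the installed version.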
Install
- pip install lunr
- pip install lunr[languages]
Imports
- lunr
  from lunr import lunr
  A bare `import lunr` binds the package rather than the function, so calling `lunr(...)` afterwards fails.
Quickstart
from lunr import lunr
documents = [
    {
        "id": "1",
        "title": "Alice's Adventures in Wonderland",
        "body": "Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, 'and what is the use of a book,' thought Alice 'without pictures or conversation?'"
    },
    {
        "id": "2",
        "title": "Through the Looking-Glass",
        "body": "One thing was certain, that the white kitten had had nothing to do with it: it was the black kitten's fault entirely."
    },
    {
        "id": "3",
        "title": "The Hunting of the Snark",
        "body": "'Just the place for a Snark!' the Bellman cried, As he landed his crew with a thump and a shake. 'Just the place for a Snark! I have sought it for years!'"
    }
]
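# Build the index: 'id' identifies each document, 'title' and 'body' are searchable.
idx = lunr(ref='id', fields=('title', 'body'), documents=documents)

# Default search: terms are combined with logical OR.
results = idx.search("Alice sister")
for result in results:
    print(f"Document Ref: {result['ref']}, Score: {result['score']}")

# Presence modifiers: require 'snark' and prohibit 'alice'.
results_exact = idx.search("+snark -alice")
print("\nExact search results for '+snark -alice':")
for result in results_exact:
    print(f"Document Ref: {result['ref']}, Score: {result['score']}")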
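The Quickstart prints only each result's `ref` and `score`, so showing a title means looking the document up by its ref. A minimal sketch of that lookup follows. `attach_documents` is a hypothetical helper, and `sample_results` is hand-written in the shape the Quickstart reads (a list of dictionaries with `ref` and `score`). The scores are illustrative values, not real Lunr output.

```python
def attach_documents(results, documents, ref="id"):
    """Pair each search result with its source document, preserving result order."""
    # Index the original documents by their ref field for O(1) lookup.
    by_ref = {str(doc[ref]): doc for doc in documents}
    # Skip results whose ref is unknown, e.g. when the index is newer than the data.
    return [(by_ref[r["ref"]], r["score"]) for r in results if r["ref"] in by_ref]


documents = [
    {"id": "1", "title": "Alice's Adventures in Wonderland"},
    {"id": "2", "title": "Through the Looking-Glass"},
    {"id": "3", "title": "The Hunting of the Snark"},
]

# Illustrative stand-in for the list returned by idx.search(...); scores are made up.
sample_results = [
    {"ref": "3", "score": 1.0},
    {"ref": "1", "score": 0.5},
]

titles = [doc["title"] for doc, _score in attach_documents(sample_results, documents)]
print(titles)
```

Building the `by_ref` dictionary once and reusing it across searches avoids rescanning the document list for every result.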