SQLite FTS4
sqlite-fts4 is a Python library of custom SQLite functions for ranking and decoding full-text search (FTS4) results. It provides `rank_score`, `rank_bm25`, `decode_matchinfo`, and `annotate_matchinfo`, which work on the output of `matchinfo()` from SQLite's built-in FTS4 extension. The current version is 1.0.3. The library is actively maintained with a focus on stability and compatibility, including fixes for specific issues such as big-endian system support.
Warnings
- gotcha The `rank_score()` and `rank_bm25()` functions require specific `matchinfo` format strings ('pcx' for `rank_score`, 'pcnalx' for `rank_bm25`) to return correct results. Using an incorrect format string can lead to inaccurate scores or math domain errors.
- gotcha SQLite FTS query syntax can be complex and easily lead to errors if user-provided search strings are not carefully handled. Exposing raw FTS operators to users without validation or a custom query language can result in unexpected behavior or exceptions.
- gotcha Prior to version 0.5.2, calling `matchinfo()` without a `MATCH` clause in the query raised an error. From version 0.5.2 onwards, this case fails silently, which may change behavior for applications that relied on the error.
- gotcha The Python `sqlite3` module must be compiled with FTS4 support in the underlying SQLite library for `sqlite-fts4` to function. If your Python environment's SQLite is not compiled with FTS4, attempts to create FTS4 tables will result in a 'no such module: fts4' error.
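Two of the warnings above (missing FTS4 support and unvalidated user search strings) can be guarded against with the standard library alone. The following is a minimal sketch and not part of sqlite-fts4: `fts4_available` and `quote_fts_terms` are hypothetical helper names, and the quoting strategy (keep only runs of letters and digits, then wrap each one in double quotes) is just one simple assumption about how to sanitize input. It deliberately discards any query syntax the user typed.

```python
import re
import sqlite3


def fts4_available() -> bool:
    """Return True if this Python's SQLite can create FTS4 tables."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE fts4_probe USING fts4(content)")
        return True
    except sqlite3.OperationalError:
        # Typically "no such module: fts4" when FTS4 is not compiled in
        return False
    finally:
        conn.close()


def quote_fts_terms(user_input: str) -> str:
    """Turn free text into a MATCH string made of double-quoted terms."""
    # Keep runs of letters and digits only; punctuation and quotes are dropped
    terms = re.findall(r"[^\W_]+", user_input)
    return " ".join(f'"{term}"' for term in terms)


if fts4_available():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts4(title, body)")
    conn.execute(
        "INSERT INTO docs (title, body) VALUES (?, ?)",
        ("Python SQLite FTS4", "Exploring full-text search in Python."),
    )
    # Simulated user input containing an unbalanced quote
    match_string = quote_fts_terms('python "search')
    if match_string:  # skip the query when nothing searchable remains
        titles = [
            row[0]
            for row in conn.execute(
                "SELECT title FROM docs WHERE docs MATCH ?", (match_string,)
            )
        ]
        print(titles)
    conn.close()
else:
    print("FTS4 is not available in this SQLite build")
```

Because every term ends up quoted, words such as `OR` or `NEAR` are searched for literally rather than acting as operators. Applications that want to expose operators to users would need a proper query parser instead.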
Install
- pip install sqlite-fts4
- conda install conda-forge::sqlite-fts4
Imports
- register_functions
from sqlite_fts4 import register_functions
- rank_score
from sqlite_fts4 import rank_score
Quickstart
import sqlite3
from sqlite_fts4 import register_functions
# Connect to an in-memory SQLite database
conn = sqlite3.connect(':memory:')
# Register all custom FTS4 functions
register_functions(conn)
# Create an FTS4 virtual table
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(title, body);")
# Insert some data
conn.execute("INSERT INTO docs (title, body) VALUES (?, ?)", ("Hello World", "This is a test document with some words."))
conn.execute("INSERT INTO docs (title, body) VALUES (?, ?)", ("Python SQLite FTS4", "Exploring full-text search capabilities in Python with SQLite FTS4."))
# Perform a search and rank results using rank_score
# 'pcx' is required for rank_score()
cursor = conn.execute(
    "SELECT title, body, rank_score(matchinfo(docs, 'pcx')) as score FROM docs WHERE docs MATCH ? ORDER BY score DESC",
    ("python search",)
)
for row in cursor.fetchall():
    print(f"Title: {row[0]}, Score: {row[2]:.2f}")
conn.close()