SQLite-vec: Vector Search Extension for SQLite
sqlite-vec is a SQLite extension for vector search, written in pure C with no dependencies. It enables storing, manipulating, and querying vector data directly within SQLite files, making it ideal for edge deployments, serverless functions, and local tooling. It supports various vector types (float32, int8, bit) and distance metrics (L1, L2, cosine, Hamming), offering fast brute-force search and SIMD acceleration. The project is pre-v1, so breaking changes are expected. It supports Python, Node.js, Ruby, Rust, and Go bindings.
Warnings
- breaking sqlite-vec is pre-v1, meaning its API and behavior are subject to breaking changes in future releases.
- gotcha SQLite version 3.41 or higher is recommended for full feature compatibility and optimal performance, though it will work with older versions.
- gotcha Loading the extension requires calling `db.enable_load_extension(True)` first. Leaving extension loading enabled is a security risk, so call `db.enable_load_extension(False)` immediately after the extension is loaded.
- gotcha Current versions of sqlite-vec primarily use brute-force search, which may become slow on very large datasets (>1 million vectors with high dimensions). Approximate Nearest Neighbors (ANN) support is planned but not yet a core feature.
- gotcha Older versions (prior to v0.2.0-alpha) had known memory leak issues, particularly during DELETE operations.
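The extension-loading warning above can be handled with a small context manager that switches loading off again even if the load call raises. This is a minimal sketch: `extension_loading` is a hypothetical helper name, not part of sqlite-vec or the standard library, and it assumes `db` exposes the same `enable_load_extension` method as a `sqlite3.Connection`.

```python
from contextlib import contextmanager


@contextmanager
def extension_loading(db):
    """Temporarily enable extension loading on a connection.

    Hypothetical helper (not part of sqlite-vec): extension loading is
    disabled again even if the body of the `with` block raises.
    """
    db.enable_load_extension(True)
    try:
        yield db
    finally:
        # Always runs, so the connection never stays in the permissive state.
        db.enable_load_extension(False)
```

With this in place, loading might look like `with extension_loading(db): load(db)`, where `load` is the function imported from `sqlite_vec`.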
Install
pip install sqlite-vec
Imports
- load
from sqlite_vec import load
- serialize_float32
from sqlite_vec import serialize_float32
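For plain Python lists, `serialize_float32` converts the values into the compact BLOB form that sqlite-vec reads. As a rough illustration, assuming that form is simply the raw 32-bit floats packed back to back (4 bytes per element, native byte order), a standard-library equivalent might look like the sketch below. `pack_float32` and `unpack_float32` are hypothetical names used only for this example.

```python
import struct


def pack_float32(values):
    """Pack a list of floats into raw 32-bit floats (assumed BLOB layout)."""
    return struct.pack(f"{len(values)}f", *values)


def unpack_float32(blob):
    """Inverse of pack_float32: read back 4-byte floats from a BLOB."""
    count = len(blob) // 4
    return list(struct.unpack(f"{count}f", blob))


blob = pack_float32([0.1, 0.2, 0.3, 0.4])
print(len(blob))  # 16: four elements, 4 bytes each
```

Values are rounded to 32-bit precision when packed, so a round trip through `unpack_float32` returns numbers that are close to, but not always bit-identical with, the original Python floats.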
Quickstart
import sqlite3
from sqlite_vec import load, serialize_float32
import numpy as np # Often used for embeddings
# Connect to an in-memory SQLite database
db = sqlite3.connect(":memory:")
# Enable loading of SQLite extensions (necessary for sqlite-vec)
db.enable_load_extension(True)
# Load the sqlite-vec extension
load(db)
# For security, disable extension loading immediately after loading
db.enable_load_extension(False)
# Verify the extension is loaded
vec_version, = db.execute("SELECT vec_version()").fetchone()
print(f"sqlite-vec version: {vec_version}")
# Create a virtual table for vectors using vec0 module
db.execute("CREATE VIRTUAL TABLE documents USING vec0(embedding float[4]);")
# Example embeddings (using numpy for convenience, ensure float32)
embedding1 = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
embedding2 = np.array([0.5, 0.6, 0.7, 0.8], dtype=np.float32)
embedding3 = np.array([0.15, 0.25, 0.35, 0.45], dtype=np.float32)
# Insert embeddings into the virtual table
# Python's sqlite3 module binds objects that implement the buffer protocol as BLOBs,
# so float32 NumPy arrays can be passed directly as parameters
# For plain Python lists, use serialize_float32(list_of_floats)
db.execute("INSERT INTO documents(rowid, embedding) VALUES (?, ?);", (1, embedding1))
db.execute("INSERT INTO documents(rowid, embedding) VALUES (?, ?);", (2, embedding2))
db.execute("INSERT INTO documents(rowid, embedding) VALUES (?, ?);", (3, embedding3))
db.commit()
# Query for nearest neighbors (L2 distance by default)
query_embedding = np.array([0.1, 0.2, 0.3, 0.35], dtype=np.float32)
print("\nNearest neighbors to [0.1, 0.2, 0.3, 0.35]:")
for rowid, distance in db.execute(
    "SELECT rowid, distance FROM documents WHERE embedding MATCH ? ORDER BY distance LIMIT 2;",
    [query_embedding]
):
    print(f"Document ID: {rowid}, Distance: {distance:.4f}")
# Close the database connection
db.close()
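
To make the nearest-neighbor query above concrete, the following pure-Python sketch performs the same kind of brute-force search: it computes the L2 (Euclidean) distance from the query to every stored vector and keeps the closest two. It illustrates the idea only and is not how sqlite-vec is implemented internally. `l2_distance` and `knn` are hypothetical helper names.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(vectors, query, k):
    """Brute force: score every vector, sort by distance, keep the first k."""
    scored = [(rowid, l2_distance(vec, query)) for rowid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Same toy data as the quickstart, keyed by rowid
documents = {
    1: [0.1, 0.2, 0.3, 0.4],
    2: [0.5, 0.6, 0.7, 0.8],
    3: [0.15, 0.25, 0.35, 0.45],
}

results = knn(documents, [0.1, 0.2, 0.3, 0.35], k=2)
for rowid, distance in results:
    print(f"Document ID: {rowid}, Distance: {distance:.4f}")
```

Rows 1 and 3 come out closest, in that order. Because every stored vector is scored on each query, the cost grows linearly with the number of rows, which is why brute-force search slows down on very large datasets.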