SQLite-vec: Vector Search Extension for SQLite

0.1.9 · active · verified Fri Apr 10

sqlite-vec is a SQLite extension for vector search, written in pure C with no dependencies. It stores, manipulates, and queries vector data directly inside SQLite database files, which suits edge deployments, serverless functions, and local tooling. It supports several vector element types (float32, int8, bit) and distance metrics (L1, L2, cosine, Hamming), and offers fast brute-force search with SIMD acceleration. The project is pre-v1, so breaking changes are expected. Bindings are available for Python, Node.js, Ruby, Rust, and Go.
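
To make the four distance metrics named above concrete, here is a minimal pure-Python sketch of their textbook definitions. It is an illustration only, not the extension's C implementation: the function names are hypothetical, cosine distance is assumed to mean one minus cosine similarity, and Hamming distance is assumed to count differing bits between two equal-length bit vectors stored as bytes.

```python
import math


def l1_distance(a, b):
    """Manhattan distance: sum of absolute element-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus cosine similarity (assumed convention for this sketch)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

L1, L2, and cosine apply to numeric vectors such as float32 or int8, while Hamming distance is the natural fit for bit vectors, where each dimension is a single bit.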

Quickstart

This quickstart demonstrates how to initialize a SQLite database with the `sqlite-vec` extension, create a `vec0` virtual table for storing 4-dimensional float embeddings, insert example embeddings (using NumPy arrays, or `serialize_float32` for Python lists), and perform a K-Nearest Neighbors (KNN) search.

import sqlite3

import numpy as np  # Often used to hold embeddings
from sqlite_vec import load, serialize_float32

# Connect to an in-memory SQLite database
db = sqlite3.connect(":memory:")

# Enable loading of SQLite extensions (necessary for sqlite-vec)
db.enable_load_extension(True)

# Load the sqlite-vec extension
load(db)

# For security, disable extension loading immediately after loading
db.enable_load_extension(False)

# Verify the extension is loaded
vec_version, = db.execute("SELECT vec_version()").fetchone()
print(f"sqlite-vec version: {vec_version}")

# Create a virtual table for vectors using vec0 module
db.execute("CREATE VIRTUAL TABLE documents USING vec0(embedding float[4]);")

# Example embeddings (using numpy for convenience, ensure float32)
embedding1 = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
embedding2 = np.array([0.5, 0.6, 0.7, 0.8], dtype=np.float32)
embedding3 = np.array([0.15, 0.25, 0.35, 0.45], dtype=np.float32)

# Insert embeddings into the virtual table.
# Python's sqlite3 module binds any object that supports the buffer protocol as a BLOB,
# so float32 NumPy arrays can be passed directly.
# For plain Python lists, pass serialize_float32(list_of_floats) instead.
db.execute("INSERT INTO documents(rowid, embedding) VALUES (?, ?);", (1, embedding1))
db.execute("INSERT INTO documents(rowid, embedding) VALUES (?, ?);", (2, embedding2))
db.execute("INSERT INTO documents(rowid, embedding) VALUES (?, ?);", (3, embedding3))
db.commit()

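# Query for nearest neighbors (vec0 uses L2 distance by default).
# On SQLite versions older than 3.41, replace LIMIT 2 with "AND k = 2" in the WHERE clause.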
query_embedding = np.array([0.1, 0.2, 0.3, 0.35], dtype=np.float32)

print("\nNearest neighbors to [0.1, 0.2, 0.3, 0.35]:")
for rowid, distance in db.execute(
    "SELECT rowid, distance FROM documents WHERE embedding MATCH ? ORDER BY distance LIMIT 2;",
    [query_embedding]
):
    print(f"Document ID: {rowid}, Distance: {distance:.4f}")

# Close the database connection
db.close()
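
As a cross-check on the quickstart, the following standard-library sketch mimics what the query does conceptually: it packs Python lists into raw float32 bytes and runs a brute-force L2 scan over them. It assumes the compact float32 vector format is a contiguous run of 4-byte floats in native byte order. The helpers `pack_float32`, `unpack_float32`, and `brute_force_knn` are hypothetical names for illustration and are not part of the `sqlite_vec` package.

```python
import math
import struct


def pack_float32(values):
    """Pack a list of floats into contiguous 4-byte floats (native byte order)."""
    return struct.pack(f"{len(values)}f", *values)


def unpack_float32(blob):
    """Inverse of pack_float32: decode raw bytes back into a list of floats."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_knn(rows, query_blob, k):
    """Score every stored vector against the query and keep the k closest."""
    query = unpack_float32(query_blob)
    scored = [(rowid, l2(unpack_float32(blob), query)) for rowid, blob in rows.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Same three embeddings and query vector as the quickstart
rows = {
    1: pack_float32([0.1, 0.2, 0.3, 0.4]),
    2: pack_float32([0.5, 0.6, 0.7, 0.8]),
    3: pack_float32([0.15, 0.25, 0.35, 0.45]),
}
neighbors = brute_force_knn(rows, pack_float32([0.1, 0.2, 0.3, 0.35]), k=2)

for rowid, distance in neighbors:
    print(f"Document ID: {rowid}, Distance: {distance:.4f}")
```

Each 4-dimensional vector occupies 16 bytes in this layout, and the scan ranks rows 1 and 3 ahead of row 2, which is the ordering the KNN query in the quickstart should also return.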
