pgvector

0.8.2 verified Tue May 12 auth: no python install: stale quickstart: stale

Open-source PostgreSQL extension for vector similarity search. Two components: (1) the server-side Postgres extension (C, compiled and installed into Postgres), and (2) the Python client package 'pgvector' on PyPI which provides ORM/adapter integrations for psycopg2, psycopg3, asyncpg, SQLAlchemy, Django, SQLModel, and Peewee. The extension name in SQL is 'vector' (CREATE EXTENSION vector), not 'pgvector'. Maintained by Andrew Kane. Current extension version: 0.8.2 (CVE security fix). Python client: 0.4.2.

pip install pgvector

Common errors

error ModuleNotFoundError: No module named 'pgvector'

cause The Python client package 'pgvector' is not installed in the environment where the code is being run.

fix

Install the pgvector Python package using pip: pip install pgvector

error ERROR: type "vector" does not exist

cause The PostgreSQL 'vector' extension has not been created in the database or the current user does not have access to it, preventing the use of the `vector` data type.

fix

Connect to your PostgreSQL database and execute the SQL command: CREATE EXTENSION IF NOT EXISTS vector; Ensure you are connected to the correct database and have sufficient permissions.

error operator does not exist: vector <-> double precision[]

cause This error typically occurs when trying to use pgvector operators (like `<->` for L2 distance) with an incompatible data type or when the vector extension is not in the active search path. It can also happen if the input vector is not explicitly cast to the `vector` type.

fix

Explicitly cast the input array to vector in the SQL query, for example: SELECT * FROM items ORDER BY embedding <-> %s::vector LIMIT 1; If using a Python client, pass a compatible type or register the vector type with the driver (e.g., pgvector.psycopg.register_vector(conn) for psycopg).
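When no vector adapter is registered with the driver, one way to satisfy the explicit cast is to send the query vector in pgvector's text form, '[1,2,3]', and let ::vector parse it. A minimal sketch; to_vector_literal is a hypothetical helper name, not part of the pgvector package, and the items table in the usage comment is assumed:

```python
import math


def to_vector_literal(values):
    """Format a sequence of numbers as pgvector text input, e.g. '[1.0,2.0,3.0]'."""
    floats = [float(v) for v in values]
    if not floats:
        raise ValueError("vector must have at least one dimension")
    if any(math.isnan(f) or math.isinf(f) for f in floats):
        raise ValueError("vector elements must be finite")
    return "[" + ",".join(repr(f) for f in floats) + "]"


# Usage with any DB-API cursor (assumed table: items, assumed column: embedding):
# cur.execute(
#     "SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT 5",
#     (to_vector_literal([1, 1, 1]),),
# )
```

Registering the type with register_vector is usually the cleaner route; the string form is mainly useful for ad hoc scripts or drivers without an adapter.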

error ModuleNotFoundError: No module named 'pgvector.sqlalchemy'; 'pgvector' is not a package

cause The "'pgvector' is not a package" suffix means Python resolved the name pgvector to a plain module rather than the installed package, so the pgvector.sqlalchemy submodule cannot be found. The usual cause is a local file named pgvector.py (for example, the script being run) that shadows the installed package on sys.path. The error is often seen in applications using `langchain_postgres`, which depends on the pgvector SQLAlchemy integration.

fix

The import path from pgvector.sqlalchemy import Vector is correct; do not change it. Rename any local file named pgvector.py, remove its stale __pycache__ entries, and confirm the package is installed in the active environment (pip show pgvector). If using LangChain, ensure you have the correct langchain-postgres package and a compatible pgvector version installed.
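For the "is not a package" variant of this error, it can help to check what the name pgvector actually resolves to in the failing environment. A minimal diagnostic sketch using only the standard library; describe_import is a hypothetical helper name:

```python
import importlib.util


def describe_import(name):
    """Return (origin, is_package) for a top-level module name, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages carry submodule_search_locations; plain .py modules do not.
    is_package = spec.submodule_search_locations is not None
    return spec.origin, is_package


info = describe_import("pgvector")
if info is None:
    print("pgvector is not importable in this environment")
else:
    origin, is_package = info
    print(f"pgvector resolves to {origin} (package: {is_package})")
```

If the reported origin is a file in your own project rather than site-packages, a local pgvector.py is shadowing the installed package.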

Warnings

breaking CVE-2026-3172: Buffer overflow with parallel HNSW index builds in versions 0.6.0–0.8.1. Can leak sensitive data from other relations or crash the database server. Fixed in 0.8.2.

fix Upgrade to pgvector 0.8.2 immediately.
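To see whether a given server falls in the affected range, the installed extension version can be read with SELECT extversion FROM pg_extension WHERE extname = 'vector'; and compared against the bounds stated above. A minimal sketch of the comparison only; is_affected is a hypothetical helper and assumes plain numeric versions such as 0.7.4:

```python
def parse_version(text):
    """Turn '0.8.1' into (0, 8, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(extversion):
    """True if the version lies in 0.6.0 through 0.8.1 inclusive (range taken from the warning above)."""
    return (0, 6, 0) <= parse_version(extversion) <= (0, 8, 1)


for version in ["0.5.1", "0.6.0", "0.8.1", "0.8.2"]:
    print(version, is_affected(version))
```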

breaking Illegal instruction crashes (SIGILL) when pgvector is compiled with -march=native on one CPU architecture and run on another. Occurs on managed cloud Postgres (Azure Flexible Server, some GCP instances) after upgrading to 0.8.0+.

fix Report to your cloud provider. If self-hosting, compile on the same CPU architecture as the runtime. Cannot be worked around from the client side.

breaking LangChain's langchain-postgres package requires psycopg3 (package name: psycopg). Connection strings must use postgresql+psycopg:// not postgresql+psycopg2://. Mixing drivers causes driver-not-found errors.

fix pip install psycopg[binary]. Use connection string postgresql+psycopg://user:pass@host/db.
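Where connection strings come from existing configuration, the driver part can be rewritten before the string is handed to langchain-postgres. A minimal sketch that operates on the string alone; to_psycopg3_url is a hypothetical helper, not part of any of the packages named here:

```python
def to_psycopg3_url(url):
    """Rewrite a SQLAlchemy-style PostgreSQL URL so it selects the psycopg (v3) driver."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("not a URL: missing '://'")
    dialect = scheme.split("+", 1)[0]
    if dialect not in ("postgresql", "postgres"):
        raise ValueError(f"not a PostgreSQL URL: {scheme}")
    # Drop whatever driver was named (e.g. psycopg2) and name psycopg explicitly.
    return "postgresql+psycopg://" + rest


print(to_psycopg3_url("postgresql+psycopg2://user:pass@host/db"))
```

A URL with no driver suffix is rewritten as well, since SQLAlchemy would otherwise pick its default PostgreSQL driver rather than psycopg.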

breaking Postgres 17.0–17.2 causes link error: 'unresolved external symbol float_to_shortest_decimal_bufn' when building pgvector from source.

fix Upgrade to Postgres 17.3+.

gotcha The SQL extension name is 'vector', not 'pgvector'. CREATE EXTENSION pgvector raises 'extension not found'. This is a consistent source of confusion.

fix Always use: CREATE EXTENSION IF NOT EXISTS vector;

gotcha register_vector(conn) must be called on every new connection. The registration is not persistent. Without it, vector columns are returned as raw strings, not numpy arrays. No error is raised, so the wrong behavior is silent.

fix Call register_vector(conn) immediately after psycopg2.connect(). For connection pools, call it in the connection setup callback.

gotcha Queries without ORDER BY + LIMIT do not use an HNSW or IVFFlat index; Postgres falls back to a sequential scan. Such queries return exact results, but at O(n) cost.

fix Always include ORDER BY embedding <-> $1 LIMIT k in vector search queries. Without LIMIT, the index is not used.
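One way to keep every search in the ORDER BY <distance> LIMIT k shape is to build the SQL in a single place. A minimal sketch; knn_query is a hypothetical helper, and the id column and default names are assumptions for illustration:

```python
import re

# Distance operators provided by the vector extension.
OPERATORS = {
    "l2": "<->",             # Euclidean distance
    "cosine": "<=>",         # cosine distance
    "inner_product": "<#>",  # negative inner product
}

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def knn_query(table, column, k, metric="l2"):
    """Build a nearest-neighbour SELECT that always has ORDER BY <distance> and LIMIT."""
    for ident in (table, column):
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    if not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    op = OPERATORS[metric]
    # The single %s placeholder is the query vector, supplied by the driver.
    return f"SELECT id FROM {table} ORDER BY {column} {op} %s LIMIT {k}"


print(knn_query("items", "embedding", 5))
```

The distance expression stays bare in ORDER BY (no wrapping arithmetic, ascending order), which is the form the index can serve.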

gotcha Cosine distance (<=>) in pgvector ranges over [0, 2], not [0, 1]. 0 = same direction, 1 = orthogonal, 2 = opposite. Thresholds written as cosine similarity, or as a distance assumed to lie in [0, 1], must be remapped.

fix Use pgvector cosine thresholds in [0, 2]. Conversion: cosine_distance = 1 - cosine_similarity, so a similarity threshold s becomes a distance threshold 1 - s.
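The remapping is a one-line conversion. A small sketch in pure Python that shows the endpoints of the range; the function names are illustrative only:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_to_distance(similarity):
    """Convert a cosine similarity in [-1, 1] to a cosine distance in [0, 2]."""
    return 1.0 - similarity


# A "similarity >= 0.8" rule from another library becomes "distance <= 0.2" here.
max_distance = similarity_to_distance(0.8)

for other in ([1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]):
    print(other, similarity_to_distance(cosine_similarity([1.0, 0.0], other)))
```

Note that the comparison direction flips as well: a minimum similarity turns into a maximum distance.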

gotcha IVFFlat index must be built AFTER data is loaded. Creating the index on an empty table and then inserting data results in a near-useless index (lists are not representative of the data distribution).

fix Load all or most data first, then run CREATE INDEX. For ongoing ingestion, rebuild or use HNSW which handles incremental inserts better.
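Building after the load also means the row count is known when the lists parameter is chosen. The sketch below applies the starting-point rule of thumb from the pgvector README as I understand it (rows / 1000 up to 1M rows, sqrt(rows) above that); treat the numbers as a first guess to tune. The helper names and the default items/embedding names are assumptions:

```python
import math


def suggest_lists(row_count):
    """Starting value for IVFFlat lists: rows/1000 up to 1M rows, sqrt(rows) above that."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)


def ivfflat_index_sql(row_count, table="items", column="embedding"):
    """CREATE INDEX statement for an L2 IVFFlat index, to run after the data is loaded."""
    lists = suggest_lists(row_count)
    return (
        f"CREATE INDEX ON {table} USING ivfflat ({column} vector_l2_ops) "
        f"WITH (lists = {lists});"
    )


print(ivfflat_index_sql(200_000))
```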

Install

sudo apt install postgresql-17-pgvector

brew install pgvector

docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=pass pgvector/pgvector:pg17

git clone --branch v0.8.2 https://github.com/pgvector/pgvector.git && cd pgvector && make && make install

Install compatibility stale last tested: 2026-05-12

python os / libc status wheel install import disk

3.10 alpine (musl) - - - -

3.10 slim (glibc) - - - -

3.11 alpine (musl) - - - -

3.11 slim (glibc) - - - -

3.12 alpine (musl) - - - -

3.12 slim (glibc) - - - -

3.13 alpine (musl) - - - -

3.13 slim (glibc) - - - -

3.9 alpine (musl) - - - -

3.9 slim (glibc) - - - -

Imports

register_vector (psycopg2)
wrong
```
import pgvector
```
correct
```
from pgvector.psycopg2 import register_vector
```
A bare 'import pgvector' does not register the vector type with any driver. Import register_vector from the submodule that matches your driver.

register_vector (psycopg3)
wrong
```
from pgvector.psycopg2 import register_vector
```
correct
```
from pgvector.psycopg import register_vector
```
psycopg2 and psycopg3 use different submodules. Mixing them causes AttributeError.

Vector (SQLAlchemy)
```
from pgvector.sqlalchemy import Vector
```
Use HALFVEC, BIT, SPARSEVEC from the same module for other vector types.

Quickstart stale last tested: 2026-05-12

register_vector(conn) must be called after connecting — it registers the custom 'vector' type with psycopg2. Without it, vectors are returned as strings. Extension must be enabled server-side first with CREATE EXTENSION vector.

# Step 1: Enable extension in Postgres (run once per database)
# CREATE EXTENSION IF NOT EXISTS vector;

import psycopg2
from pgvector.psycopg2 import register_vector
import numpy as np

conn = psycopg2.connect("dbname=mydb user=postgres")
register_vector(conn)  # REQUIRED: registers the vector type

cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS items (id bigserial PRIMARY KEY, embedding vector(3))")

# Insert vectors
cur.execute("INSERT INTO items (embedding) VALUES (%s)", (np.array([1.0, 2.0, 3.0], dtype='float32'),))
conn.commit()

# L2 distance search (<->)
cur.execute("SELECT id FROM items ORDER BY embedding <-> %s LIMIT 5", (np.array([1.0, 1.0, 1.0], dtype='float32'),))
print(cur.fetchall())

cur.close()
conn.close()