ChromaDB
Open-source embedded vector database for AI applications. Runs in-process (EphemeralClient, PersistentClient) or client-server mode (HttpClient). Handles embedding storage, metadata filtering, and similarity search. Supports pluggable embedding functions. Core backend rewritten in Rust in 1.x; also ships a lightweight HTTP-only client as the separate chromadb-client package.
Warnings
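- breaking chromadb.Client(Settings(...)) removed in 0.4.0. Enormous volume of tutorials, LangChain/LlamaIndex integration examples, and LLM-generated code still uses it. Raises AttributeError or TypeError when the client is constructed (the import itself succeeds). Use EphemeralClient, PersistentClient, or HttpClient instead.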
- breaking Database migrations between Chroma versions are irreversible. Upgrading the chromadb package upgrades on-disk data format. Downgrading after upgrade causes data loss or corruption.
- breaking Server CORS and auth configuration moved from environment variables to a YAML config file in the 1.x Rust-backed server. Environment variables like CHROMA_SERVER_CORS_ALLOW_ORIGINS and CHROMA_SERVER_AUTH_CREDENTIALS no longer work.
- gotcha Default embedding function downloads ~200MB of model weights (all-MiniLM-L6-v2 via onnxruntime) on first call. First add() or query() call in a new environment hangs while downloading. No progress indicator.
- gotcha PersistentClient does not support concurrent access from multiple processes. SQLite-backed storage uses file locking. Multiple processes writing to the same path cause database corruption or blocked writes.
- gotcha collection.query() where= filter uses a specific operator syntax ($eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $and, $or). Write equality explicitly as {"key": {"$eq": "value"}} and join multiple conditions with $and or $or; a dict with several top-level keys is rejected. Invalid filters failed silently in old versions and raise ValueError in new ones.
- gotcha Telemetry is enabled by default (sends anonymized usage data to PostHog). Runs on every client init.
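The where= operator syntax from the warnings above is easy to get wrong by hand. The following is a minimal sketch of building filters in the explicit operator form; the helper names clause and combine are hypothetical and not part of chromadb, and the choice to return a single clause unwrapped assumes that $and/$or expect at least two sub-expressions.

```python
from typing import Any, Dict

Where = Dict[str, Any]

# Comparison operators named in the warning above.
COMPARISON_OPS = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"}
LOGICAL_OPS = {"$and", "$or"}


def clause(key: str, op: str, value: Any) -> Where:
    """Build one explicit clause, e.g. {"year": {"$gte": 2020}}."""
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported comparison operator: {op!r}")
    return {key: {op: value}}


def combine(logical_op: str, *clauses: Where) -> Where:
    """Join clauses under $and or $or.

    A single clause is returned as-is (assumption: the logical
    operators need at least two sub-expressions).
    """
    if logical_op not in LOGICAL_OPS:
        raise ValueError(f"unsupported logical operator: {logical_op!r}")
    if not clauses:
        raise ValueError("at least one clause is required")
    if len(clauses) == 1:
        return clauses[0]
    return {logical_op: list(clauses)}


# Hypothetical metadata keys, for illustration only.
where = combine(
    "$and",
    clause("source", "$eq", "wiki"),
    clause("year", "$gte", 2020),
)
print(where)
```

The resulting dict can be passed as collection.query(..., where=where). Building filters through one code path keeps every condition in operator form and avoids multi-key dicts.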
Install
- pip install chromadb
- pip install chromadb-client
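Since upgrading the package also upgrades the on-disk format (see Warnings), it can help to check which distribution and version is installed before opening a persistent database. A minimal sketch using only the standard library; installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report both the full package and the HTTP-only client package.
for dist in ("chromadb", "chromadb-client"):
    print(dist, installed_version(dist) or "not installed")
```

Recording this value next to a persistent database path is one way to notice an accidental upgrade before data is migrated.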
Imports
- EphemeralClient / PersistentClient / HttpClient
import chromadb
client = chromadb.EphemeralClient()  # in-memory
client = chromadb.PersistentClient(path="/db")  # disk
client = chromadb.HttpClient(host="localhost", port=8000)  # server
Quickstart
import sys

if sys.version_info < (3, 9):
    raise RuntimeError("chromadb requires Python 3.9+. Current: " + sys.version)

import chromadb

# In-memory (prototyping)
client = chromadb.EphemeralClient()

# Persistent (local dev)
# client = chromadb.PersistentClient(path="/path/to/db")

collection = client.get_or_create_collection("my_docs")
collection.add(
    documents=["This is doc one", "This is doc two"],
    ids=["id1", "id2"],
)
results = collection.query(
    query_texts=["find something"],
    n_results=2,
)
print(results)
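The value returned by query() is a dict of parallel nested lists, with one inner list per query text. The sketch below flattens that shape into one row per hit. flatten_results is a hypothetical helper, and the sample dict is hand-written to mimic the shape (the distances are made up), not real output.

```python
from typing import Any, Dict, List, Optional


def _column(results: Dict[str, Any], key: str, q: int, n: int) -> List[Optional[Any]]:
    """Return the inner list for query q, or n Nones if the field is absent."""
    values = results.get(key)
    return values[q] if values is not None else [None] * n


def flatten_results(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn nested query results into a flat list of row dicts."""
    rows = []
    for q, ids in enumerate(results["ids"]):
        docs = _column(results, "documents", q, len(ids))
        dists = _column(results, "distances", q, len(ids))
        for doc_id, doc, dist in zip(ids, docs, dists):
            rows.append(
                {"query_index": q, "id": doc_id, "document": doc, "distance": dist}
            )
    return rows


# Hand-written sample in the same shape as a query() result.
sample = {
    "ids": [["id1", "id2"]],
    "documents": [["This is doc one", "This is doc two"]],
    "distances": [[0.1, 0.2]],
}

for row in flatten_results(sample):
    print(row["id"], row["distance"], row["document"])
```

With the Quickstart above, flatten_results(results) would be called on the real return value in place of sample.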