ChromaDB Client

1.5.7 · active · verified Mon Apr 13

ChromaDB Client is a lightweight Python HTTP client for interacting with a running ChromaDB server. It provides a programmatic interface to store, query, and manage vector embeddings, enabling integration with AI applications for tasks like semantic search and Retrieval-Augmented Generation (RAG). The library is actively maintained with frequent releases, typically on a weekly or bi-weekly cadence.

Warnings

Install

Imports

Quickstart

This quickstart connects to a ChromaDB server with `chromadb.HttpClient`, creates a collection, adds documents with metadata, and runs a similarity search. The query filters results by metadata and requests documents, metadata, and distances in the response.

import chromadb
import os

# Connect to a running ChromaDB server. Host and port are read from the
# environment and fall back to localhost:8000.
# For local testing, ensure a ChromaDB server is running (e.g., `chroma run --path /path/to/db`)
# Or connect to a Chroma Cloud instance.
client = chromadb.HttpClient(
    host=os.environ.get("CHROMADB_HOST", "localhost"),
    port=int(os.environ.get("CHROMADB_PORT", "8000")),
)

# Create the collection, or reuse it if it already exists.
# get_or_create_collection covers both cases, so connection errors are
# not swallowed by a broad except clause.
collection_name = "my_documents"
collection = client.get_or_create_collection(name=collection_name)
print(f"Collection '{collection_name}' is ready.")

# Add documents to the collection
collection.add(
    documents=["This is document1 about cats", "This is document2 about dogs", "A third document on artificial intelligence"],
    metadatas=[
        {"source": "blog", "category": "pets"},
        {"source": "website", "category": "pets"},
        {"source": "article", "category": "tech"}
    ],
    ids=["doc1", "doc2", "doc3"]
)
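# count() reports the total number of records in the collection,
# which can exceed the number just added if the collection already existed.
print(f"Collection now holds {collection.count()} documents.")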

# Query the collection
results = collection.query(
    query_texts=["tell me about animals"],
    n_results=2,
    where={"category": "pets"},
    include=["documents", "metadatas", "distances"]
)

print("\nQuery Results:")
for i, doc in enumerate(results['documents'][0]):
    print(f"  Document: {doc}")
    print(f"  Metadata: {results['metadatas'][0][i]}")
    print(f"  Distance: {results['distances'][0][i]}")

# Clean up (optional, for demonstration)
# client.delete_collection(name=collection_name)
# print(f"Collection '{collection_name}' deleted.")
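
The result loop above indexes into nested lists: each field in the query result holds one inner list per query text, which is why the quickstart reads `results['documents'][0]`. A minimal sketch of a helper that turns that shape into one dict per hit follows. `iter_hits` is a hypothetical name, not part of the chromadb API, and the sample data is hand-written to mimic the result shape (the distances are made-up values). It assumes fields left out of `include` are either absent or `None`.

```python
from typing import Any, Dict, Iterator

# Maps the plural result keys to singular per-hit keys.
FIELD_NAMES = {
    "documents": "document",
    "metadatas": "metadata",
    "distances": "distance",
}


def iter_hits(results: Dict[str, Any], query_index: int = 0) -> Iterator[Dict[str, Any]]:
    """Yield one dict per hit for a single query in a Chroma-style result."""
    ids = results["ids"][query_index]
    for position, doc_id in enumerate(ids):
        hit: Dict[str, Any] = {"id": doc_id}
        for plural, singular in FIELD_NAMES.items():
            column = results.get(plural)
            # Assumption: fields not requested are missing or None, so skip them.
            if column is not None:
                hit[singular] = column[query_index][position]
        yield hit


# Hand-written sample mimicking a one-query result; not real server output.
sample = {
    "ids": [["doc1", "doc2"]],
    "documents": [["This is document1 about cats", "This is document2 about dogs"]],
    "metadatas": [[
        {"source": "blog", "category": "pets"},
        {"source": "website", "category": "pets"},
    ]],
    "distances": [[0.25, 0.5]],
    "embeddings": None,
}

for hit in iter_hits(sample):
    print(hit["id"], hit["distance"], hit["document"])
```

With a live server, the same helper could be called as `iter_hits(results)` on the value returned by `collection.query(...)`. Pass `query_index` when more than one query text was sent.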

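
The quickstart query filters on a single metadata key with `where={"category": "pets"}`. When several keys must match at once, Chroma's filter syntax combines conditions with the `$and` operator, and each condition can be spelled out with `$eq`. The sketch below builds such a filter from keyword arguments. `equals_all` is a hypothetical helper, not part of the chromadb API. It returns a lone condition unwrapped, on the assumption that `$and` expects at least two sub-expressions.

```python
from typing import Any, Dict


def equals_all(**conditions: Any) -> Dict[str, Any]:
    """Build a filter requiring every given metadata key to equal its value."""
    if not conditions:
        raise ValueError("at least one condition is required")
    clauses = [{key: {"$eq": value}} for key, value in conditions.items()]
    # A single clause is returned as-is; several are wrapped in "$and".
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}


single = equals_all(category="pets")
combined = equals_all(category="pets", source="blog")
print(single)
print(combined)
```

The returned dict could then be passed as the `where` argument of `collection.query(...)`, for example `where=equals_all(category="pets", source="blog")`.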