LangChain Astra DB Integration

1.0.0 · active · verified Thu Apr 16

LangChain Astra DB (`langchain-astradb`) is an integration package that connects DataStax Astra DB to the LangChain framework. It provides components such as a vector store, chat message history, and caches, so GenAI and RAG applications can use Astra DB's serverless vector search through the standard LangChain interfaces. The library is actively maintained.

Quickstart

This quickstart demonstrates how to initialize `AstraDBVectorStore` with OpenAI embeddings, add documents to a collection, and perform a similarity search. Ensure `ASTRA_DB_API_ENDPOINT`, `ASTRA_DB_APPLICATION_TOKEN`, and `OPENAI_API_KEY` are set as environment variables.
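
Before running the quickstart, it can help to confirm that the three variables named above are actually set. The following is a minimal sketch using only the standard library; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced here for illustration and are not part of `langchain_astradb`.

```python
import os

# The three variables the quickstart expects (names taken from the text above).
REQUIRED_VARS = (
    "ASTRA_DB_API_ENDPOINT",
    "ASTRA_DB_APPLICATION_TOKEN",
    "OPENAI_API_KEY",
)


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Reporting the missing names individually makes a misconfigured shell easier to diagnose than a single generic message. The quickstart code itself follows.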

```python
import os

from langchain_astradb import AstraDBVectorStore
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

# Read the Astra DB credentials and OpenAI API key from the environment.
# No placeholder defaults are used, so a missing variable is caught by the check below.
ASTRA_DB_API_ENDPOINT = os.environ.get("ASTRA_DB_API_ENDPOINT")
ASTRA_DB_APPLICATION_TOKEN = os.environ.get("ASTRA_DB_APPLICATION_TOKEN")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

if not all([ASTRA_DB_API_ENDPOINT, ASTRA_DB_APPLICATION_TOKEN, OPENAI_API_KEY]):
    print("Please set ASTRA_DB_API_ENDPOINT, ASTRA_DB_APPLICATION_TOKEN, and OPENAI_API_KEY environment variables.")
else:
    print("Connecting to Astra DB Vector Store...")
    embeddings = OpenAIEmbeddings(api_key=OPENAI_API_KEY)

    # Initialize the vector store
    vector_store = AstraDBVectorStore(
        embedding=embeddings,
        collection_name="my_documents_collection",
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        token=ASTRA_DB_APPLICATION_TOKEN,
    )

    # Add documents
    docs = [
        Document(page_content="LangChain provides tools for building LLM applications.", metadata={"source": "langchain"}),
        Document(page_content="Astra DB is a serverless vector-capable database.", metadata={"source": "astradb"}),
    ]
    vector_store.add_documents(docs)

    print(f"Added {len(docs)} documents to the collection.")

    # Perform a similarity search and print the single closest match
    query = "What is Astra DB?"
    results = vector_store.similarity_search(query, k=1)

    print(f"\nSimilarity search results for '{query}':")
    for doc in results:
        print(f"- Content: {doc.page_content}, Source: {doc.metadata.get('source')}")
```
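
The quickstart documents carry a `source` metadata field, which can also be used to narrow a search. The sketch below assumes the installed `langchain_astradb` version exposes `similarity_search_with_score` with a `filter` argument that takes a metadata dict; `search_by_source` is a hypothetical helper name, not part of the library. It accepts any store object with that method, so it does not import the package itself.

```python
def search_by_source(vector_store, query, source, k=3):
    """Return (document, score) pairs for `query`, limited to one metadata source.

    Assumes `vector_store.similarity_search_with_score` accepts `k` and a
    `filter` dict of metadata key/value pairs (an assumption about the
    installed langchain_astradb version).
    """
    return vector_store.similarity_search_with_score(
        query,
        k=k,
        filter={"source": source},
    )


# Usage with the quickstart's `vector_store` (requires a live Astra DB collection):
# for doc, score in search_by_source(vector_store, "What is Astra DB?", "astradb", k=1):
#     print(f"{score:.3f}  {doc.page_content}")
```

Returning scores alongside documents makes it possible to apply a relevance cutoff before passing results to an LLM.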
