LangChain Astra DB Integration
LangChain Astra DB (`langchain-astradb`) is an integration package that connects DataStax Astra DB with the LangChain framework. It provides a vector store, a chat message history, standard and semantic LLM caches, and a document loader, so that GenAI and RAG applications can use Astra DB's serverless vector capabilities. The package is actively maintained, with a focus on compatibility within the LangChain ecosystem.
Common errors
- CollectionAlreadyExistsError: Collection 'default_keyspace.my_testing_collection' already exists
  cause: Attempting to create an Astra DB collection with `AstraDBVectorStore` when a collection with the same name already exists but with a different configuration (e.g., vector or hybrid search settings).
  fix: Either delete the existing collection and let `AstraDBVectorStore` create a new one, or pass `setup_mode=langchain_astradb.utils.astradb.SetupMode.OFF` to the `AstraDBVectorStore` constructor to connect to the existing collection without attempting to modify its creation parameters.
- Unable to connect to Cassandra (AstraDB)
  cause: A generic error, usually caused by an incorrect `ASTRA_DB_API_ENDPOINT` or `ASTRA_DB_APPLICATION_TOKEN`, network issues, or database unavailability.
  fix: Verify that `ASTRA_DB_API_ENDPOINT` and `ASTRA_DB_APPLICATION_TOKEN` are copied exactly from your Astra DB console. Check your internet connection and ensure the Astra DB instance is active and reachable.
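The second fix for the collection error (connecting without re-creating the collection) can be sketched as follows. This is a minimal sketch, not a definitive implementation: `connect_to_existing_collection` is a hypothetical helper name, and the imports are deferred into the function body only so that the snippet loads even where the package is not installed.

```python
def connect_to_existing_collection(embedding, collection_name, api_endpoint, token):
    """Open an existing Astra DB collection without trying to create it.

    Hypothetical helper for illustration. With setup_mode=SetupMode.OFF the
    vector store skips collection creation, so a collection that already
    exists with different settings is not touched.
    """
    # Deferred imports: require `pip install langchain-astradb` at call time.
    from langchain_astradb import AstraDBVectorStore
    from langchain_astradb.utils.astradb import SetupMode

    return AstraDBVectorStore(
        embedding=embedding,
        collection_name=collection_name,
        api_endpoint=api_endpoint,
        token=token,
        setup_mode=SetupMode.OFF,
    )
```

The embedding passed in still has to match the dimension the existing collection was created with, since nothing is reconfigured in this mode.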
Warnings
- breaking The `langchain-astradb` package replaces deprecated Astra DB classes previously found under `langchain_community.*`. Migrating to the dedicated `langchain-astradb` package is strongly advised for the latest features, fixes, and compatibility with modern `astrapy` versions.
- gotcha Initializing an `AstraDBVectorStore` on an existing collection may fail if the collection's configuration (e.g., indexing settings, hybrid search enablement) differs from the default or requested settings. The Data API returns an `EXISTING_COLLECTION_DIFFERENT_SETTINGS` error.
- gotcha Incorrect or invalid credentials (`ASTRA_DB_API_ENDPOINT`, `ASTRA_DB_APPLICATION_TOKEN`) lead to connection errors. CQL-based connections that rely on a secure connect bundle can fail the same way when the bundle path is wrong.
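Because credential mistakes surface only as generic connection errors, a quick local sanity check before connecting can save a round trip. The sketch below is an assumption-laden illustration, not part of the library: `check_astra_credentials` is a hypothetical helper, the `AstraCS:` prefix is taken from the token placeholder used in the Quickstart, and requiring an `https://` URL is an assumption about the endpoint format.

```python
from urllib.parse import urlparse


def check_astra_credentials(api_endpoint, token):
    """Return a list of problems found; an empty list means the values look plausible.

    Hypothetical helper: it only checks the shape of the values and cannot
    tell whether the token is actually valid for the database.
    """
    problems = []
    endpoint = api_endpoint or ""
    secret = token or ""

    # Values pasted from a console often pick up stray spaces or newlines.
    if endpoint != endpoint.strip() or secret != secret.strip():
        problems.append("a value has leading or trailing whitespace")

    # Assumption: the API endpoint is a full https:// URL.
    parsed = urlparse(endpoint.strip())
    if parsed.scheme != "https" or not parsed.netloc:
        problems.append("ASTRA_DB_API_ENDPOINT should be a full https:// URL")

    # Assumption: application tokens start with 'AstraCS:'.
    if not secret.strip().startswith("AstraCS:"):
        problems.append("ASTRA_DB_APPLICATION_TOKEN should start with 'AstraCS:'")

    return problems
```

Calling it with values read from `os.environ.get(...)` and printing the returned list before constructing any store gives a clearer message than the connection error alone.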
Install
- pip install langchain-astradb langchain-openai astrapy
Imports
- AstraDBVectorStore
from langchain_astradb import AstraDBVectorStore
- AstraDBChatMessageHistory
from langchain_astradb import AstraDBChatMessageHistory
- AstraDBCache
from langchain_astradb import AstraDBCache
- AstraDBSemanticCache
from langchain_astradb import AstraDBSemanticCache
- AstraDBLoader
from langchain_astradb import AstraDBLoader
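As an illustration of one of the non-vector components, the chat message history can be used through LangChain's standard message-history methods (`add_user_message`, `add_ai_message`, `messages`). This is a minimal sketch under stated assumptions: `open_chat_history` and `record_exchange` are hypothetical helper names, and the import is deferred so the snippet loads without the package installed.

```python
def open_chat_history(session_id, api_endpoint, token):
    """Create a chat history bound to one conversation (hypothetical helper)."""
    # Deferred import: requires `pip install langchain-astradb` at call time.
    from langchain_astradb import AstraDBChatMessageHistory

    return AstraDBChatMessageHistory(
        session_id=session_id,
        api_endpoint=api_endpoint,
        token=token,
    )


def record_exchange(history, user_text, ai_text):
    """Append one user/AI turn and return the number of stored messages.

    Works with any object exposing LangChain's chat-history interface.
    """
    history.add_user_message(user_text)
    history.add_ai_message(ai_text)
    return len(history.messages)
```

Keeping `record_exchange` independent of the concrete class means the same code can be exercised against an in-memory history in tests and against Astra DB in production.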
Quickstart
import os

from langchain_astradb import AstraDBVectorStore
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

# Read the Astra DB credentials and the OpenAI API key from environment variables.
# No placeholder defaults are used, so a missing variable is caught by the check below.
ASTRA_DB_API_ENDPOINT = os.environ.get("ASTRA_DB_API_ENDPOINT")
ASTRA_DB_APPLICATION_TOKEN = os.environ.get("ASTRA_DB_APPLICATION_TOKEN")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

if not all([ASTRA_DB_API_ENDPOINT, ASTRA_DB_APPLICATION_TOKEN, OPENAI_API_KEY]):
    print("Please set ASTRA_DB_API_ENDPOINT, ASTRA_DB_APPLICATION_TOKEN, and OPENAI_API_KEY environment variables.")
else:
    print("Connecting to Astra DB Vector Store...")
    embeddings = OpenAIEmbeddings(api_key=OPENAI_API_KEY)

    # Initialize the vector store (creates the collection if it does not exist)
    vector_store = AstraDBVectorStore(
        embedding=embeddings,
        collection_name="my_documents_collection",
        api_endpoint=ASTRA_DB_API_ENDPOINT,
        token=ASTRA_DB_APPLICATION_TOKEN,
    )

    # Add documents
    docs = [
        Document(page_content="LangChain provides tools for building LLM applications.", metadata={"source": "langchain"}),
        Document(page_content="Astra DB is a serverless vector-capable database.", metadata={"source": "astradb"}),
    ]
    vector_store.add_documents(docs)
    print(f"Added {len(docs)} documents to the collection.")

    # Perform a similarity search and print the best match
    query = "What is Astra DB?"
    results = vector_store.similarity_search(query, k=1)
    print(f"\nSimilarity search results for '{query}':")
    for doc in results:
        print(f"- Content: {doc.page_content}, Source: {doc.metadata.get('source')}")
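Since the Quickstart documents carry a `source` metadata field, the search can also be restricted by metadata. The following is a minimal sketch, assuming a store like the `vector_store` built in the Quickstart; `search_by_source` is a hypothetical helper name, and it relies only on the `filter` argument of `similarity_search`.

```python
def search_by_source(vector_store, query, source, k=1):
    """Similarity search limited to documents whose metadata 'source' equals `source`.

    Hypothetical helper: `vector_store` is any LangChain vector store whose
    similarity_search accepts a metadata `filter` dictionary.
    """
    return vector_store.similarity_search(query, k=k, filter={"source": source})
```

For example, `search_by_source(vector_store, "What is Astra DB?", "astradb")` would consider only the documents that were added with `metadata={"source": "astradb"}`.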