LangChain Elasticsearch Integration
LangChain Elasticsearch (`langchain-elasticsearch`) is an integration package that connects LangChain with Elasticsearch. It provides components for vector storage (`ElasticsearchStore`), chat message history (`ElasticsearchChatMessageHistory`), and embedding caching (`ElasticsearchEmbeddingsCache`). The current version is 1.0.0. Releases are frequent and often align with LangChain Core updates.
Common errors
- ModuleNotFoundError: No module named 'langchain.vectorstores.elasticsearch'
  cause: You are trying to import the Elasticsearch vectorstore from the old `langchain` path, but the integration has moved to its own package.
  fix: Install `langchain-elasticsearch` and update your import to `from langchain_elasticsearch import ElasticsearchStore`.
- elasticsearch.exceptions.ConnectionError: Connection refused
  cause: The Elasticsearch client cannot connect to the specified URL, likely because Elasticsearch is not running, is listening on a different port, or is behind a firewall.
  fix: Ensure your Elasticsearch instance is running and accessible from where your Python code is executed. Verify that `ELASTICSEARCH_URL` is correct (e.g., `http://localhost:9200`).
- TypeError: 'NoneType' object is not subscriptable
  cause: This often occurs during vector search if embeddings are not properly initialized or `num_dimensions` was not set, leading to issues with vector processing in Elasticsearch.
  fix: Ensure your embedding model is correctly instantiated and provides valid embeddings. If creating a new index, specify `num_dimensions` in the `ElasticsearchStore` initialization.
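For the "Connection refused" case above, a quick reachability probe can help separate network problems from client configuration problems. The following is a minimal sketch using only the Python standard library. The helper name `elasticsearch_reachable` is hypothetical and not part of `langchain-elasticsearch`. It only checks that some HTTP server answers at the URL, not that the server is Elasticsearch or that your credentials are valid.

```python
import http.client
import urllib.error
import urllib.request


def elasticsearch_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper for illustration; not part of langchain-elasticsearch.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (for example 401 when
        # security is enabled), so the network path itself is working.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Refused connection, DNS failure, timeout, bad response, or malformed URL.
        return False
```

Calling `elasticsearch_reachable(os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200"))` before constructing `ElasticsearchStore` gives an early, readable failure instead of a stack trace from deep inside the client.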
Warnings
- breaking: The Elasticsearch integration was extracted from the main `langchain` package into `langchain-elasticsearch`. This changes import paths.
- gotcha: When creating a new vector index in Elasticsearch, it is highly recommended to explicitly pass the `num_dimensions` parameter for your embeddings to `ElasticsearchStore` to ensure correct vector field mapping.
- gotcha: Asynchronous methods (e.g., `aadd_documents`, `asimilarity_search`) were introduced in version `0.3.1`. Attempting to use them on earlier versions will result in an `AttributeError`.
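To avoid the `AttributeError` mentioned in the last warning, you could check the installed package version before calling the asynchronous methods. The sketch below uses only the standard library. The helper names (`parse_release`, `supports_async`, `installed_version`) are hypothetical, the `0.3.1` threshold is taken from the warning above, and the version parsing is deliberately simplified: it ignores pre-release suffixes such as `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(version_string: str) -> Tuple[int, ...]:
    """Extract the leading integer components, e.g. '1.0.0rc1' -> (1, 0, 0)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # Stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def supports_async(installed: str, minimum: str = "0.3.1") -> bool:
    """Return True if `installed` is at least `minimum` (threshold from the warning above)."""
    return parse_release(installed) >= parse_release(minimum)


def installed_version(package: str = "langchain-elasticsearch") -> Optional[str]:
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

A caller might fall back to the synchronous `add_documents` and `similarity_search` when `installed_version()` returns `None` or `supports_async(...)` is false. For stricter comparisons, the third-party `packaging` library handles pre-release ordering properly.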
Install
- pip install langchain-elasticsearch
- pip install langchain-elasticsearch elasticsearch langchain-openai
Imports
- ElasticsearchStore
  wrong (old path): from langchain.vectorstores.elasticsearch import ElasticsearchStore
  correct: from langchain_elasticsearch import ElasticsearchStore
- ElasticsearchChatMessageHistory
  from langchain_elasticsearch import ElasticsearchChatMessageHistory
- ElasticsearchEmbeddingsCache
  from langchain_elasticsearch import ElasticsearchEmbeddingsCache
Quickstart
import os

from langchain_core.documents import Document
from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings  # Or any other Embeddings class

# Read the Elasticsearch URL and the OpenAI API key from the environment
ELASTICSEARCH_URL = os.environ.get("ELASTICSEARCH_URL", "http://localhost:9200")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")

if not OPENAI_API_KEY:
    print("Warning: OPENAI_API_KEY not set. Embeddings will not be initialized.")
    # In a real application, you would ensure this is set or use a mock.
    embeddings = None  # Prevent actual API calls
else:
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

if embeddings:
    # Initialize ElasticsearchStore
    # Ensure Elasticsearch is running and accessible at ELASTICSEARCH_URL
    vectorstore = ElasticsearchStore(
        es_url=ELASTICSEARCH_URL,
        index_name="langchain-test-index",
        embedding=embeddings,
        # num_dimensions is crucial for correct vector mapping
        num_dimensions=1536,  # For OpenAI embeddings; use your model's dimension
    )

    # Add documents
    docs = [
        Document(page_content="The quick brown fox jumps over the lazy dog", metadata={"source": "lorem"}),
        Document(page_content="A dog barks at the moon", metadata={"source": "nature"}),
    ]
    vectorstore.add_documents(docs)

    # Perform a similarity search
    query = "What is a fox?"
    results = vectorstore.similarity_search(query, k=1)

    print(f"\nSimilarity search results for '{query}':")
    for doc in results:
        print(f"- Content: {doc.page_content}, Metadata: {doc.metadata}")
else:
    print("Embeddings not initialized due to missing API key. Skipping vector store example.")
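The quickstart hardcodes `num_dimensions=1536`, which only fits embedding models that return 1536-dimensional vectors. As a rough sketch, the dimension could instead be derived by embedding a short probe string and measuring the result. The helper `infer_num_dimensions` and the `FakeEmbeddings` stand-in are hypothetical names introduced here for illustration. The only assumption about the real embedding object is that it exposes `embed_query(text)` returning a list of floats, as LangChain embedding classes do.

```python
import hashlib
from typing import List


class FakeEmbeddings:
    """Hypothetical stand-in that mimics the embed_query method of a LangChain embeddings class."""

    def __init__(self, size: int = 8) -> None:
        self.size = size

    def embed_query(self, text: str) -> List[float]:
        # Deterministic pseudo-vector derived from a hash; not a real embedding.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [digest[i % len(digest)] / 255.0 for i in range(self.size)]


def infer_num_dimensions(embeddings, probe: str = "dimension probe") -> int:
    """Embed a short probe string and return the length of the resulting vector."""
    vector = embeddings.embed_query(probe)
    if not vector:
        raise ValueError("Embedding model returned an empty vector")
    return len(vector)
```

With a real model, the value could then be passed as `num_dimensions=infer_num_dimensions(embeddings)` when constructing `ElasticsearchStore`. Note that this costs one embedding call at startup, so caching the result may be worthwhile.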