LangChain Graph Retriever
langchain-graph-retriever is a LangChain retriever that combines vector similarity search with traversal of a document graph. It lets RAG applications retrieve context by following relationships between documents instead of relying on similarity alone. The current version is 0.8.0, and the project releases frequently, typically with minor updates every few weeks.
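The idea of "similarity first, then follow relationships" can be illustrated with a toy sketch in plain Python. This is not the library's implementation: the `DOCS` corpus, the word-overlap `similarity` function (a stand-in for vector similarity), and the `start_k` and `max_depth` parameter names are all assumptions made for illustration.

```python
from collections import deque

# Toy corpus: "words" stands in for an embedding, "links" for graph edges.
DOCS = {
    "doc1": {"words": {"fox", "dog", "lazy"}, "links": ["doc2", "doc3"]},
    "doc2": {"words": {"dog", "lazy", "napping"}, "links": ["doc4"]},
    "doc3": {"words": {"fox", "mammal"}, "links": []},
    "doc4": {"words": {"dog", "canid"}, "links": []},
}


def similarity(query_words, doc_words):
    """Jaccard overlap, used here as a stand-in for vector similarity."""
    union = query_words | doc_words
    return len(query_words & doc_words) / len(union) if union else 0.0


def graph_retrieve(query_words, start_k=1, max_depth=1):
    """Pick the start_k most similar documents, then follow links breadth-first.

    Returns a dict mapping each reached document ID to its traversal depth.
    """
    ranked = sorted(
        DOCS,
        key=lambda doc_id: similarity(query_words, DOCS[doc_id]["words"]),
        reverse=True,
    )
    queue = deque((doc_id, 0) for doc_id in ranked[:start_k])
    visited = {}
    while queue:
        doc_id, depth = queue.popleft()
        if doc_id in visited:
            continue
        visited[doc_id] = depth
        if depth < max_depth:
            for neighbor in DOCS[doc_id]["links"]:
                queue.append((neighbor, depth + 1))
    return visited


print(graph_retrieve({"fox", "lazy", "dog"}, start_k=1, max_depth=1))
```

With a depth limit of 1 the sketch reaches the seed document and its direct neighbors only; raising the limit lets it reach documents that a pure similarity search might rank poorly.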
Common errors
- ModuleNotFoundError: No module named 'langchain_graph_retriever'
  cause: The `langchain-graph-retriever` package is not installed in the active Python environment.
  fix: Run `pip install langchain-graph-retriever`.
- TypeError: GraphRetriever.__init__ got an unexpected keyword argument 'strategy_type'
  cause: The code uses an old strategy parameter name that was removed or renamed when the strategy design changed in v0.5.0.
  fix: Update the `GraphRetriever` initialization to use the current strategy configuration parameters described in the latest documentation.
- TypeError: Expected an Id instance for the document ID, but received a string.
  cause: A plain string was passed as a document ID on an edge while running a version earlier than v0.6.0, which expects the `Id()` class.
  fix: On versions before v0.6.0, use `Id()` in the edge definition. On v0.6.0 or later, pass the string `'$id'` directly in edge definitions, since `Id()` was deprecated in favor of `'$id'` strings.
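Before debugging any of the errors above, it can help to confirm what is actually installed. The following is a minimal sketch using only the standard library; the helper name `check_install` is hypothetical, and it checks the environment without importing the package itself.

```python
import importlib.util
from importlib import metadata


def check_install(module_name="langchain_graph_retriever",
                  dist_name="langchain-graph-retriever"):
    """Return (importable, version) for a top-level module and its distribution."""
    # find_spec returns None when a top-level module cannot be located.
    importable = importlib.util.find_spec(module_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


importable, version = check_install()
if not importable:
    print("Not installed. Run: pip install langchain-graph-retriever")
else:
    print(f"langchain-graph-retriever {version} is importable")
```

The reported version indicates which side of the v0.5.0 and v0.6.0 changes the environment is on.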
Warnings
- breaking Major changes to strategy design and parameter names in v0.5.0. The internal architecture for traversal and node selection was refactored, leading to updated API signatures for strategy configuration.
- gotcha The `Id()` class for representing document IDs on edges was replaced by directly passing the `'$id'` string in v0.6.0.
- gotcha The `k` parameter (number of initial documents to retrieve) was affected by the v0.5.0 strategy changes and then restored in v0.5.1.
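The version thresholds in these warnings can be turned into a small upgrade check. This is a hedged sketch: `parse_version` and `upgrade_notes` are hypothetical helpers, the note texts paraphrase the warnings above, and the parser only handles plain numeric versions such as `0.5.1` (not pre-release tags).

```python
def parse_version(text):
    """Turn '0.5.1' into (0, 5, 1). Only plain numeric versions are supported."""
    return tuple(int(part) for part in text.split(".")[:3])


def upgrade_notes(installed):
    """List the warnings that are relevant when upgrading from `installed`."""
    version = parse_version(installed)
    notes = []
    if version < (0, 5, 0):
        notes.append("v0.5.0 changes strategy design and parameter names")
    if version == (0, 5, 0):
        notes.append("`k` was affected in v0.5.0 and restored in v0.5.1")
    if version < (0, 6, 0):
        notes.append("v0.6.0 replaces Id() with the '$id' string on edges")
    return notes


for note in upgrade_notes("0.4.2"):
    print(note)
```

Passing the output of `importlib.metadata.version("langchain-graph-retriever")` to `upgrade_notes` would give the notes for the current environment.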
Install
pip install langchain-graph-retriever "astrapy[graphs]" langchain-community openai
Imports
- GraphRetriever
  wrong: from langchain.graph_retriever import GraphRetriever
  correct: from langchain_graph_retriever import GraphRetriever
Quickstart
import os

from graph_retriever.strategies import Eager
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_graph_retriever import GraphRetriever

# The example calls the OpenAI embeddings API, so it needs a real key.
# If the key is missing or still the placeholder, the example is skipped.
api_key = os.environ.get("OPENAI_API_KEY", "")
if not api_key.startswith("sk-") or api_key == "sk-YOUR_OPENAI_API_KEY":
    print("OPENAI_API_KEY not set or is placeholder. Skipping quickstart execution.")
    print("Please set the OPENAI_API_KEY environment variable for a functional example.")
else:
    try:
        # 1. Initialize embeddings (requires the 'openai' package)
        embeddings = OpenAIEmbeddings()

        # 2. Create a small in-memory Chroma store for demonstration
        #    (requires 'langchain-community' and 'chromadb').
        #    The "animal" key is illustrative metadata added so that there is a
        #    relationship to traverse; real data would carry its own links.
        documents = [
            Document(page_content="The quick brown fox jumps over the lazy dog.", metadata={"doc_id": "doc1", "animal": "fox"}),
            Document(page_content="The dog is lazy and enjoys napping.", metadata={"doc_id": "doc2", "animal": "dog"}),
            Document(page_content="A fox is a small omnivorous mammal.", metadata={"doc_id": "doc3", "animal": "fox"}),
            Document(page_content="Dogs are domesticated canids.", metadata={"doc_id": "doc4", "animal": "dog"}),
        ]
        vectorstore = Chroma.from_documents(documents, embeddings)

        # 3. Initialize the GraphRetriever.
        #    It takes initial search results from the vector store and then
        #    traverses to related documents along the configured edges.
        retriever = GraphRetriever(
            store=vectorstore,
            # Link documents whose "animal" metadata values match.
            edges=[("animal", "animal")],
            strategy=Eager(
                start_k=2,    # Number of initial documents retrieved by similarity
                max_depth=1,  # How many levels deep to traverse from the initial documents
            ),
        )

        # 4. Perform a retrieval
        query = "Tell me about animals."
        relevant_docs = retriever.invoke(query)
        print(f"\nRetrieved {len(relevant_docs)} documents:")
        for i, doc in enumerate(relevant_docs):
            print(f"--- Document {i + 1} ---")
            print(f"Content: {doc.page_content}")
            print(f"Metadata: {doc.metadata}")
    except Exception as e:
        print(f"An error occurred during quickstart execution: {e}")