LlamaIndex Neo4j Vector Store
The `llama-index-vector-stores-neo4jvector` library integrates Neo4j as a vector store for LlamaIndex. It stores, indexes, and queries document embeddings in a Neo4j graph database, supporting vector search, hybrid search, and metadata filtering. The current version is 0.6.0, and the package is actively maintained within the LlamaIndex ecosystem.
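Hybrid search combines vector similarity with keyword (fulltext) matching. A minimal sketch of enabling it, assuming the store's `hybrid_search` constructor flag; the connection details are placeholders and the store is only created when the function is called:

```python
from typing import Any


def make_hybrid_store(url: str, username: str, password: str, dim: int = 1536) -> Any:
    """Create a Neo4jVectorStore with hybrid (vector + keyword) search enabled."""
    # Deferred import so this module can be loaded without the integration installed.
    from llama_index.vector_stores.neo4jvector import Neo4jVectorStore

    return Neo4jVectorStore(
        username=username,
        password=password,
        url=url,
        embedding_dimension=dim,
        hybrid_search=True,  # also maintains a fulltext index for keyword matching
    )
```

Calling `make_hybrid_store("bolt://localhost:7687", "neo4j", "password")` requires a running Neo4j instance; the store creates the supporting fulltext index on first use.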
Common errors
- `ModuleNotFoundError: No module named 'llama_index.vector_stores'`
  - Cause: Importing `Neo4jVectorStore` (or other vector stores) from `llama_index.vector_stores` instead of the dedicated integration package path, which changed in the LlamaIndex v0.10+ refactor.
  - Fix: Install the specific integration package: `pip install llama-index-vector-stores-neo4jvector`. Then update the import statement to `from llama_index.vector_stores.neo4jvector import Neo4jVectorStore`.
- `neo4j.exceptions.ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure 'db.index.vector.queryNodes': ... Index query vector has X dimensions, but indexed vectors have Y.}`
  - Cause: The `embedding_dimension` configured on `Neo4jVectorStore` does not match the dimension of the vectors already stored in the Neo4j vector index, or the query-time embedding model produces vectors of a different dimension.
  - Fix: Verify that the `embedding_dimension` parameter in `Neo4jVectorStore` matches your embedding model's output dimension. If an index already exists, confirm its dimension is consistent, or recreate the Neo4j vector index with the correct `embedding_dimension`.
- `ValueError: Could not connect to Neo4j database. Ensure the URL and credentials are correct and the database is running.`
  - Cause: Incorrect Neo4j URI, username, or password, or the Neo4j database instance is not running or not reachable from the application.
  - Fix: Double-check the `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` environment variables or constructor arguments, and ensure the Neo4j instance is running and network-accessible (e.g., `bolt://localhost:7687`).
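To distinguish a credentials problem from an unreachable server before involving LlamaIndex, you can probe the connection with the official `neo4j` driver's `verify_connectivity()`. A hedged helper sketch (the import is deferred so the snippet loads even where the driver isn't installed):

```python
def check_neo4j_connection(uri: str, user: str, password: str) -> bool:
    """Return True if the Neo4j instance is reachable with these credentials."""
    from neo4j import GraphDatabase  # pip install neo4j

    try:
        with GraphDatabase.driver(uri, auth=(user, password)) as driver:
            driver.verify_connectivity()  # raises on bad URI or bad credentials
        return True
    except Exception as exc:
        print(f"Neo4j connection failed: {exc}")
        return False
```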
Warnings
- breaking LlamaIndex v0.10+ introduced a major package refactor. All integrations, including Neo4jVectorStore, are now standalone PyPI packages. Direct imports from `llama_index.vector_stores` are no longer valid for integration classes.
- gotcha Mismatch between `embedding_dimension` specified in `Neo4jVectorStore` and the actual dimension of indexed vectors in the Neo4j database will cause query failures (e.g., `java.lang.IllegalArgumentException: Index query vector has X dimensions, but indexed vectors have Y.`).
- gotcha Creating vector indexes in Neo4j requires a database version with vector index support (Neo4j 5.11 or later). Using older versions may result in `Invalid input 'VECTOR'` or similar Cypher errors when the store attempts to create an index.
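To catch the dimension-mismatch gotcha up front, you can list existing vector indexes and their configured dimensions with the `neo4j` driver. A sketch using standard Neo4j 5 `SHOW INDEXES` syntax; connection details are placeholders:

```python
# Cypher to list vector indexes and their configured dimensions.
DIMENSION_QUERY = """
SHOW INDEXES YIELD name, type, options
WHERE type = 'VECTOR'
RETURN name, options.indexConfig['vector.dimensions'] AS dimensions
"""


def fetch_vector_index_dimensions(uri: str, user: str, password: str) -> dict:
    """Return {index_name: dimensions} for all vector indexes in the database."""
    from neo4j import GraphDatabase  # pip install neo4j

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        records, _, _ = driver.execute_query(DIMENSION_QUERY)
        return {record["name"]: record["dimensions"] for record in records}
```

Compare the returned dimensions against the `embedding_dimension` you pass to `Neo4jVectorStore` before indexing new documents.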
Install
```shell
pip install llama-index-vector-stores-neo4jvector llama-index-llms-openai llama-index-embeddings-openai neo4j
```
Imports
- Neo4jVectorStore

```python
from llama_index.vector_stores.neo4jvector import Neo4jVectorStore
```

- VectorStoreIndex

```python
from llama_index.core import VectorStoreIndex
```
Quickstart
```python
import os

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core.schema import Document
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.neo4jvector import Neo4jVectorStore

# Ensure you have a running Neo4j instance (e.g., Docker, AuraDB) and set:
# NEO4J_URI = "bolt://localhost:7687"
# NEO4J_USERNAME = "neo4j"
# NEO4J_PASSWORD = "password"
# OPENAI_API_KEY = "sk-..."

# Configure the embedding model (its output must match embedding_dimension below)
Settings.embed_model = OpenAIEmbedding()  # or any other embedding model

# Neo4j connection details
neo4j_url = os.environ.get("NEO4J_URI", "bolt://localhost:7687")
neo4j_username = os.environ.get("NEO4J_USERNAME", "neo4j")
neo4j_password = os.environ.get("NEO4J_PASSWORD", "password")
embedding_dimension = 1536  # OpenAI's default embedding dimension

# Initialize Neo4jVectorStore
try:
    neo4j_vector_store = Neo4jVectorStore(
        username=neo4j_username,
        password=neo4j_password,
        url=neo4j_url,
        embedding_dimension=embedding_dimension,
        index_name="vector",
        node_label="Chunk",
        embedding_node_property="embedding",
        text_node_property="text",
    )
except ValueError as e:
    print(f"Error connecting to Neo4j: {e}. Ensure Neo4j is running and credentials are correct.")
    raise SystemExit(1)

# Load documents from a 'data' directory; fall back to a dummy document.
# e.g., echo "This is a test document about LlamaIndex and Neo4j integration." > data/test.txt
try:
    documents = SimpleDirectoryReader("data").load_data()
except Exception as e:
    print(f"Error loading documents: {e}. Falling back to a dummy document.")
    documents = [
        Document(text="LlamaIndex integrates with Neo4j to provide a powerful vector store for RAG applications.")
    ]

# Build the index on top of the Neo4j vector store via a StorageContext
storage_context = StorageContext.from_defaults(vector_store=neo4j_vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Query the index (uses the default OpenAI LLM for response synthesis)
query_engine = index.as_query_engine()
response = query_engine.query("What is LlamaIndex?")
print(response)

# Retrieve nodes directly (without LLM synthesis)
retriever = index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("How does LlamaIndex work with Neo4j?")
for node in nodes:
    print(f"Retrieved Node: {node.text[:100]}...")

# Clean up (optional, depends on your use case)
# neo4j_vector_store._driver.close()
```
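Once embeddings are stored in Neo4j, a later process can rebuild a queryable index without re-ingesting documents via `VectorStoreIndex.from_vector_store`. A sketch (the import is deferred so the snippet loads without the packages installed):

```python
def load_existing_index(store):
    """Rebuild a queryable VectorStoreIndex from vectors already stored in Neo4j,
    without re-running document ingestion or re-computing embeddings."""
    from llama_index.core import VectorStoreIndex

    return VectorStoreIndex.from_vector_store(vector_store=store)
```

The same `index_name`, `node_label`, and property names must be passed to `Neo4jVectorStore` as were used at ingestion time, or the store will not find the existing vectors.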