LlamaIndex Pinecone Vector Store Integration
This library provides the integration for using Pinecone as a vector store backend within LlamaIndex applications. It enables storing and retrieving document embeddings in a Pinecone index for efficient semantic search and Retrieval-Augmented Generation (RAG). As of version 0.8.0 it follows LlamaIndex's modular architecture and must be installed as a separate package from the core LlamaIndex library. It has a frequent release cadence that often tracks LlamaIndex core updates.
Common errors
- `AttributeError: 'PineconeVectorStore' object has no attribute 'service_context'`
  - cause: `ServiceContext` was deprecated in LlamaIndex v0.10.0, and `PineconeVectorStore` no longer relies on it directly.
  - fix: Remove any explicit `service_context` usage when initializing `PineconeVectorStore` or `VectorStoreIndex`. Configure LLM and embedding models directly using `Settings` or by passing them as arguments to `VectorStoreIndex.from_documents()`.
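A minimal sketch of the `Settings`-based replacement for `ServiceContext`, assuming the `llama-index-llms-openai` and `llama-index-embeddings-openai` packages are installed (model names shown are illustrative):

```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure models globally instead of passing a ServiceContext
Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
```

With this in place, `VectorStoreIndex.from_documents(...)` picks up both models from `Settings` with no per-call configuration.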
- `pinecone.exceptions.PineconeException: The dimension of the vectors to be upserted (X) does not match the dimension of the index (Y).`
  - cause: The vector dimension generated by your embedding model does not match the dimension specified when creating the Pinecone index.
  - fix: Ensure the `dimension` parameter in `pc.create_index()` matches the output dimension of your embedding model. For example, if using OpenAI's `text-embedding-ada-002`, the dimension should be 1536.
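One way to keep the two in sync is a small lookup table (the dimensions below are OpenAI's published values; the helper name is illustrative):

```python
# Output dimensions for common OpenAI embedding models (published values)
EMBED_DIMS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def index_dimension(model_name: str) -> int:
    # Fail fast (KeyError) on an unknown model rather than
    # silently creating an index with a mismatched dimension
    return EMBED_DIMS[model_name]
```

Pass `index_dimension(model_name)` as the `dimension` argument to `pc.create_index()` so changing the embedding model cannot drift from the index configuration.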
- `Index 'your-index-name' is not ready. Please wait a few seconds and try again.`
  - cause: A newly created Pinecone index can take a short time to become active and ready for operations.
  - fix: Implement a retry mechanism with a short delay (e.g., `time.sleep(5)`), or check `pc.describe_index(index_name).status` before proceeding with upserts or queries.
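A sketch of such a readiness poll, assuming a v3+ `pinecone` SDK where `describe_index(...).status` exposes a `ready` flag (the helper name and the exact shape of `status` are assumptions; recent SDKs have returned it as either a dict or an object attribute, so both are handled):

```python
import time

def wait_for_index(pc, index_name, timeout=60, interval=2):
    # Poll describe_index until the index reports ready, or give up after `timeout` seconds
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = pc.describe_index(index_name).status
        # `status` may be a plain dict or an object with a `ready` attribute
        ready = status.get("ready", False) if isinstance(status, dict) else getattr(status, "ready", False)
        if ready:
            return True
        time.sleep(interval)
    return False
```

Call `wait_for_index(pc, index_name)` after `pc.create_index(...)` and before the first upsert.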
Warnings
- breaking LlamaIndex v0.10.0 introduced a major packaging refactor. Core components moved to `llama-index-core`, and integrations such as the Pinecone vector store are now separate PyPI packages (`llama-index-vector-stores-pinecone`).
- gotcha Pinecone index dimensions must match the embedding model's output dimension. Mismatched dimensions will lead to errors during upsert operations.
- gotcha Inconsistent or empty query results from Pinecone often stem from issues with API keys, index state, overly restrictive filters, or problems during document ingestion.
- gotcha Compatibility issues can arise when `pinecone-client` is installed alongside other libraries that also depend on it (e.g., `langchain-pinecone`), leading to version downgrades or conflicts.
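When diagnosing the version-conflict gotcha above, it can help to check what is actually installed. A stdlib-only sketch (the helper name is illustrative):

```python
from importlib import metadata

def installed_version(package_name):
    # Return the installed distribution's version string, or None if absent
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

# e.g. compare installed_version("pinecone-client") against the version
# range your other dependencies (such as langchain-pinecone) expect
```

If two libraries pin incompatible ranges, `pip install` will report the conflict; pinning an explicit version that satisfies both is usually the way out.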
Install
- `pip install llama-index-vector-stores-pinecone`
- `pip install llama-index llama-index-vector-stores-pinecone pinecone-client openai`
Imports
- PineconeVectorStore
from llama_index.vector_stores.pinecone import PineconeVectorStore
- VectorStoreIndex
from llama_index.core import VectorStoreIndex
- SimpleDirectoryReader
from llama_index.core import SimpleDirectoryReader
- StorageContext
from llama_index.core import StorageContext
- Pinecone
from pinecone import Pinecone
Quickstart
import os
from pinecone import Pinecone, ServerlessSpec
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore
# Set your API keys (replace with actual keys or use environment variables)
os.environ['PINECONE_API_KEY'] = os.environ.get('PINECONE_API_KEY', 'YOUR_PINECONE_API_KEY')
os.environ['OPENAI_API_KEY'] = os.environ.get('OPENAI_API_KEY', 'YOUR_OPENAI_API_KEY')
# Initialize Pinecone
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])
index_name = "quickstart-index"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,  # Dimension for OpenAI's text-embedding-ada-002
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-west-2"),
    )
pinecone_index = pc.Index(index_name)
# Initialize PineconeVectorStore
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
# Load documents (create a 'data' directory with text files or adjust path)
try:
    documents = SimpleDirectoryReader(input_dir="./data").load_data()
except (FileNotFoundError, ValueError):
    # Recent SimpleDirectoryReader versions raise ValueError for a missing directory
    print("Please create a 'data' directory and add some text files, or modify the SimpleDirectoryReader path.")
    documents = []
if documents:
    # Set up StorageContext
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    # Create VectorStoreIndex
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
    # Query the index
    query_engine = index.as_query_engine()
    response = query_engine.query("What is this document about?")
    print(response.response)
else:
    print("No documents loaded. Skipping index creation and query.")
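As a variation on the quickstart, `PineconeVectorStore` also accepts a `namespace` argument (supported in recent versions of the integration) to partition vectors within a single index, for example per tenant or per dataset:

```python
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Assumes `pinecone_index` was created as in the quickstart above
vector_store = PineconeVectorStore(
    pinecone_index=pinecone_index,
    namespace="tenant-a",  # keep each tenant's vectors logically separated
)
```

Queries through an index built on this store are then scoped to that namespace, so vectors upserted under other namespaces are never returned.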