LlamaIndex Pinecone Vector Store Integration

0.8.0 · active · verified Thu Apr 16

This library integrates Pinecone as a vector store backend for LlamaIndex applications. It stores and retrieves document embeddings in a Pinecone index for semantic search and Retrieval-Augmented Generation (RAG). As of version 0.8.0, it follows LlamaIndex's modular architecture, so it is installed as a separate package from the core LlamaIndex library. Releases are frequent and often align with LlamaIndex core updates.
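Because the integration is installed separately from the core library, it can help to confirm that the expected distributions are present before debugging an import failure. The following is a minimal sketch using only the standard library. The distribution names `llama-index-core`, `llama-index-vector-stores-pinecone`, and `pinecone` are assumptions based on the usual PyPI naming (older environments may have `pinecone-client` instead), and `installed_versions` is a hypothetical helper, not part of LlamaIndex or Pinecone.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed PyPI distribution names; adjust if your environment differs.
    expected = [
        "llama-index-core",
        "llama-index-vector-stores-pinecone",
        "pinecone",
    ]
    for name, ver in installed_versions(expected).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

If the integration is reported as missing, installing it with `pip install llama-index-vector-stores-pinecone` is the usual remedy.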

Common errors

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to set up a Pinecone index, initialize `PineconeVectorStore`, load documents using `SimpleDirectoryReader`, and build a `VectorStoreIndex` for querying. It assumes `PINECONE_API_KEY` and `OPENAI_API_KEY` are set as environment variables.

import os
from pinecone import Pinecone, ServerlessSpec
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Read API keys from the environment. The placeholder fallbacks below will not
# authenticate; replace them with real keys or export the variables beforehand.
os.environ['PINECONE_API_KEY'] = os.environ.get('PINECONE_API_KEY', 'YOUR_PINECONE_API_KEY')
os.environ['OPENAI_API_KEY'] = os.environ.get('OPENAI_API_KEY', 'YOUR_OPENAI_API_KEY')

# Initialize Pinecone
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])

index_name = "quickstart-index"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,  # Must match the embedding size (1536 for OpenAI's text-embedding-ada-002)
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-west-2")
    )

pinecone_index = pc.Index(index_name)

# Initialize PineconeVectorStore
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)

# Load documents (create a 'data' directory with text files or adjust the path)
try:
    documents = SimpleDirectoryReader(input_dir="./data").load_data()
except (FileNotFoundError, ValueError):
    # SimpleDirectoryReader is expected to raise ValueError for a missing or empty directory
    print("Please create a 'data' directory and add some text files, or modify the SimpleDirectoryReader path.")
    documents = []

if documents:
    # Set up StorageContext
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    # Create VectorStoreIndex
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

    # Query the index
    query_engine = index.as_query_engine()
    response = query_engine.query("What is this document about?")
    print(response.response)
else:
    print("No documents loaded. Skipping index creation and query.")
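
The quickstart above assumes both API keys are available, but its placeholder fallbacks mean a missing key only surfaces later as an authentication error. A pre-flight check along these lines can fail earlier and more clearly. This is a sketch under stated assumptions: `missing_env_vars` is a hypothetical helper, and treating values that start with `YOUR_` as unset is a convention chosen here to match the quickstart's placeholders.

```python
import os


def missing_env_vars(required, env=None):
    """Return names from `required` that are unset, blank, or a YOUR_... placeholder."""
    env = os.environ if env is None else env
    missing = []
    for name in required:
        value = env.get(name, "").strip()
        # Treat the quickstart's placeholder strings as if the key were absent.
        if not value or value.startswith("YOUR_"):
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_env_vars(["PINECONE_API_KEY", "OPENAI_API_KEY"])
    if absent:
        print("Set these environment variables first: " + ", ".join(absent))
    else:
        print("All required API keys are present.")
```

Passing a plain dictionary as `env` keeps the helper easy to test without touching the real process environment.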
