LangChain Pinecone Integration

0.2.13 · active · verified Sun Apr 12

langchain-pinecone is an integration package that connects LangChain applications with Pinecone, a managed vector database. It provides a LangChain `VectorStore` implementation for storing, retrieving, and managing vector embeddings to power AI search, recommendation, and retrieval-augmented generation features. The current version is 0.2.13; the package follows the frequent release cadence of the broader LangChain ecosystem.

Warnings

- This package targets the v3+ `pinecone` client. The legacy `environment` init parameter applies only to older pod-based deployments; serverless indexes specify `cloud` and `region` via `ServerlessSpec` instead.
- The embedding model's output dimension must match the index's `dimension` (1536 for OpenAI's `text-embedding-ada-002`), otherwise upserts will fail.

Install
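The package is published on PyPI. A typical install that also pulls in the extra packages used by the quickstart below (`langchain-openai` and `langchain-core` are only needed for this example; the `pinecone` client is installed as a dependency):

```shell
pip install -U langchain-pinecone langchain-openai langchain-core
```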

Imports
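The quickstart below relies on the following imports, shown here for reference. Note that `OpenAIEmbeddings` lives in the separate `langchain-openai` package, and `Pinecone`/`ServerlessSpec` come from the v3+ `pinecone` client:

```python
from langchain_pinecone import PineconeVectorStore  # LangChain <-> Pinecone vector store
from langchain_openai import OpenAIEmbeddings       # embedding model wrapper
from langchain_core.documents import Document       # LangChain document container
from pinecone import Pinecone, ServerlessSpec       # Pinecone v3 client and index spec
```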

Quickstart

This quickstart demonstrates how to initialize the Pinecone client, create a Pinecone index if it doesn't already exist, embed documents using OpenAIEmbeddings, store them in the Pinecone vector store, and perform a similarity search. Ensure your `PINECONE_API_KEY` and `OPENAI_API_KEY` are set as environment variables or substituted into the code.

import os
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from pinecone import Pinecone, ServerlessSpec

# --- Configuration (replace with your actual keys and environment) ---
# It's recommended to set these as environment variables.
PINECONE_API_KEY = os.environ.get("PINECONE_API_KEY", "YOUR_PINECONE_API_KEY")
PINECONE_ENVIRONMENT = os.environ.get("PINECONE_ENVIRONMENT", "gcp-starter")  # Legacy setting; only needed for pod-based (pre-v3) deployments
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "YOUR_OPENAI_API_KEY")

if PINECONE_API_KEY == "YOUR_PINECONE_API_KEY" or OPENAI_API_KEY == "YOUR_OPENAI_API_KEY":
    print("Warning: Please set PINECONE_API_KEY and OPENAI_API_KEY environment variables.")
    print("Quickstart will likely fail due to missing credentials.")

index_name = "my-langchain-test-index"
dimension = 1536 # OpenAI text-embedding-ada-002 model dimension
metric = "cosine"

# --- Initialize Pinecone Client (pinecone client v3.x or later) ---
# Note: the v3 client no longer takes an `environment` argument; for serverless
# indexes, the cloud and region are supplied via ServerlessSpec below.
try:
    pc = Pinecone(api_key=PINECONE_API_KEY)
except Exception as e:
    print(f"Error initializing Pinecone client: {e}")
    exit(1)

# --- Create/Connect to Pinecone Index ---
import time  # used below to poll for index readiness

if index_name not in pc.list_indexes().names():
    print(f"Creating Pinecone index '{index_name}'...")
    pc.create_index(
        name=index_name,
        dimension=dimension,
        metric=metric,
        spec=ServerlessSpec(cloud="aws", region="us-west-2")  # Adjust cloud/region as needed
    )
    # Wait until the new index is ready before upserting into it.
    while not pc.describe_index(index_name).status["ready"]:
        time.sleep(1)
    print(f"Index '{index_name}' created.")
else:
    print(f"Connecting to existing Pinecone index '{index_name}'.")

# --- Initialize Embeddings Model ---
# The model's output dimension must match the index dimension (1536 for ada-002).
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", api_key=OPENAI_API_KEY)

# --- Prepare Documents ---
documents = [
    Document(page_content="The quick brown fox jumps over the lazy dog."),
    Document(page_content="A computer is an electronic device that processes data."),
    Document(page_content="LangChain is a framework for developing applications powered by language models."),
    Document(page_content="Pinecone is a vector database for building AI applications.")
]

# --- Create or Connect to the Vector Store from Documents ---
# This method handles embedding and upserting the documents.
print("Adding documents to Pinecone vector store...")
vectorstore = PineconeVectorStore.from_documents(
    documents, embeddings, index_name=index_name
)
print("Documents added.")

# --- Perform a Similarity Search ---
query = "What is LangChain?"
print(f"\nPerforming similarity search for: '{query}'")
results = vectorstore.similarity_search(query, k=1)

print("\nSearch Results:")
for doc in results:
    print(f"- Content: {doc.page_content}")

# --- Optional: Clean up ---
# print(f"\nDeleting index '{index_name}' for cleanup...")
# pc.delete_index(index_name)
# print(f"Index '{index_name}' deleted.")
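The index above is created with `metric="cosine"`, meaning Pinecone ranks results by the cosine similarity between the query embedding and stored embeddings. As a minimal, Pinecone-free illustration of what that metric computes (pure Python, toy vectors chosen for this sketch):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (||a|| * ||b||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same direction score ~1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ≈ 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```

Because cosine similarity depends only on direction, not magnitude, documents with semantically similar embeddings score highly regardless of vector length.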
