Cohere Rerank Postprocessor for LlamaIndex

0.8.0 · active · verified Thu Apr 16

The llama-index-postprocessor-cohere-rerank library integrates Cohere's Rerank API with LlamaIndex, a data framework for LLM applications. The postprocessor improves the relevance of retrieved documents in Retrieval-Augmented Generation (RAG) pipelines by re-ranking them with a dedicated semantic relevance model. It is distributed as part of the broader LlamaIndex ecosystem of integration packages.
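Conceptually, a reranker takes the retriever's initial candidate list, scores each candidate against the query with a relevance model, and keeps only the top_n results. The following is a minimal offline sketch of that flow; the word-overlap scorer is a toy stand-in for a real rerank model such as Cohere's, not its actual scoring.

```python
# Illustrative rerank-then-truncate flow. The overlap scorer below is a
# toy stand-in for a learned relevance model like Cohere Rerank.

def toy_relevance(query: str, doc: str) -> float:
    # Fraction of query words that also appear in the document (toy metric).
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0

def rerank(query: str, candidates: list[str], top_n: int) -> list[str]:
    # Score every retrieved candidate, then keep the top_n by score.
    ranked = sorted(candidates, key=lambda d: toy_relevance(query, d), reverse=True)
    return ranked[:top_n]

docs = [
    "Cohere Embed produces text embeddings",
    "LlamaIndex is a data framework",
    "Rerank models reorder retrieved documents",
]
best = rerank("rerank retrieved documents", docs, top_n=2)
print(best[0])  # the document with the highest word overlap
```

In a real pipeline the retriever over-fetches (e.g. `similarity_top_k=10`) and the reranker trims the list down, which is exactly what `CohereRerank(top_n=...)` does in the quickstart below.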

Install

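Assuming a standard pip environment, the packages used in the quickstart can be installed as follows (llama-index-readers-web is only needed for the SimpleWebPageReader example):

```shell
pip install llama-index llama-index-postprocessor-cohere-rerank llama-index-readers-web
```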
Quickstart

This quickstart demonstrates how to set up and use the CohereRerank postprocessor with a LlamaIndex VectorStoreIndex: install the necessary packages, set the COHERE_API_KEY environment variable, load some example data, build an index, and configure the query engine to use CohereRerank to refine the retrieval results.

import os
from llama_index.core import VectorStoreIndex
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.readers.web import SimpleWebPageReader

# Requires a Cohere API key; export COHERE_API_KEY before running.

# Load example documents from a web page (requires llama-index-readers-web)
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://docs.cohere.com/v2/docs/cohere-embed"]
)

# Build a VectorStoreIndex
index = VectorStoreIndex.from_documents(documents=documents)

# Initialize the Cohere Rerank postprocessor
cohere_rerank = CohereRerank(
    api_key=os.environ["COHERE_API_KEY"],
    model="rerank-english-v3.0",  # choose an appropriate Cohere rerank model
    top_n=2,  # number of top results to keep after reranking
)

# Create a query engine with the reranker
query_engine = index.as_query_engine(
    similarity_top_k=10,  # over-fetch nodes initially so the reranker has candidates
    node_postprocessors=[cohere_rerank]
)

# Query the index
response = query_engine.query("What is Cohere's Embed Model?")

print(response.response)
# Optionally inspect the reranked source nodes:
# from llama_index.core.response.pprint_utils import pprint_response
# pprint_response(response, show_source=True)
