OpenTelemetry LlamaIndex Instrumentation
This library provides OpenTelemetry tracing for applications built with LlamaIndex. It allows developers to observe the full lifecycle of LLM-based applications, including RAG pipelines, agents, and underlying LLM calls, by generating OpenTelemetry-compliant spans. The project is actively maintained with frequent releases, often aligning with the evolving OpenTelemetry GenAI semantic conventions.
Warnings
- breaking The OpenTelemetry GenAI semantic conventions are still evolving. Recent versions (0.53.x and later) of `opentelemetry-instrumentation-llamaindex` migrated span attributes to newer revisions of these conventions (e.g., GenAI Semantic Conventions 0.5.0), so dashboards, alerts, and queries keyed to the old attribute names may need updating after an upgrade.
- gotcha This instrumentation specifically targets the LlamaIndex library. It does not automatically instrument all underlying LLM calls (e.g., directly made `openai` or `anthropic` client calls outside of LlamaIndex's abstraction).
- gotcha By default, this instrumentation captures prompts, completions, and embedding inputs/outputs as span attributes. That content is visible to anyone with access to your tracing backend, so evaluate whether capturing it is acceptable before enabling the instrumentation in production.
- gotcha Simply calling `LlamaIndexInstrumentor().instrument()` is insufficient for traces to be collected and exported. A full OpenTelemetry SDK setup, including a `TracerProvider`, `SpanProcessor`, and `SpanExporter`, must be configured and registered.
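If capturing prompt and completion content is not acceptable, the Traceloop-maintained instrumentation packages are commonly configured through the `TRACELOOP_TRACE_CONTENT` environment variable. This is an assumption about your installed version — verify the exact knob against its documentation. A minimal sketch:

```python
import os

# Assumption: the instrumentation reads TRACELOOP_TRACE_CONTENT at
# instrumentation time, so it must be set before instrument() is called.
# Verify the variable name against your installed package version.
os.environ["TRACELOOP_TRACE_CONTENT"] = "false"

from opentelemetry.instrumentation.llamaindex import LlamaIndexInstrumentor

# Spans are still emitted, but prompt/completion text should no longer
# be recorded as span attributes.
LlamaIndexInstrumentor().instrument()
```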
Install
pip install opentelemetry-instrumentation-llamaindex llama-index-core openai
Imports
- LlamaIndexInstrumentor
from opentelemetry.instrumentation.llamaindex import LlamaIndexInstrumentor
Quickstart
import os
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
ConsoleSpanExporter,
SimpleSpanProcessor
)
from opentelemetry.instrumentation.llamaindex import LlamaIndexInstrumentor
# For the LlamaIndex example
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
# --- OpenTelemetry Setup (for console output) ---
# Resource for your service
resource = Resource.create({"service.name": "llamaindex-app"})
# Configure TracerProvider
provider = TracerProvider(resource=resource)
trace.set_tracer_provider(provider)
# Configure Span Exporter to print traces to console
exporter = ConsoleSpanExporter()
span_processor = SimpleSpanProcessor(exporter)
provider.add_span_processor(span_processor)
# --- Instrument LlamaIndex ---
LlamaIndexInstrumentor().instrument()
print("LlamaIndex instrumentation enabled.")
# --- LlamaIndex Application Example ---
# Set OPENAI_API_KEY in your environment before running;
# the placeholder below is not a valid key.
os.environ.setdefault("OPENAI_API_KEY", "sk-YOUR_OPENAI_API_KEY")
# Create a dummy document for LlamaIndex
dummy_data_dir = "./data"
os.makedirs(dummy_data_dir, exist_ok=True)
with open(os.path.join(dummy_data_dir, "test_doc.txt"), "w") as f:
    f.write("The quick brown fox jumps over the lazy dog. LlamaIndex is great for RAG applications.")
# Load documents and create an index
documents = SimpleDirectoryReader(dummy_data_dir).load_data()
# Configure the LLM; indexing uses the default OpenAI embedding model.
llm = OpenAI(model="gpt-3.5-turbo")
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(llm=llm)
print("\nPerforming LlamaIndex query...")
response = query_engine.query("What is LlamaIndex good for?")
print(f"LlamaIndex Response: {response}")
print("\nTraces should be visible in the console.")
# Clean up dummy data (optional)
# import shutil
# if os.path.exists(dummy_data_dir):
#     shutil.rmtree(dummy_data_dir)
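The console exporter above is only suitable for local development. For production, you would typically export to an OTLP-capable backend (such as an OpenTelemetry Collector) with a batching processor. A minimal sketch, assuming the `opentelemetry-exporter-otlp` package is installed and a collector is listening on the default gRPC endpoint:

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.llamaindex import LlamaIndexInstrumentor

resource = Resource.create({"service.name": "llamaindex-app"})
provider = TracerProvider(resource=resource)
trace.set_tracer_provider(provider)

# BatchSpanProcessor buffers spans and exports them asynchronously,
# which is preferable to SimpleSpanProcessor under real traffic.
exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)
provider.add_span_processor(BatchSpanProcessor(exporter))

LlamaIndexInstrumentor().instrument()
```

The endpoint and `insecure=True` flag are illustrative; point them at your collector and enable TLS as appropriate for your deployment.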