OpenTelemetry Haystack Instrumentation
The `opentelemetry-instrumentation-haystack` library provides OpenTelemetry tracing for applications built with Haystack (`haystack-ai`). It automatically captures spans for Haystack operations such as pipeline execution, document store interactions, and LLM calls, and enriches them with relevant attributes. As part of the `openllmetry` project, it follows that project's rapid release cadence (version `0.58.0` at the time of writing), with frequent updates to support new Haystack features and OpenTelemetry semantic conventions.
Warnings
- breaking In releases from roughly 0.53.0 through 0.58.0, this instrumentation package transitioned to the OpenTelemetry Generative AI Semantic Conventions (GenAI SemConv). This significantly changes the names and structure of span attributes related to LLM operations.
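When debugging the SemConv transition, it can help to separate GenAI-convention attributes from everything else on an exported span. The sketch below is a plain-Python helper (the sample attribute dict is illustrative; `gen_ai.request.model` is a GenAI SemConv key, while `traceloop.entity.name`-style keys come from the older openllmetry conventions):

```python
# Hedged sketch: filter a span's exported attributes down to the
# "gen_ai." namespace used by the GenAI Semantic Conventions.
def genai_attributes(attributes: dict) -> dict:
    """Return only the GenAI semantic-convention attributes of a span."""
    return {k: v for k, v in attributes.items() if k.startswith("gen_ai.")}

# Illustrative attribute dict, not captured from a real span:
sample = {
    "gen_ai.request.model": "gpt-3.5-turbo",  # GenAI SemConv key
    "traceloop.entity.name": "generator",     # legacy-style key, filtered out
}
print(genai_attributes(sample))  # -> {'gen_ai.request.model': 'gpt-3.5-turbo'}
```

Running your span exporter's output through a filter like this makes it easy to confirm which convention a given package version actually emits.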
- gotcha This instrumentation targets the newer `haystack-ai` package, not the legacy `farm-haystack`. If you are using `farm-haystack`, you might experience issues or missing instrumentation for newer features.
- gotcha The `HaystackInstrumentor().instrument()` call must occur before any Haystack components (e.g., `Pipeline`, `DocumentStore`, `Generator`) are initialized or used. If Haystack objects are created before instrumentation is enabled, their operations will not be traced.
Install
pip install opentelemetry-instrumentation-haystack haystack-ai openai
Imports
- HaystackInstrumentor
from opentelemetry.instrumentation.haystack import HaystackInstrumentor
Quickstart
import os
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
from opentelemetry.instrumentation.haystack import HaystackInstrumentor
# 1. Initialize OpenTelemetry TracerProvider
resource = Resource.create({"service.name": "haystack-app"})
provider = TracerProvider(resource=resource)
# Use ConsoleSpanExporter for demonstration; replace with OTLPSpanExporter for production
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
# 2. Instrument Haystack
HaystackInstrumentor().instrument()
# 3. Haystack Usage (requires 'haystack-ai' and 'openai' to be installed)
try:
    # haystack-ai installs under the 'haystack' import root
    from haystack import Document, Pipeline
    from haystack.components.builders.prompt_builder import PromptBuilder
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    # Set OPENAI_API_KEY in the environment for the generator.
    # In a real application, retrieve it from a secure source.
    if not os.environ.get("OPENAI_API_KEY"):
        print("Warning: OPENAI_API_KEY not set. OpenAIGenerator will fail without it.")

    # Create a simple document store and add documents
    document_store = InMemoryDocumentStore()
    documents = [
        Document(content="The quick brown fox jumps over the lazy dog."),
        Document(content="Haystack is an open-source framework for building custom LLM applications."),
        Document(content="OpenTelemetry provides observability for your applications."),
    ]
    document_store.write_documents(documents)

    # Create a Haystack pipeline
    pipeline = Pipeline()
    pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
    pipeline.add_component("prompt_builder", PromptBuilder(template="Answer the question: {{query}}\nContext: {{documents}}"))
    pipeline.add_component("generator", OpenAIGenerator(model="gpt-3.5-turbo"))
    pipeline.connect("retriever", "prompt_builder.documents")
    pipeline.connect("prompt_builder", "generator.prompt")

    # Run the pipeline; inputs are keyed by component name
    query = "What is Haystack?"
    print(f"\nRunning Haystack pipeline with query: '{query}'")
    result = pipeline.run({
        "retriever": {"query": query},
        "prompt_builder": {"query": query},
    })
    print("\nHaystack Pipeline Result:")
    print(result)
except ImportError as e:
    print(f"\nSkipping Haystack example due to missing dependencies: {e}")
    print("Please install 'haystack-ai' and 'openai': pip install haystack-ai openai")
except Exception as e:
    print(f"\nAn error occurred during Haystack pipeline execution: {e}")
    print("Ensure OPENAI_API_KEY is set and your network is accessible.")
# 4. Shut down the provider to ensure all spans are exported
provider.shutdown()