OpenTelemetry Haystack Instrumentation

0.58.0 · active · verified Fri Apr 10

The `opentelemetry-instrumentation-haystack` library provides OpenTelemetry tracing for applications built with Haystack (`haystack-ai`). It automatically captures spans for Haystack operations such as pipeline execution, document store interactions, and LLM calls, and enriches them with relevant attributes. As part of the `openllmetry` project it follows that project's rapid release cadence; version `0.58.0` is the latest, with frequent updates to track new Haystack features and OpenTelemetry semantic conventions.

Install

pip install opentelemetry-instrumentation-haystack

Imports

from opentelemetry.instrumentation.haystack import HaystackInstrumentor

Quickstart

This quickstart demonstrates how to enable OpenTelemetry tracing for a simple Haystack pipeline. It initializes an OpenTelemetry `TracerProvider` with a `ConsoleSpanExporter` (for printing traces to console), instruments Haystack using `HaystackInstrumentor().instrument()`, and then constructs and runs a basic Haystack RAG pipeline. Replace `ConsoleSpanExporter` with `OTLPSpanExporter` for sending traces to a collector. Ensure `haystack-ai` and `openai` are installed and `OPENAI_API_KEY` is set in your environment for the generator to function.

import os
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
from opentelemetry.instrumentation.haystack import HaystackInstrumentor

# 1. Initialize OpenTelemetry TracerProvider
resource = Resource.create({"service.name": "haystack-app"})
provider = TracerProvider(resource=resource)
# Use ConsoleSpanExporter for demonstration; replace with OTLPSpanExporter for production
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

# 2. Instrument Haystack
HaystackInstrumentor().instrument()

# 3. Haystack Usage (requires 'haystack-ai' and 'openai' to be installed)
try:
    # The haystack-ai distribution installs the `haystack` package
    from haystack import Document, Pipeline
    from haystack.components.builders.prompt_builder import PromptBuilder
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    # The generator reads OPENAI_API_KEY from the environment; warn early if it is missing.
    # For a real application, retrieve the key from a secure source.
    if not os.environ.get("OPENAI_API_KEY"):
        print("Warning: OPENAI_API_KEY is not set; OpenAIGenerator calls will fail.")

    # Create a simple document store and add documents
    document_store = InMemoryDocumentStore()
    documents = [
        Document(content="The quick brown fox jumps over the lazy dog."),
        Document(content="Haystack is an open-source framework for building custom LLM applications."),
        Document(content="OpenTelemetry provides observability for your applications.")
    ]
    document_store.write_documents(documents)

    # Create a Haystack pipeline
    pipeline = Pipeline()
    pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
    # PromptBuilder templates are Jinja; iterate over the documents so their
    # contents are rendered rather than the raw Document reprs.
    pipeline.add_component("prompt_builder", PromptBuilder(
        template="Answer the question: {{ query }}\nContext:\n"
                 "{% for doc in documents %}{{ doc.content }}\n{% endfor %}"
    ))
    pipeline.add_component("generator", OpenAIGenerator(model="gpt-3.5-turbo"))

    pipeline.connect("retriever", "prompt_builder.documents")
    pipeline.connect("prompt_builder", "generator.prompt")

    # Run the pipeline
    query = "What is Haystack?"
    print(f"\nRunning Haystack pipeline with query: '{query}'")
    # Inputs are keyed by component name; both the retriever and the
    # prompt builder need the query.
    result = pipeline.run({
        "retriever": {"query": query},
        "prompt_builder": {"query": query},
    })

    print("\nHaystack Pipeline Result:")
    print(result)

except ImportError as e:
    print(f"\nSkipping Haystack example due to missing dependencies: {e}")
    print("Please install 'haystack-ai' and 'openai': pip install haystack-ai openai")
except Exception as e:
    print(f"\nAn error occurred during Haystack pipeline execution: {e}")
    print("Ensure OPENAI_API_KEY is set if using OpenAIGenerator and your network is accessible.")

# 4. Shut down the provider to ensure all spans are exported
provider.shutdown()
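As noted above, `ConsoleSpanExporter` is only for demonstration; in production you would send spans to a collector instead. A minimal sketch of that swap, assuming the `opentelemetry-exporter-otlp` package is installed and an OTLP/gRPC collector is listening on the default local endpoint:

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
# Requires: pip install opentelemetry-exporter-otlp
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.haystack import HaystackInstrumentor

provider = TracerProvider(resource=Resource.create({"service.name": "haystack-app"}))
# BatchSpanProcessor buffers spans and exports them asynchronously,
# which is preferable to SimpleSpanProcessor outside of demos.
# "localhost:4317" is the OTLP/gRPC default; point it at your collector.
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

# Instrument Haystack exactly as in the quickstart; only the exporter changed.
HaystackInstrumentor().instrument()
```

The rest of the quickstart (pipeline construction, `pipeline.run`, and the final `provider.shutdown()` to flush buffered spans) is unchanged.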
