OpenTelemetry Ollama Instrumentation

0.58.0 · active · verified Thu Apr 09

This library provides OpenTelemetry instrumentation for tracing calls to Ollama's endpoints made with the official Ollama Python Library. It is part of the OpenLLMetry project and is released frequently, often several times a month, tracking the evolving OpenTelemetry semantic conventions for generative-AI applications.

Warnings

Install
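The instrumentation is distributed on PyPI; assuming the conventional package name `opentelemetry-instrumentation-ollama`, it can be installed with pip alongside the Ollama client and the OpenTelemetry SDK (which the quickstart's console exporter requires):

```shell
# Install the instrumentation, the OpenTelemetry SDK, and the Ollama client
pip install opentelemetry-instrumentation-ollama opentelemetry-sdk ollama
```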

Imports
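For reference, these are the imports the quickstart below relies on: the instrumentor class from this package, the OpenTelemetry API and SDK pieces used to configure tracing, and the Ollama client itself.

```python
# OpenTelemetry API: global tracer provider access
from opentelemetry import trace
# OpenTelemetry SDK: resource metadata, provider, and console export
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
# This package's instrumentor
from opentelemetry.instrumentation.ollama import OllamaInstrumentor
# The official Ollama Python client
import ollama
```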

Quickstart

This quickstart demonstrates how to set up a basic OpenTelemetry `TracerProvider` that exports traces to the console, and then instruments the `ollama` client to automatically capture traces for chat and generate operations. Ensure you have the `ollama` client library installed and an Ollama server running locally with the `llama2` model pulled.

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor
from opentelemetry.instrumentation.ollama import OllamaInstrumentor
import ollama

# 1. Set up OpenTelemetry Tracer Provider
# This simple setup exports traces to the console.
resource = Resource.create(attributes={"service.name": "ollama-app"})
provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(ConsoleSpanExporter())
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# 2. Instrument Ollama
OllamaInstrumentor().instrument()

# 3. Use Ollama client (ensure ollama server is running and model pulled)
try:
    print("\n--- Making an Ollama chat request ---")
    chat_response = ollama.chat(
        model='llama2',
        messages=[{'role': 'user', 'content': 'Why is the sky blue?'}]
    )
    print("Ollama Chat Response:")
    print(chat_response['message']['content'])
    print("\n--- Trace for chat request should be visible above ---")

    print("\n--- Making an Ollama generate request (streaming) ---")
    stream = ollama.generate(
        model='llama2',
        prompt='Tell me a short story about a space-faring cat.',
        stream=True
    )
    full_response = ""
    print("Streaming Ollama Response:")
    for chunk in stream:
        if 'response' in chunk:
            full_response += chunk['response']
            print(chunk['response'], end='', flush=True)
    print("\n\n--- Trace for streaming request should be visible above ---")

except Exception as e:
    print(f"\nError interacting with Ollama: {e}")
    print("Please ensure the Ollama server is running (e.g., `ollama serve`) ")
    print("and the 'llama2' model is pulled (e.g., `ollama pull llama2`).")

# 4. Flush buffered spans before exit.
# BatchSpanProcessor exports in batches, so shut down the provider
# to ensure any remaining spans are written to the console.
provider.shutdown()
