{"id":2636,"library":"opentelemetry-instrumentation-marqo","title":"OpenTelemetry Marqo Instrumentation","description":"This library provides OpenTelemetry instrumentation for the Marqo vector database Python client. It enables automatic tracing of client-side calls to Marqo, helping developers gain observability into their LLM and vector search applications. Part of the larger OpenLLMetry project, it adheres to OpenTelemetry semantic conventions for LLM and vector DB operations. The library is actively maintained, with frequent releases (multiple per month) to keep up with updates to semantic conventions and the instrumented libraries.","status":"active","version":"0.58.0","language":"en","source_language":"en","source_url":"https://github.com/traceloop/openllmetry/tree/main/packages/opentelemetry-instrumentation-marqo","tags":["opentelemetry","instrumentation","marqo","vector database","observability","tracing","LLM","AI","traceloop"],"install":[{"cmd":"pip install opentelemetry-instrumentation-marqo marqo","lang":"bash","label":"Install library and Marqo client"}],"dependencies":[{"reason":"This package instruments the 'marqo' client library; 'marqo' must be installed for instrumentation to function.","package":"marqo","optional":false},{"reason":"Core OpenTelemetry SDK components are required for trace creation and export.","package":"opentelemetry-sdk","optional":false},{"reason":"Core OpenTelemetry API components are required.","package":"opentelemetry-api","optional":false}],"imports":[{"symbol":"MarqoInstrumentor","correct":"from opentelemetry.instrumentation.marqo import MarqoInstrumentor"}],"quickstart":{"code":"import os\nfrom opentelemetry import trace\nfrom opentelemetry.sdk.resources import Resource\nfrom opentelemetry.sdk.trace import TracerProvider\nfrom opentelemetry.sdk.trace.export import ConsoleSpanExporter, BatchSpanProcessor\nfrom opentelemetry.trace import set_tracer_provider\n\n# Import the Marqo instrumentation\nfrom 
opentelemetry.instrumentation.marqo import MarqoInstrumentor\n\n# --- OpenTelemetry Setup (Typical for any OTel Python app) ---\n# Set up a TracerProvider\nresource = Resource.create({\"service.name\": \"my-marqo-app\"})\ntracer_provider = TracerProvider(resource=resource)\n\n# Configure a SpanProcessor to export spans to the console\n# In a real application, you would use an OTLPSpanExporter, JaegerExporter, etc.\nconsole_exporter = ConsoleSpanExporter()\nspan_processor = BatchSpanProcessor(console_exporter)\ntracer_provider.add_span_processor(span_processor)\n\n# Set the global TracerProvider\nset_tracer_provider(tracer_provider)\n\n# Get a tracer for manual spans if needed\ntracer = trace.get_tracer(__name__)\n\n# --- Marqo Instrumentation Setup ---\n# Instrument Marqo. This should ideally happen before 'marqo' is imported\n# or its client is instantiated if MarqoInstrumentor().instrument() is called.\n# For this example, we mock Marqo to ensure it's runnable without a live instance.\n\n# Mock Marqo client for demonstration purposes\nclass MockMarqoClient:\n    def index(self, index_name):\n        return MockMarqoIndex(index_name)\n\nclass MockMarqoIndex:\n    def __init__(self, index_name):\n        self.index_name = index_name\n\n    def add_documents(self, documents, tensor_fields, client_batch_size=50):\n        with tracer.start_as_current_span(f\"marqo.index.add_documents: {self.index_name}\"):\n            print(f\"Mock Marqo: Adding {len(documents)} documents to index '{self.index_name}'\")\n            # Simulate Marqo client operation\n            return {\"items\": [{\"_id\": f\"doc{i}\"} for i in range(len(documents))]}\n\n    def search(self, q, searchable_attributes=None, limit=5):\n        with tracer.start_as_current_span(f\"marqo.index.search: {self.index_name}\"):\n            print(f\"Mock Marqo: Searching for '{q}' in index '{self.index_name}'\")\n            # Simulate Marqo client operation\n            return {\"hits\": [{\"_id\": 
\"mock_doc_1\", \"_score\": 0.9}, {\"_id\": \"mock_doc_2\", \"_score\": 0.8}]}\n\n# Enable Marqo instrumentation\n# Note: In a real app, MarqoInstrumentor().instrument() should be called\n# before you import the actual 'marqo' client when instrumenting programmatically.\nMarqoInstrumentor().instrument()\n\n# Simulate using the Marqo client\n# If the Marqo client was imported before instrumentation, you might need to re-import or defer client creation.\nmarqo_client = MockMarqoClient() # In a real app: marqo.Client(url=\"http://localhost:8882\")\n\n# Perform Marqo operations\nmy_index = marqo_client.index(\"my_test_index\")\nmy_index.add_documents(\n    documents=[\n        {\"text\": \"hello world\"},\n        {\"text\": \"another document\"}\n    ],\n    tensor_fields=[\"text\"]\n)\n\nmy_index.search(q=\"world\")\n\nprint(\"Marqo operations simulated with OpenTelemetry instrumentation.\")\n\n# Ensure all spans are processed before exiting\ntracer_provider.shutdown()","lang":"python","description":"This quickstart demonstrates how to set up OpenTelemetry with the Marqo instrumentation. It configures a console exporter to print trace data directly to the terminal, enables the Marqo instrumentor, and then simulates basic Marqo client operations. The Marqo client is mocked to make the example runnable without requiring a live Marqo instance. In a production environment, you would replace `ConsoleSpanExporter` with an appropriate exporter (e.g., `OTLPSpanExporter`) and use the actual `marqo.Client`."},"warnings":[{"fix":"Regularly review the OpenTelemetry GenAI semantic conventions documentation and `opentelemetry-instrumentation-marqo` release notes. Update dashboards and alert queries in your observability backend to reflect new attribute names or structures. Consider using OpenTelemetry Collector processors to normalize attributes if frequent changes are disruptive.","message":"The OpenTelemetry GenAI semantic conventions are actively evolving. 
Recent versions (e.g., 0.53.4 to 0.58.0) have seen significant updates to attribute names and structures for LLM-related telemetry, including vector DB interactions. While changes are additive and backward-compatible at the API level, the actual telemetry data (span attributes) may change, requiring adjustments in your observability backend queries or dashboards.","severity":"breaking","affected_versions":"0.53.x - 0.58.x (and potentially future versions)"},{"fix":"Ensure `from opentelemetry.instrumentation.marqo import MarqoInstrumentor; MarqoInstrumentor().instrument()` is placed at the very beginning of your application's entry point, before any `import marqo` statements or `marqo.Client` instantiations.","message":"For programmatic instrumentation, the `MarqoInstrumentor().instrument()` call must be made *before* the `marqo` library or its client is imported or instantiated in your application code. If `marqo` is imported first, the instrumentation might not apply correctly.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Review the specific implementation details or environment variables (e.g., `TRACELOOP_TRACE_CONTENT=false` for the Traceloop SDK, or `OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT`) to determine how to disable or redact sensitive content logging. Implement appropriate data redaction strategies at the OpenTelemetry Collector level or within your application if necessary. Always be aware of what data your observability pipeline is collecting.","message":"Like many LLM/VectorDB instrumentations, `opentelemetry-instrumentation-marqo` (especially when used with the broader Traceloop SDK) may capture prompts, responses, or document content by default. This data could contain sensitive or personally identifiable information (PII). 
While the `openllmetry` repository states that telemetry is collected only in the SDK, individual instrumentations can still record content on spans.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consider running Gunicorn with a single worker (`--workers 1`), or running Gunicorn with the Uvicorn worker class (`UvicornWorker`) when multiple processes are needed. If multiple workers are essential, set up instrumentation programmatically within each worker process or rely on the `opentelemetry-instrument` wrapper, and verify metric correctness in your observability backend.","message":"When deploying Python applications with multi-process web servers (e.g., Gunicorn with `workers > 1`), OpenTelemetry Python's automatic instrumentation (especially for metrics) can exhibit issues due to the forking model. This can lead to incomplete or incorrect telemetry data.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}