{"id":8382,"library":"opentelemetry-instrumentation-sklearn","title":"OpenTelemetry Scikit-learn Instrumentation","description":"This library provides OpenTelemetry automatic instrumentation for the scikit-learn (sklearn) machine learning library. It enables the collection of telemetry data, such as traces and spans, for scikit-learn operations like model training (`fit`) and prediction (`predict`). The project is actively maintained as part of the broader OpenTelemetry Python Contrib repository, and new versions are published regularly as beta releases.","status":"active","version":"0.46b0","language":"en","source_language":"en","source_url":"https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-sklearn","tags":["opentelemetry","instrumentation","sklearn","scikit-learn","observability","tracing","machine-learning","ml"],"install":[{"cmd":"pip install opentelemetry-instrumentation-sklearn","lang":"bash","label":"Install core instrumentation"},{"cmd":"pip install 'opentelemetry-distro[otlp]' opentelemetry-instrumentation-sklearn","lang":"bash","label":"Install with OTLP exporter and distro"}],"dependencies":[{"reason":"The library instruments scikit-learn; requires an installed version of scikit-learn to function.","package":"scikit-learn","optional":false},{"reason":"Core OpenTelemetry SDK for trace/metric/log providers.","package":"opentelemetry-sdk","optional":false}],"imports":[{"symbol":"SklearnInstrumentor","correct":"from opentelemetry.instrumentation.sklearn import SklearnInstrumentor"}],"quickstart":{"code":"from opentelemetry import trace\nfrom opentelemetry.sdk.resources import Resource\nfrom opentelemetry.sdk.trace import TracerProvider\nfrom opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor\nfrom opentelemetry.instrumentation.sklearn import SklearnInstrumentor\n\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.datasets import 
load_iris\nfrom sklearn.model_selection import train_test_split\n\n# Configure the OpenTelemetry tracer provider\nresource = Resource.create({\"service.name\": \"sklearn-app\"})\nprovider = TracerProvider(resource=resource)\nprocessor = SimpleSpanProcessor(ConsoleSpanExporter())\nprovider.add_span_processor(processor)\ntrace.set_tracer_provider(provider)\n\n# Initialize the scikit-learn instrumentation.\n# The warnings recommend calling this BEFORE sklearn is imported; the imports\n# are grouped at the top here only to keep the example short.\nSklearnInstrumentor().instrument()\n\n# Scikit-learn operations will now be traced\niris = load_iris()\nX, y = iris.data, iris.target\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n\nmodel = LogisticRegression(max_iter=200)\n\nprint(\"\\n--- Training Model ---\")\nmodel.fit(X_train, y_train)\nprint(\"Model training complete.\")\n\nprint(\"\\n--- Making Predictions ---\")\npredictions = model.predict(X_test)\nprint(\"Predictions made.\")\n","lang":"python","description":"This quickstart demonstrates how to instrument scikit-learn operations. It configures a `ConsoleSpanExporter` to print spans to the console, initializes the `SklearnInstrumentor`, and then performs typical scikit-learn `fit` and `predict` operations. You should see spans for these operations in your console output."},"warnings":[{"fix":"Call `SklearnInstrumentor().instrument()` at the very beginning of your application's entry point, before any `import sklearn` statements or direct usage of scikit-learn objects.","message":"The OpenTelemetry instrumentation should be initialized before the `sklearn` library is imported to ensure proper monkey-patching and tracing of operations. 
Importing `sklearn` components before calling `SklearnInstrumentor().instrument()` may result in untraced operations.","severity":"gotcha","affected_versions":"All"},{"fix":"Ensure `scikit-learn` is installed and meets the version requirements of the `opentelemetry-instrumentation-sklearn` package. Review the `instrumentation_dependencies()` method in the source code or the OpenTelemetry documentation for precise version constraints.","message":"A change in OpenTelemetry Python Contrib (around v0.53b0 / 1.32.0) altered how dependency checks are performed. Instrumentors now check for the instrumented library's presence and version *inside* the `instrument()` method. If the target library (scikit-learn in this case) is not installed, or its version is incompatible, `instrument()` may raise an `ImportError` or other exceptions.","severity":"breaking","affected_versions":">=0.53b0 of opentelemetry-instrumentation (parent package), >=1.32.0 of opentelemetry-sdk"},{"fix":"Ensure only one instance of each OpenTelemetry exporter and processor is configured per telemetry signal (traces, metrics, logs) within your application's lifecycle. For pre-fork servers, consider programmatic auto-instrumentation or using a single worker for telemetry-sensitive operations to avoid issues with background threads and locks.","message":"Running multiple OpenTelemetry SDK components (e.g., multiple exporters or processors) can lead to duplicate telemetry. 
This is especially problematic in environments such as 'Always On' Azure Functions or applications running behind pre-fork servers, where processes may persist or be duplicated.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install scikit-learn: `pip install scikit-learn`","cause":"The `opentelemetry-instrumentation-sklearn` package requires `scikit-learn` to be installed, but it is missing from the environment.","error":"ModuleNotFoundError: No module named 'sklearn'"},{"fix":"Ensure `SklearnInstrumentor().instrument()` is called as early as possible in your application's startup, ideally before any `import sklearn` statements. Also, verify that a `TracerProvider` with a `SpanProcessor` is correctly configured and set as the global tracer provider.","cause":"The `SklearnInstrumentor().instrument()` call was made after `sklearn` modules or objects were already imported and initialized, or the OpenTelemetry SDK was not properly configured.","error":"Sklearn operations are not being traced/no spans are generated."},{"fix":"Adjust your `scikit-learn` version to fall within the compatible range specified by `opentelemetry-instrumentation-sklearn`. For the error shown here, that means `pip install 'scikit-learn>=0.24.0,<1.4.0'`.","cause":"A version conflict exists between the installed `scikit-learn` and the versions supported by `opentelemetry-instrumentation-sklearn`.","error":"ERROR: opentelemetry-instrumentation-sklearn 0.46b0 requires scikit-learn>=0.24.0,<1.4.0, but you have scikit-learn 1.4.1 which is incompatible."}]}