MLServer MLflow Runtime
mlserver-mlflow provides an MLflow runtime for MLServer, allowing users to serve models logged with MLflow through the MLServer inference server. It is currently at version 1.7.1 and follows MLServer's release cadence, receiving bug fixes and compatibility updates for new MLflow/MLServer versions.
Common errors
- `ModuleNotFoundError: No module named 'mlserver_mlflow'`
  - cause: The `mlserver-mlflow` package is not installed in the Python environment where MLServer is being run, or the environment is not correctly activated.
  - fix: Install the package: `pip install mlserver-mlflow`
- `mlserver.errors.ModelLoadingError: Failed to load model 'my-model': No module named 'scikit-learn'`
  - cause: The MLflow model being loaded requires a specific Python package (e.g., `scikit-learn`, `xgboost`, `tensorflow`) that is not installed in the `mlserver-mlflow` serving environment.
  - fix: Identify and install the missing dependency, e.g. `pip install scikit-learn`. For comprehensive dependency management, ensure your deployment environment matches the MLflow model's `conda_env` or `pip_requirements`.
- `mlserver.errors.ModelLoadingError: Failed to load model 'my-model': No MLflow model found at URI: 'models:/my-model/Production'`
  - cause: The specified MLflow model URI is incorrect, the model does not exist at the given URI, or the MLflow tracking server/registry is not reachable from the MLServer instance.
  - fix: Double-check the `uri` in your `ModelSettings` (or `model-settings.json`). Verify the model exists in your MLflow Tracking Server or on the file system, and ensure network connectivity to the Tracking Server when using remote URIs.
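To make the `uri` configuration concrete, here is a minimal `model-settings.json` sketch (the model name and registry URI are placeholders; field names follow MLServer's settings schema):

```json
{
    "name": "my-mlflow-model",
    "implementation": "mlserver_mlflow.MLflowRuntime",
    "parameters": {
        "uri": "models:/my-model/Production"
    }
}
```

Registry URIs (`models:/...`) require a reachable MLflow Tracking Server; local file paths and `runs:/` URIs are also accepted wherever MLflow can resolve them.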
Warnings
- breaking MLServer 0.x to 1.x API changes directly impact `mlserver-mlflow` users. If migrating from older MLServer versions, you'll need to update your `model-settings.json` configuration, `ModelSettings` objects, and client inference request/response structures to align with MLServer 1.x's API.
- gotcha Missing dependencies for MLflow models are a common source of errors. MLflow models, especially `pyfunc` types, often define `conda_env` or `pip_requirements`. If these dependencies (e.g., `xgboost`, `tensorflow`, custom packages) are not installed in the environment where `mlserver-mlflow` is running, model loading will fail with `ModuleNotFoundError` or similar.
- gotcha Incorrect or inaccessible MLflow model URIs lead to 'model not found' errors. Users sometimes confuse different URI formats (e.g., run-relative artifact URIs, MLflow Model Registry URIs, local file paths).
Install
pip install mlserver-mlflow
Imports
- MLflowRuntime
from mlserver_mlflow import MLflowRuntime
Quickstart
import os
import tempfile
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression
import numpy as np
import asyncio
from mlserver_mlflow import MLflowRuntime
from mlserver.settings import ModelSettings
from mlserver.types import InferenceRequest, RequestInput
# 1. Create a dummy MLflow model and log it locally
# (In a real scenario, this model would already be logged)
temp_dir = tempfile.TemporaryDirectory()
model_base_path = os.path.join(temp_dir.name, "mlflow_models")
mlflow.set_tracking_uri(f"file://{model_base_path}/mlruns")
with mlflow.start_run():
    model = LogisticRegression()
    model.fit(np.array([[0, 0], [1, 1]]), np.array([0, 1]))
    mlflow.sklearn.log_model(model, "model_artifact")
    # Capture the artifact URI while the run is still active:
    # mlflow.active_run() returns None once the `with` block exits.
    model_uri = mlflow.get_artifact_uri("model_artifact")
# 2. Instantiate and load MLflowRuntime
async def main():
    model_settings = ModelSettings(
        name="my-mlflow-model",
        implementation="mlserver_mlflow.MLflowRuntime",
        parameters={"uri": model_uri},
    )
    mlflow_runtime = MLflowRuntime(model_settings)
    await mlflow_runtime.load()

    # 3. Prepare and send an inference request
    request_input = RequestInput(
        name="predict",
        shape=[1, 2],
        datatype="FP32",
        data=[0.5, 0.5],  # flat row-major data matching shape [1, 2]
    )
    inference_request = InferenceRequest(inputs=[request_input])
    response = await mlflow_runtime.predict(inference_request)
    print("Prediction:", response.outputs[0].data)

    await mlflow_runtime.unload()
    temp_dir.cleanup()  # Clean up temporary model files

asyncio.run(main())
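The quickstart calls the runtime in-process; when the same model is instead served with `mlserver start`, clients send the equivalent request over HTTP using the V2 (Open Inference Protocol) JSON shape. A sketch of the request body a client would POST (the host/port are assumptions based on MLServer's default REST port, 8080; this snippet only builds the payload, it does not contact a server):

```python
import json

# V2 inference payload mirroring the RequestInput from the quickstart.
payload = {
    "inputs": [
        {
            "name": "predict",
            "shape": [1, 2],
            "datatype": "FP32",
            "data": [0.5, 0.5],
        }
    ]
}

# Assumed endpoint for a model named "my-mlflow-model" on a local server.
url = "http://localhost:8080/v2/models/my-mlflow-model/infer"
body = json.dumps(payload)
print(body)
```

A client would POST `body` to `url` with `Content-Type: application/json`; the response carries an `outputs` list in the same V2 tensor format.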