MLflow
MLflow is an open-source platform for managing the entire machine learning lifecycle: experiment tracking, reproducible projects, model management, and deployment. As of this writing, the current stable release is 3.10.1; the project ships frequent patch, minor, and major releases, and major releases introduce new features as well as breaking changes. [9, 16]
Warnings
- breaking MLflow 3.x introduced significant breaking changes, including the complete removal of MLflow Recipes. Many model flavors (fastai, mleap, diviner) are no longer supported directly. The 'routes' and 'route_type' config keys for AI Gateway were removed. The deployment server and `start-server` CLI command have been removed, replaced by `mlflow models serve` or containerized deployments. [1, 3]
- breaking In MLflow 3.x, the `run_uuid` attribute on `RunInfo` objects has been removed and replaced by `run_id`. Additionally, several Git-related run tags (`mlflow.gitBranchName`, `mlflow.gitRepoURL`) were removed. [1]
- breaking The Artifacts tab in the MLflow UI for runs no longer displays model artifacts in MLflow 3.x. Model artifacts are now accessed through a dedicated 'Logged Models' page. [1]
- deprecated Parameters like `example_no_conversion` and `code_path` have been removed from model logging/saving APIs. `requirements_file` for PyTorch flavor is removed, and `inference_config` from Transformers flavor is also removed. [1]
- gotcha When upgrading a self-hosted MLflow server, it is crucial to stop the server, upgrade the package, run database migrations using `mlflow db upgrade <backend-store-url>`, and then restart the server. MLflow does not natively support live upgrades, and schema migrations can be slow and non-transactional. Always back up your database before migration. [16]
- gotcha MLflow clients and servers work best when they are on the same version. While a newer server is generally backward compatible with older clients for basic logging, using newer client features (e.g., MLflow Tracing) with an older server might lead to missing endpoints or unexpected behavior. [16]
Install
-
pip install mlflow
Imports
- mlflow
import mlflow
- MlflowClient
from mlflow.tracking import MlflowClient
- infer_signature
from mlflow.models import infer_signature
Quickstart
import mlflow
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from mlflow.models import infer_signature
# Set the MLflow tracking URI (optional; defaults to a local ./mlruns directory).
# For a local tracking server, run 'mlflow ui' in a terminal and point MLflow at it:
# mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("MLflow_Quickstart_Experiment")
# Load the Iris dataset
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define model hyperparameters
# Define model hyperparameters
# ('multi_class' is omitted: it is deprecated in recent scikit-learn, and "auto" was the default)
params = {"solver": "lbfgs", "max_iter": 1000, "random_state": 8888}

with mlflow.start_run():
    # Log hyperparameters
    mlflow.log_params(params)

    # Train the model
    lr = LogisticRegression(**params)
    lr.fit(X_train, y_train)

    # Make predictions and calculate metrics
    y_pred = lr.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    mlflow.log_metric("accuracy", accuracy)

    # Infer the model signature from training inputs and predictions
    predictions = lr.predict(X_train)
    signature = infer_signature(X_train, predictions)

    # Log the model (MLflow 3.x uses 'name'; 'artifact_path' is deprecated)
    mlflow.sklearn.log_model(
        sk_model=lr,
        name="logistic_regression_model",
        signature=signature,
        registered_model_name="IrisLogisticRegression",
    )

print(f"Logged model with accuracy: {accuracy}")
print("View runs in the MLflow UI: run 'mlflow ui' in your terminal and navigate to http://127.0.0.1:5000")