Experiment Tracking Internals

When you write mlflow.log_metric("accuracy", 0.95), here is what happens beneath the surface:

Metric Logging Pipeline

  1. The fluent API checks thread-local context for an active run. If none exists, it auto-creates one.
  2. The metric is validated (key: string, value: numeric, step: integer) and packaged into a protobuf message.
  3. The TrackingServiceClient sends a REST request to the tracking server at MLFLOW_TRACKING_URI.
  4. The server deserializes the protobuf and routes it to the SqlAlchemyStore.
  5. The store INSERTs into both the metrics table (full history) and the latest_metrics table (fast queries).
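
Step 2's client-side validation can be sketched in a few lines. This is illustrative only; the function name and error messages are not MLflow's actual internals:

```python
import numbers

# Sketch of the validation in step 2: reject malformed metrics before
# they are packaged and sent over the wire. (Illustrative, not MLflow's code.)
def validate_metric(key, value, step):
    if not isinstance(key, str) or not key:
        raise ValueError(f"metric key must be a non-empty string, got {key!r}")
    if isinstance(value, bool) or not isinstance(value, numbers.Real):
        raise ValueError(f"metric value must be numeric, got {value!r}")
    if not isinstance(step, int):
        raise ValueError(f"metric step must be an integer, got {step!r}")
    return {"key": key, "value": float(value), "step": step}
```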

MLflow supports both synchronous and asynchronous logging. Async mode (enabled via MLFLOW_ENABLE_ASYNC_LOGGING=true) buffers log calls and sends them in batches, reducing network overhead for high-frequency logging.
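
The batching idea behind async mode can be sketched as follows. This is a minimal illustration, not MLflow's real async machinery, which runs flushes on background threads:

```python
# Sketch of async-style batched logging: calls accumulate in a buffer and
# are sent one batch at a time instead of one request per metric.
class BatchedLogger:
    def __init__(self, send_batch, batch_size=100):
        self.send_batch = send_batch   # e.g. one REST call per batch
        self.batch_size = batch_size
        self.buffer = []

    def log_metric(self, key, value, step=0):
        self.buffer.append({"key": key, "value": value, "step": step})
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []
```

With `batch_size=100`, a training loop that logs 1,000 metrics produces 10 network requests instead of 1,000.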

Autologging: Zero-Code Tracking

When you call mlflow.autolog(), MLflow monkey-patches training functions of supported ML libraries to automatically capture parameters, metrics, and models.

  1. Registration: Each library has an autolog function decorated with @autologging_integration, storing its configuration globally.
  2. Patching: For scikit-learn, MLflow patches fit(), fit_transform(), and fit_predict(). For PyTorch, the training loop. For LangChain, invoke() and __call__().
  3. Interception: When the patched method runs, the wrapper creates a run, logs parameters (via get_params()), metrics (via score()), and saves the model.
  4. Safety: The safe_patch mechanism ensures autologging failures never break user code — exceptions are caught, warnings logged, and the original function proceeds.
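
The safe-patch idea in step 4 can be sketched like this. The helper below is illustrative; MLflow's real safe_patch mechanism is considerably more elaborate:

```python
import functools
import warnings

# Sketch of safe patching: wrap a training method so that instrumentation
# failures are downgraded to warnings and user code always runs.
def safe_patch(cls, method_name, before=None, after=None):
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def patched(self, *args, **kwargs):
        try:
            if before:
                before(self)               # e.g. start a run, log params
        except Exception as e:
            warnings.warn(f"autolog before-hook failed: {e}")
        result = original(self, *args, **kwargs)   # user code always runs
        try:
            if after:
                after(self)                # e.g. log metrics, save model
        except Exception as e:
            warnings.warn(f"autolog after-hook failed: {e}")
        return result

    setattr(cls, method_name, patched)
```

Even if a hook raises (say, the tracking server is unreachable), the patched fit() still returns its normal result.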

💡 Tip: Autologging and manual logging coexist in the same run. Use autologging for framework parameters and add manual log_metric() calls for custom metrics.

Model Packaging (MLmodel Format)

When you call mlflow.sklearn.log_model(model, "model"), MLflow creates a standardized package:

  1. Serialization: The framework-specific module serializes the model (pickle for sklearn, torch.save() for PyTorch).
  2. MLmodel file: A YAML file describing flavors, signature, dependencies, and serving config.
  3. Environment capture: conda.yaml and requirements.txt with exact package versions.
  4. Upload: The entire package is uploaded to the artifact store.
# MLmodel file example
artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.sklearn
    python_version: 3.10.12
    env: conda.yaml
  sklearn:
    pickled_model: model.pkl
    sklearn_version: 1.3.0
signature:
  inputs: '[{"name": "feature1", "type": "double"}]'
  outputs: '[{"type": "long"}]'

The key insight is the flavor system. The pyfunc flavor is the universal interface: any MLflow model can be loaded as a pyfunc and called with model.predict(data), regardless of the underlying framework. This is what enables framework-agnostic deployment.
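
The flavor system amounts to an adapter pattern, which can be sketched in a few lines. The class and loader names below are illustrative stand-ins, not MLflow's internal classes:

```python
# Sketch of the pyfunc idea: each flavor adapts its framework to one
# universal predict(data) interface, so deployment code never needs to
# know which framework produced the model.
class PyFuncModel:
    def __init__(self, predict_fn):
        self._predict_fn = predict_fn

    def predict(self, data):
        return self._predict_fn(data)

def load_sklearn_as_pyfunc(sk_model):
    # sklearn models already expose predict()
    return PyFuncModel(sk_model.predict)

def load_torch_as_pyfunc(torch_module):
    # a real loader would also convert inputs to tensors and back
    return PyFuncModel(lambda data: torch_module(data))
```

A serving layer written against `PyFuncModel.predict` works unchanged whether the artifact came from scikit-learn, PyTorch, or any other flavor.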

Model Registry Workflow

The registry manages production models through a lifecycle state machine:

  1. Registration: mlflow.register_model() creates a RegisteredModel and a versioned ModelVersion linked to the run.
  2. Aliasing: Versions get aliases like “champion” (current best) and “challenger” (candidate). Aliases are mutable, atomic pointers.
  3. Stage Transitions: None → Staging → Production → Archived, with optional human review gates.
  4. Deployment: Load by alias: mlflow.pyfunc.load_model("models:/fraud-detector@champion"). When the alias moves, the next load picks up the new version.
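
The "aliases are mutable, atomic pointers" claim in step 2 can be sketched as a tiny registry. This is a toy model; the real registry persists aliases in the backend database:

```python
import threading

# Sketch of aliases as atomic pointers: an alias maps to exactly one
# version, and retargeting it is a single atomic write under a lock.
class ModelRegistry:
    def __init__(self):
        self._aliases = {}          # (model_name, alias) -> version
        self._lock = threading.Lock()

    def set_alias(self, model, alias, version):
        with self._lock:
            self._aliases[(model, alias)] = version

    def get_version(self, model, alias):
        with self._lock:
            return self._aliases[(model, alias)]
```

Promoting a challenger is then a single call like `set_alias("fraud-detector", "champion", 7)`; the next load of `models:/fraud-detector@champion` resolves to version 7.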

⚠️ Note: The model registry requires a database-backed backend store (PostgreSQL, MySQL, etc.). File-based stores do not support model registry operations.

Deployment Paths

Local serving: mlflow models serve -m models:/my-model/1 -p 5001 starts a REST API with a /invocations endpoint.
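
A request to that endpoint can be built with the standard library. The `dataframe_split` payload shape is the common one for tabular input in MLflow 2.x; the host, port, and column name below are assumptions matching the example command and MLmodel signature above:

```python
import json
# from urllib.request import Request, urlopen   # uncomment to actually send

# Build an /invocations request body in the "dataframe_split" format.
payload = {
    "dataframe_split": {
        "columns": ["feature1"],
        "data": [[0.5], [1.2]],
    }
}
body = json.dumps(payload).encode()

# req = Request("http://localhost:5001/invocations", data=body,
#               headers={"Content-Type": "application/json"})
# predictions = json.loads(urlopen(req).read())
```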

Docker: mlflow models build-docker packages the model into a Kubernetes-ready container image.

Batch inference: Load the model in a script or Spark job and call model.predict() on a dataset.
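
A batch inference script typically scores the dataset in fixed-size chunks to bound memory. A minimal sketch, assuming a loaded pyfunc-style model exposing predict():

```python
# Chunked batch inference: score a large dataset chunk by chunk so the
# whole dataset never has to fit through predict() at once.
def batch_predict(model, rows, chunk_size=1000):
    predictions = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        predictions.extend(model.predict(chunk))
    return predictions
```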

Cloud: Integrations with AWS SageMaker, Azure ML, and Databricks for managed serving with auto-scaling.