Experiment Tracking Internals
When you write mlflow.log_metric("accuracy", 0.95), here is what happens beneath the surface:
Metric Logging Pipeline
- Client: TrackingServiceClient sends a REST request to the tracking server at MLFLOW_TRACKING_URI.
- Store: On the server, SqlAlchemyStore writes the value to the metrics table (full history) and updates the latest_metrics table (fast queries).
- Batching: MLflow supports both synchronous and asynchronous logging. Async mode (enabled via MLFLOW_ENABLE_ASYNC_LOGGING=true) buffers log calls and sends them in batches, reducing network overhead for high-frequency logging.
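The batching idea can be sketched with a toy buffer. This is illustrative only: the MetricBuffer class and its names are hypothetical, not MLflow internals, but the trade (immediate enqueue, deferred batched sends) is the same.

```python
import queue

class MetricBuffer:
    """Toy async metric buffer: log calls enqueue instantly, flush sends
    them in batches. A sketch of the idea behind
    MLFLOW_ENABLE_ASYNC_LOGGING, not MLflow's actual implementation."""

    def __init__(self, send_batch, batch_size=100):
        self._queue = queue.Queue()
        self._send_batch = send_batch  # e.g. one REST call per batch
        self._batch_size = batch_size

    def log_metric(self, key, value, step=0):
        # Returns immediately; the network cost is paid at flush time.
        self._queue.put((key, value, step))

    def flush(self):
        batch = []
        while not self._queue.empty():
            batch.append(self._queue.get())
            if len(batch) == self._batch_size:
                self._send_batch(batch)
                batch = []
        if batch:
            self._send_batch(batch)

sent = []
buf = MetricBuffer(send_batch=sent.append, batch_size=2)
for step in range(5):
    buf.log_metric("accuracy", 0.90 + step / 100, step=step)
buf.flush()
print(len(sent))  # 5 metrics in batches of 2 -> 3 sends instead of 5
```

With a batch size of 100 (and a background flush thread, omitted here), a training loop logging per-step metrics makes two orders of magnitude fewer network calls.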
Autologging: Zero-Code Tracking
When you call mlflow.autolog(), MLflow monkey-patches training functions of supported ML libraries to automatically capture parameters, metrics, and models.
- Registration: Each library has an autolog function decorated with @autologging_integration, storing its configuration globally.
- Patching: For scikit-learn, MLflow patches fit(), fit_transform(), and fit_predict(). For PyTorch, the training loop. For LangChain, invoke() and __call__().
- Interception: When the patched method runs, the wrapper creates a run, logs parameters (via get_params()), metrics (via score()), and saves the model.
- Safety: The safe_patch mechanism ensures autologging failures never break user code: exceptions are caught, warnings are logged, and the original function proceeds.
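The safety guarantee above can be sketched in a few lines. This is a simplified illustration of the safe_patch idea, not MLflow's actual wrapper; the Model class and broken_hook are made up for the demo.

```python
import functools
import warnings

def safe_patch(cls, method_name, before=None, after=None):
    """Wrap cls.method_name so tracking hooks can never break user code:
    hook exceptions become warnings, the original method always runs."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def patched(self, *args, **kwargs):
        try:
            if before:
                before(self, *args, **kwargs)  # e.g. create a run, log params
        except Exception as e:
            warnings.warn(f"autologging hook failed: {e}")
        result = original(self, *args, **kwargs)  # user code always proceeds
        try:
            if after:
                after(self, result)  # e.g. log metrics, save the model
        except Exception as e:
            warnings.warn(f"autologging hook failed: {e}")
        return result

    setattr(cls, method_name, patched)

class Model:
    def fit(self, X):
        return "fitted"

def broken_hook(*args, **kwargs):
    raise RuntimeError("tracking backend down")

safe_patch(Model, "fit", before=broken_hook)
print(Model().fit([1, 2, 3]))  # "fitted": the hook failure is only a warning
```

The key design choice is asymmetry: user code runs outside the try blocks, hook code inside them, so a tracking outage degrades to missing metrics rather than a failed training job.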
Autologging also coexists with manual tracking: you can add your own log_metric() calls for custom metrics inside the same run.
Model Packaging (MLmodel Format)
When you call mlflow.sklearn.log_model(model, "model"), MLflow creates a standardized package:
- Serialization: The framework-specific module serializes the model (pickle for sklearn, torch.save() for PyTorch).
- MLmodel file: A YAML file describing flavors, signature, dependencies, and serving config.
- Environment capture: conda.yaml and requirements.txt with exact package versions.
- Upload: The entire package is uploaded to the artifact store.
# MLmodel file example
artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.sklearn
    python_version: 3.10.12
    env: conda.yaml
  sklearn:
    pickled_model: model.pkl
    sklearn_version: 1.3.0
signature:
  inputs: '[{"name": "feature1", "type": "double"}]'
  outputs: '[{"type": "long"}]'
The key insight is the flavor system. The pyfunc flavor is the universal interface: any MLflow model can be loaded as a pyfunc and called with model.predict(data), regardless of the underlying framework. This is what enables framework-agnostic deployment.
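The dispatch behind that universal interface can be sketched as a loader registry keyed by the MLmodel metadata. Everything here is illustrative (the wrapper classes, LOADERS, and load_pyfunc are stand-ins, not MLflow internals); the point is that callers only ever see predict().

```python
# Each flavor supplies a loader returning an object with predict();
# the caller never imports the underlying framework.

class SklearnWrapper:
    def predict(self, data):
        return [x * 2 for x in data]   # stand-in for a real sklearn model

class TorchWrapper:
    def predict(self, data):
        return [x + 1 for x in data]   # stand-in for a real torch model

LOADERS = {
    "mlflow.sklearn": lambda path: SklearnWrapper(),
    "mlflow.pytorch": lambda path: TorchWrapper(),
}

def load_pyfunc(mlmodel: dict, path: str):
    """Dispatch on the python_function flavor's loader_module and return
    a uniform predict() facade."""
    loader_module = mlmodel["flavors"]["python_function"]["loader_module"]
    return LOADERS[loader_module](path)

mlmodel = {"flavors": {"python_function": {"loader_module": "mlflow.sklearn"}}}
model = load_pyfunc(mlmodel, "runs:/abc/model")
print(model.predict([1, 2, 3]))  # [2, 4, 6]; no sklearn import at the call site
```

In real MLflow the loader_module string is imported dynamically, but the effect is the same: serving infrastructure can be written once against the pyfunc interface.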
Model Registry Workflow
The registry manages production models through a lifecycle state machine:
- Registration: mlflow.register_model() creates a RegisteredModel and a versioned ModelVersion linked to the run.
- Aliasing: Versions get aliases like "champion" (current best) and "challenger" (candidate). Aliases are mutable, atomic pointers.
- Stage Transitions: None → Staging → Production → Archived, with optional human review gates.
- Deployment: Load by alias: mlflow.pyfunc.load_model("models:/fraud-detector@champion"). When the alias moves, the next load picks up the new version.
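The "mutable, atomic pointer" behavior of aliases can be sketched with a toy in-memory registry. This is a conceptual sketch, not MLflow's SQL-backed store; the class and URIs are hypothetical.

```python
import threading

class ModelRegistry:
    """Toy registry: aliases are mutable pointers to immutable versions."""

    def __init__(self):
        self._versions = {}   # version number -> model artifact URI
        self._aliases = {}    # alias -> version number
        self._lock = threading.Lock()

    def register(self, version, uri):
        with self._lock:
            self._versions[version] = uri

    def set_alias(self, alias, version):
        # One assignment under a lock: readers see the old version or the
        # new one, never a partially updated state.
        with self._lock:
            self._aliases[alias] = version

    def load(self, alias):
        with self._lock:
            return self._versions[self._aliases[alias]]

reg = ModelRegistry()
reg.register(1, "s3://models/fraud/v1")
reg.register(2, "s3://models/fraud/v2")
reg.set_alias("champion", 1)
reg.set_alias("champion", 2)   # promotion is one atomic pointer move
print(reg.load("champion"))    # s3://models/fraud/v2
```

Promoting a challenger is therefore a metadata update, not a redeploy: serving code that resolves the alias at load time picks up version 2 on its next load.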
Deployment Paths
Local serving: mlflow models serve -m models:/my-model/1 -p 5001 starts a REST API with a /invocations endpoint.
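A common request shape for that endpoint is dataframe_split (columns plus rows of a DataFrame). A sketch of building the payload; actually POSTing it assumes a server started as above on port 5001.

```python
import json

# Scoring request for /invocations in the "dataframe_split" format.
payload = {
    "dataframe_split": {
        "columns": ["feature1"],
        "data": [[0.5], [1.2]],
    }
}
body = json.dumps(payload)
print(body)
# POST `body` with Content-Type: application/json to
# http://127.0.0.1:5001/invocations (requires `mlflow models serve` running).
```

The column names should match the logged signature (feature1 in the MLmodel example above), which lets the server validate inputs before they reach the model.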
Docker: mlflow models build-docker packages the model into a Kubernetes-ready container image.
Batch inference: Load the model in a script or Spark job and call model.predict() on a dataset.
Cloud: Integrations with AWS SageMaker, Azure ML, and Databricks for managed serving with auto-scaling.