MLflow
An open-source platform for managing the end-to-end machine learning lifecycle, including experiment tracking, model packaging, a model registry, and deployment.
MLflow is an open-source platform that manages the complete lifecycle of machine learning workflows, from experimentation to deployment and monitoring. It was originally developed at Databricks and released in 2018, and is now governed as a Linux Foundation AI & Data project. MLflow is framework-agnostic: it integrates with PyTorch, TensorFlow, scikit-learn, XGBoost, Hugging Face Transformers, LangChain, and most other widely used libraries. Its design goal is to provide a small set of orthogonal tools that data scientists and platform teams can adopt independently without committing to a single vendor stack.
Core components
MLflow is organised around four loosely coupled components that can be used together or in isolation.
MLflow Tracking records parameters, metrics, code versions, and arbitrary files produced by training runs. Run metadata (parameters, metrics, tags) is stored in a backend store (SQLite, PostgreSQL, MySQL, or a managed equivalent), while files are written to an artefact store (local filesystem, S3, GCS, Azure Blob, or HDFS). A web UI allows users to compare runs, plot metrics, and reproduce experiments.
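As a rough illustration of the tracking workflow described above, the following minimal sketch logs parameters, per-epoch metrics, and one file for a single run. The experiment name, the note file, and the helper names `summarise_history` and `log_training_run` are hypothetical; MLflow is imported lazily inside the function, so the pure helper can be used even where MLflow is not installed.

```python
import tempfile
from pathlib import Path


def summarise_history(losses):
    """Return the final and best (lowest) loss from a list of per-epoch losses."""
    return {"final_loss": losses[-1], "best_loss": min(losses)}


def log_training_run(params, losses, tracking_uri):
    """Log one toy training run and return its run ID (requires MLflow)."""
    import mlflow  # lazy import: only needed when a run is actually logged

    mlflow.set_tracking_uri(tracking_uri)      # e.g. "sqlite:///mlflow.db"
    mlflow.set_experiment("demo-experiment")   # hypothetical experiment name
    with mlflow.start_run() as run:
        mlflow.log_params(params)
        # One metric point per epoch, so the UI can plot a curve.
        for epoch, loss in enumerate(losses):
            mlflow.log_metric("loss", loss, step=epoch)
        for name, value in summarise_history(losses).items():
            mlflow.log_metric(name, value)
        # Arbitrary files are uploaded to the artefact store.
        with tempfile.TemporaryDirectory() as tmp:
            notes = Path(tmp) / "notes.txt"
            notes.write_text("toy run logged from a sketch\n")
            mlflow.log_artifact(str(notes))
        return run.info.run_id
```

Calling `log_training_run({"lr": 0.01, "epochs": 3}, [0.9, 0.5, 0.4], "sqlite:///mlflow.db")` would, under these assumptions, create a run that can then be inspected in the web UI.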
MLflow Projects define reproducible, environment-pinned packaging of ML code using a simple YAML descriptor. Projects can be executed locally, on Kubernetes, or on Databricks with a single command, and dependencies are reconstructed through conda, virtualenv, or Docker.
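To make the descriptor concrete, the sketch below writes a small `MLproject` file and builds the matching `mlflow run` command line. The project name, the `train.py` script, the `alpha` parameter, and the `python_env.yaml` file are illustrative assumptions, not part of any real project.

```python
from pathlib import Path

# A minimal MLproject descriptor (YAML). All names in it are hypothetical.
MLPROJECT = """\
name: demo_project
python_env: python_env.yaml
entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
    command: "python train.py --alpha {alpha}"
"""


def write_descriptor(project_dir):
    """Write the MLproject file into project_dir and return its path."""
    path = Path(project_dir) / "MLproject"
    path.write_text(MLPROJECT)
    return path


def run_command(project_dir, **params):
    """Build the `mlflow run` argument list for a project directory."""
    args = ["mlflow", "run", str(project_dir)]
    # Each parameter override is passed as -P name=value.
    for key, value in sorted(params.items()):
        args += ["-P", f"{key}={value}"]
    return args
```

The returned list can be handed to `subprocess.run` on a machine where the `mlflow` CLI is installed; `run_command(".", alpha=0.4)` corresponds to `mlflow run . -P alpha=0.4`.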
MLflow Models is a standard format for saving models together with their dependencies and input/output signatures. A saved model can be loaded by any downstream tool that understands one of its flavours (for example the generic python_function flavour) and deployed to Docker, Kubernetes, Azure ML, Amazon SageMaker, or a local REST server.
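For the local REST server case, a client request might look like the sketch below. It assumes a model is already being served locally (for example with `mlflow models serve -m <model-uri> -p 5000`) and that the model accepts tabular input; the host, port, and column names are placeholders. The helper only builds the request, so it can be inspected without a running server.

```python
import json
import urllib.request


def build_invocation_request(host, port, columns, rows):
    """Build a POST request for the scoring server's /invocations endpoint."""
    # "dataframe_split" is one of the JSON input formats the scoring server accepts.
    payload = {"dataframe_split": {"columns": columns, "data": rows}}
    return urllib.request.Request(
        url=f"http://{host}:{port}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a server running, `urllib.request.urlopen(build_invocation_request("127.0.0.1", 5000, ["x1", "x2"], [[1.0, 2.0]]))` would send the rows for scoring.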
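MLflow Model Registry provides versioning, stage transitions (Staging, Production, Archived), approval workflows, and webhook events for registered models. Fixed stages have been deprecated since MLflow 2.9 in favour of model version aliases and tags, although existing stage-based workflows are still widely encountered. The registry is the source of truth that platform teams typically integrate with their CI/CD pipelines.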
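A CI/CD job that promotes a model version might be sketched as follows. This example uses model version aliases, which recent MLflow releases support alongside the older stage mechanism; the model name `churn` and the alias `champion` are hypothetical. MLflow is imported lazily so that the URI helper works on its own.

```python
def model_uri(name, version=None, alias=None):
    """Build a models:/ URI that points at either a version or an alias."""
    if (version is None) == (alias is None):
        raise ValueError("pass exactly one of version or alias")
    if alias is not None:
        return f"models:/{name}@{alias}"
    return f"models:/{name}/{version}"


def promote(name, version, alias="champion"):
    """Point an alias at a registered model version (requires MLflow)."""
    from mlflow.tracking import MlflowClient  # lazy import

    client = MlflowClient()
    client.set_registered_model_alias(name, alias, str(version))
    # Downstream consumers can now load the model through the alias URI.
    return model_uri(name, alias=alias)
```

Serving code can then load `models:/churn@champion` without knowing which numeric version currently carries the alias.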
Generative AI features
MLflow 3, released in 2025, expanded the platform substantially to cover generative AI workloads. Capabilities in this area include OpenTelemetry-compatible trace capture for LLM applications, prompt versioning with full lineage, automated evaluation harnesses for chatbots and retrieval-augmented generation systems, and integration with agent frameworks such as LangChain, LlamaIndex, and the OpenAI Agents SDK. MLflow now treats prompts, datasets, and evaluation runs as first-class registered artefacts, mirroring the discipline previously applied only to trained models.
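Trace capture can be illustrated with a toy retrieval pipeline. The sketch below applies the `mlflow.trace` decorator through a small hypothetical wrapper, `traced`, which falls back to the plain function when MLflow (or a version with tracing) is unavailable. The documents and the keyword-overlap retrieval are stand-ins for a real retriever and LLM call.

```python
def traced(fn):
    """Wrap fn with mlflow.trace when available; otherwise return it unchanged."""
    try:
        import mlflow
        return mlflow.trace(fn)
    except (ImportError, AttributeError):
        return fn


# Toy corpus standing in for a real document store.
DOCS = {
    "tracking": "Tracking records parameters and metrics for each run",
    "registry": "The registry stores versioned models",
}


@traced
def retrieve(query, k=1):
    """Rank documents by the number of words they share with the query."""
    words = set(query.lower().split())
    ranked = sorted(
        DOCS.items(),
        key=lambda item: len(words & set(item[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


@traced
def answer(query):
    """Return the best-matching document with its source, in place of an LLM call."""
    doc_id = retrieve(query)[0]
    return f"{DOCS[doc_id]} (source: {doc_id})"
```

When MLflow tracing is active, each call to `answer` should produce a trace with a nested span for `retrieve`, which is the structure the tracing UI displays for agent and RAG applications.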
Comparison with alternatives
| Tool | Hosted option | Open source | Strengths |
|---|---|---|---|
| MLflow | Databricks, self-host | Yes (Apache 2.0) | Broad framework support, registry, LLM tracing |
| Weights & Biases | Hosted, on-prem | Partial | Polished UI, strong collaboration features |
| Neptune.ai | Hosted, on-prem | Client only | Granular metadata, large-scale experiment search |
| Comet ML | Hosted, on-prem | Client only | Production monitoring, model debugging |
MLflow is generally chosen when teams require an entirely self-hosted solution or want to avoid per-user licensing. The trade-off is a less opinionated UI and somewhat more setup work compared to fully managed services.
Deployment patterns
Most Malaysian and regional teams adopt MLflow in one of three patterns: as a managed service inside Databricks, as a self-hosted tracking server on a Kubernetes cluster behind a corporate VPN, or as a lightweight per-team installation backed by Postgres and S3-compatible object storage such as MinIO. Larger organisations often pair MLflow with Airflow, Prefect, or Argo Workflows for orchestration, and with Seldon, BentoML, or KServe for serving.
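The lightweight per-team pattern can be sketched as a launcher for the tracking server. The database URL, bucket name, MinIO endpoint, and credentials below are placeholders; the helpers only assemble the command line and environment, so nothing is started until the result is passed to `subprocess.run`.

```python
def server_command(db_url, bucket, host="0.0.0.0", port=5000):
    """Build the `mlflow server` argument list for a database backend and S3-style bucket."""
    return [
        "mlflow", "server",
        "--backend-store-uri", db_url,                # run metadata
        "--default-artifact-root", f"s3://{bucket}",  # artefact files
        "--host", host,
        "--port", str(port),
    ]


def minio_env(endpoint, access_key, secret_key):
    """Environment variables that direct MLflow's S3 client at a MinIO endpoint."""
    return {
        "MLFLOW_S3_ENDPOINT_URL": endpoint,
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
    }
```

A typical invocation would merge `minio_env(...)` into `os.environ` and run `server_command("postgresql://user:password@db:5432/mlflow", "mlflow-artifacts")`, with real credentials supplied from a secret store rather than hard-coded.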