Model Registry

A model registry is a centralised system that catalogues, versions, and governs trained machine learning models throughout their lifecycle, supporting reproducibility, deployment, and compliance.

5 min read | Last updated June 2026 | Infrastructure

A model registry is a centralised system of record for trained machine learning models within an organisation. It stores model artefacts (weights, code, configuration), records associated metadata (training data version, hyperparameters, metrics, lineage), assigns version numbers and aliases, manages lifecycle stages (development, staging, production, archived), and integrates with deployment pipelines to push approved models to serving infrastructure. The registry is one of the foundational components of any mature MLOps platform, alongside the feature store, training pipeline, and serving layer.
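The responsibilities listed above can be illustrated with a toy in-memory registry. This is a minimal sketch, not any product's API: the class names `ModelRegistry`, `ModelVersion`, and `Stage`, the example URIs, and the rule that a model name has at most one production version are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    DEVELOPMENT = "development"
    STAGING = "staging"
    PRODUCTION = "production"
    ARCHIVED = "archived"


@dataclass
class ModelVersion:
    name: str
    version: int
    artefact_uri: str   # where the weights, code, and configuration live
    metadata: dict      # hyperparameters, metrics, dataset version, ...
    stage: Stage = Stage.DEVELOPMENT


class ModelRegistry:
    """Toy in-memory registry; a real one would persist to a database."""

    def __init__(self):
        self._versions = {}  # model name -> list of ModelVersion

    def register(self, name, artefact_uri, metadata):
        versions = self._versions.setdefault(name, [])
        # Version numbers are assigned sequentially, starting at 1.
        record = ModelVersion(name, len(versions) + 1, artefact_uri, dict(metadata))
        versions.append(record)
        return record

    def get(self, name, version):
        return self._versions[name][version - 1]

    def set_stage(self, name, version, stage):
        target = self.get(name, version)
        if stage is Stage.PRODUCTION:
            # Assumption: promoting a version archives the previous production one.
            for record in self._versions[name]:
                if record.stage is Stage.PRODUCTION:
                    record.stage = Stage.ARCHIVED
        target.stage = stage
        return target

    def latest(self, name, stage):
        matches = [r for r in self._versions[name] if r.stage is stage]
        return matches[-1] if matches else None


registry = ModelRegistry()
registry.register("churn", "s3://example-bucket/churn/1", {"auc": 0.81})
registry.register("churn", "s3://example-bucket/churn/2", {"auc": 0.84})
registry.set_stage("churn", 1, Stage.PRODUCTION)
registry.set_stage("churn", 2, Stage.PRODUCTION)  # version 1 becomes ARCHIVED
```

A deployment pipeline would then call something like `registry.latest("churn", Stage.PRODUCTION)` to find the artefact to push to serving.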

Why model registries exist

Without a registry, organisations typically discover three problems as they scale machine learning. First, reproducibility fails: nobody can reliably identify which model is running where, how it was trained, or which data it learned from. Second, governance fails: it becomes impossible to demonstrate to auditors, regulators, or internal risk committees that a particular model met approval criteria before going live. Third, deployment fails: hand-offs between data scientists and ML engineers are brittle, ad-hoc, and slow. A registry addresses all three by giving the model artefact a stable identity, a metadata record, and a governed promotion path.

Core capabilities

| Capability | What it provides |
|---|---|
| Artefact storage | Binary weights, container images, framework files |
| Versioning | Immutable version numbers; semantic version aliases |
| Metadata | Training run ID, dataset version, framework, metrics |
| Lineage | Linkage to data sources, features, code commits |
| Stage management | Dev, staging, production, archived; promotion workflows |
| Access control | Per-team and per-environment permissions |
| Tagging and search | Annotate models with custom attributes for discovery |
| Approval workflows | Required sign-offs before promotion to production |
| Model cards | Documented intended use, limitations, risk class |
| APIs and webhooks | Programmatic deployment and CI/CD integration |
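Three of these capabilities (immutable versions, lineage metadata, and movable aliases) can be sketched together. The example below is illustrative only: `VersionRecord`, `VersionStore`, the alias name `champion`, and the choice of a SHA-256 digest to detect artefact tampering are assumptions, not features of a particular registry.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a registered version cannot be modified
class VersionRecord:
    version: int
    sha256: str           # digest of the artefact bytes
    run_id: str           # training run that produced the artefact
    dataset_version: str  # lineage: which data snapshot was used
    code_commit: str      # lineage: which code revision was used


class VersionStore:
    """Holds the versions and aliases of a single named model."""

    def __init__(self):
        self._records = []
        self._aliases = {}  # alias -> version number

    def add(self, artefact, run_id, dataset_version, code_commit):
        record = VersionRecord(
            version=len(self._records) + 1,
            sha256=hashlib.sha256(artefact).hexdigest(),
            run_id=run_id,
            dataset_version=dataset_version,
            code_commit=code_commit,
        )
        self._records.append(record)
        return record

    def set_alias(self, alias, version):
        if not 1 <= version <= len(self._records):
            raise KeyError(f"unknown version {version}")
        # The alias is a movable pointer; the record it points to never changes.
        self._aliases[alias] = version

    def resolve(self, alias):
        return self._records[self._aliases[alias] - 1]

    def verify(self, version, artefact):
        """Check that the given bytes match what was originally registered."""
        expected = self._records[version - 1].sha256
        return hashlib.sha256(artefact).hexdigest() == expected


store = VersionStore()
store.add(b"weights-v1", "run-001", "data-2024-01", "abc123")
store.add(b"weights-v2", "run-002", "data-2024-02", "def456")
store.set_alias("champion", 1)
store.set_alias("champion", 2)  # re-pointing the alias leaves version 1 untouched
```

Separating the immutable record from the mutable alias is what lets consumers refer to a stable name while every past version remains reproducible.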

Common implementations

One of the most widely adopted open-source registries is the MLflow Model Registry, which ships as part of the open-source MLflow project that Databricks originated and is also available as a managed service. Major cloud platforms offer integrated registries, including Amazon SageMaker Model Registry, Google Vertex AI Model Registry, and Azure Machine Learning registries, while Databricks provides Models in Unity Catalog. Specialist vendors such as Weights & Biases, Neptune.ai, Comet ML, and ClearML provide registries inside their experiment tracking platforms. Many organisations also build internal registries over object storage and metadata databases when their requirements diverge from off-the-shelf products.

Relationship to other MLOps components

The registry sits between two adjacent systems. Experiment tracking captures the noisy, exploratory training history; only the runs deemed worth keeping are promoted into the registry as named, versioned models. Model serving consumes from the registry: a serving system pulls the artefact tagged "production" for a given model name and exposes it as an API, a batch job, or an edge deployment. A well-instrumented registry also integrates with monitoring so that drift, data quality, and performance metrics are linked back to a specific model version, and with feature stores so that input expectations are documented and enforced.
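The serving and monitoring hand-offs described above might look like the following sketch. The lookup tables, the model name `fraud-detector`, the file paths, and the `Monitor` class are hypothetical stand-ins for calls to a real registry and monitoring system.

```python
from collections import defaultdict

# Hypothetical registry state: aliases per model name, and artefact locations.
ALIASES = {"fraud-detector": {"production": 3, "staging": 4}}
ARTEFACTS = {
    ("fraud-detector", 3): "models/fraud-detector/3/model.bin",
    ("fraud-detector", 4): "models/fraud-detector/4/model.bin",
}


def resolve_for_serving(name, alias="production"):
    """Return the (version, artefact location) that the alias points to."""
    version = ALIASES[name][alias]
    return version, ARTEFACTS[(name, version)]


class Monitor:
    """Stores metric observations keyed by the exact model version."""

    def __init__(self):
        self._observations = defaultdict(list)

    def record(self, name, version, metric, value):
        self._observations[(name, version, metric)].append(value)

    def mean(self, name, version, metric):
        values = self._observations.get((name, version, metric), [])
        return sum(values) / len(values) if values else None


# The serving layer resolves the alias once and tags every metric with the version.
version, location = resolve_for_serving("fraud-detector")
monitor = Monitor()
monitor.record("fraud-detector", version, "drift_score", 0.5)
monitor.record("fraud-detector", version, "drift_score", 0.25)
```

Because each observation carries the version number, a drift alert can be traced back to the specific registered model and its training lineage rather than to whatever happens to be deployed at the time.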

Governance and compliance

In regulated industries such as banking, insurance, healthcare, and critical infrastructure, the registry is often the system of record that auditors examine. Required documentation typically includes intended use, training data provenance, performance metrics across demographic slices, fairness tests, robustness tests, monitoring plans, and the responsible owners. Standards and guidance such as the EU AI Act, the NIST AI Risk Management Framework, the ISO/IEC 42001 AI management system standard, and sectoral regulator guidance increasingly assume that organisations can produce this information on demand, which in practice means producing it from the registry.
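A registry can enforce these documentation requirements mechanically before a model is promoted. The sketch below assumes a model card stored as a dictionary; the field names mirror the list above, while the sign-off roles `model_risk` and `business_owner` and the function `review_promotion` are hypothetical and would differ between organisations.

```python
# Documentation an auditor would expect, taken from the list in the text.
REQUIRED_FIELDS = (
    "intended_use",
    "training_data_provenance",
    "slice_metrics",
    "fairness_tests",
    "robustness_tests",
    "monitoring_plan",
    "owners",
)

# Hypothetical approval roles; real organisations define their own.
REQUIRED_SIGN_OFFS = frozenset({"model_risk", "business_owner"})


def review_promotion(model_card, sign_offs):
    """Return (approved, reasons) for a request to promote to production."""
    reasons = []
    for field_name in REQUIRED_FIELDS:
        # A missing key and an empty value are both treated as undocumented.
        if not model_card.get(field_name):
            reasons.append(f"missing documentation: {field_name}")
    for role in sorted(REQUIRED_SIGN_OFFS - set(sign_offs)):
        reasons.append(f"missing sign-off: {role}")
    return (not reasons, reasons)


card = {
    "intended_use": "Retail credit pre-screening",
    "training_data_provenance": "loans snapshot, dataset version 12",
    "slice_metrics": {"age<30": 0.79, "age>=30": 0.82},
    "fairness_tests": "",  # not yet completed
    "robustness_tests": "passed",
    "monitoring_plan": "weekly drift report",
    "owners": ["credit-risk-team"],
}
approved, reasons = review_promotion(card, sign_offs=["business_owner"])
# approved is False; reasons names the missing fairness tests and the missing sign-off
```

Returning the reasons alongside the decision means the rejection itself can be stored in the registry as part of the audit trail.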

References

  1. Databricks. MLflow Model Registry Documentation.
  2. National Institute of Standards and Technology. (2023). AI Risk Management Framework (AI RMF 1.0).
  3. ISO/IEC 42001:2023. Information Technology — Artificial Intelligence — Management System.
  4. Bank Negara Malaysia. (2024). Discussion Paper on the Use of AI by Financial Institutions.
  5. Sculley, D. et al. (2015). Hidden Technical Debt in Machine Learning Systems. NeurIPS.