Model Cards
Model cards are structured documentation sheets accompanying machine learning models that disclose intended uses, performance characteristics, training data, limitations, and ethical considerations.
Model cards are concise, structured documents that accompany machine learning models and provide standardised information about a model's development, capabilities, intended applications, evaluated performance, and known limitations. The concept was introduced by Margaret Mitchell, Timnit Gebru, and colleagues at Google Research and formalised in the 2019 paper Model Cards for Model Reporting. Since then, model cards have been adopted across academia, industry, and government as a baseline transparency practice in responsible AI development, and they form a core component of major model-hosting platforms including Hugging Face, TensorFlow Hub, and the NVIDIA NGC catalogue.
Origins and Motivation
The model card concept emerged from the growing recognition that deploying a machine learning model without adequate documentation can cause harm. This is a particular concern when a model trained on data from one population is applied to a different population, or when high-stakes decisions in areas such as criminal justice, healthcare, and employment are automated without disclosure of the underlying model's limitations. Mitchell and colleagues drew analogies with pharmaceutical package inserts and nutrition labels: documents that concisely convey the properties, indications, contraindications, and side effects of a product in a standardised format that enables informed use.
The original 2019 paper proposed that model cards should accompany any model released for downstream use, covering the model's intended use cases, the populations and contexts for which it was evaluated, quantitative performance metrics disaggregated by subgroup, and known ethical considerations.
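To make "metrics disaggregated by subgroup" concrete, the following is a minimal sketch of how such a breakdown might be computed for a classifier. The function name accuracy_by_subgroup, the toy labels, and the group names "A" and "B" are illustrative assumptions, not taken from the paper, which leaves the choice of metrics and subgroups to the model developer.

```python
from collections import defaultdict


def accuracy_by_subgroup(y_true, y_pred, groups):
    """Return (overall accuracy, {subgroup label: accuracy})."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    per_group = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_group


# Hypothetical toy data: eight examples split evenly across two subgroups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall, per_group = accuracy_by_subgroup(y_true, y_pred, groups)
print(f"overall: {overall:.3f}")
for group, acc in sorted(per_group.items()):
    print(f"subgroup {group}: {acc:.3f}")
```

On this toy data the overall accuracy is 0.625, while subgroup A scores 0.75 and subgroup B scores 0.5. A single aggregate figure would hide that gap, which is the kind of disparity a disaggregated report is meant to surface.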
Standard Structure
Hugging Face, which hosts one of the largest public repositories of open-weight models, has refined the model card into a practical template that is widely used across the ML community. A complete model card typically includes:
- Model description: the model's architecture, training framework, parameter count, and the organisation or individuals who developed it.
- Intended uses: the tasks and contexts the model was designed for, together with explicitly out-of-scope uses that the model has not been evaluated for or should not be applied to.
- Training data: a description of the datasets used, their provenance, size, and any preprocessing applied. Where training data cannot be fully disclosed, an explanation of what can be shared and why should be provided.
- Evaluation results: quantitative metrics on relevant benchmarks, disaggregated where possible by demographic subgroup, domain, or language to surface performance disparities.
- Ethical considerations: potential harms, misuse scenarios, and how the developers have attempted to mitigate them.
- Caveats and recommendations: guidance on conditions under which the model may perform differently from the evaluation results, and recommendations for safe deployment.
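The six sections above can be sketched as a simple data structure. The example below is only an illustration using the Python standard library: the class name ModelCardSketch, its field names, and the example model are hypothetical, and this is not the Hugging Face template or any library's API.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelCardSketch:
    """Hypothetical container mirroring the six sections listed above."""

    model_description: str = ""
    intended_uses: str = ""
    training_data: str = ""
    evaluation_results: str = ""
    ethical_considerations: str = ""
    caveats_and_recommendations: str = ""

    def missing_sections(self):
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self):
        """Render the card as plain text, one titled block per section."""
        parts = []
        for f in fields(self):
            title = f.name.replace("_", " ").capitalize()
            body = getattr(self, f.name).strip() or "[not provided]"
            parts.append(f"{title}\n{body}")
        return "\n\n".join(parts)


# Hypothetical example: only two of the six sections have been filled in.
card = ModelCardSketch(
    model_description="Small text classifier (illustrative example).",
    intended_uses="Sentiment analysis of English product reviews.",
)
print(card.missing_sections())
print(card.render())
```

A completeness check such as missing_sections could run before a model is published, so that omitted sections are flagged explicitly instead of being left out silently.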
Regulatory Relevance
Model cards have transitioned from voluntary best practice to a matter of regulatory expectation in several jurisdictions. The EU AI Act (2024) requires providers of high-risk AI systems to maintain technical documentation that covers model capabilities, limitations, and evaluation methodology — requirements closely aligned with model card content. The US Executive Order on Safe, Secure, and Trustworthy AI (2023) directed federal agencies to develop guidance on AI transparency, referencing model documentation as a key mechanism. In the United Kingdom, the AI Safety Institute has published evaluation methodology that incorporates model card disclosures as part of frontier model assessments.
Variants and Extensions
Several complementary documentation frameworks have emerged alongside model cards. Datasheets for Datasets (Gebru et al., 2018) applies similar structured disclosure to training datasets. System cards extend the concept to AI products composed of multiple models and non-model components. FactSheets from IBM Research emphasise supplier-facing disclosure for enterprise AI procurement. The Foundation Model Transparency Index, maintained by Stanford's Center for Research on Foundation Models (CRFM), scores foundation model developers on a range of disclosure dimensions that overlap with model card content.
References
- Mitchell, M., Wu, S., Zaldivar, A., et al. (2019). Model Cards for Model Reporting. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency.
- Gebru, T., Morgenstern, J., Vecchione, B., et al. (2018). Datasheets for Datasets. arXiv:1803.09010.
- Hugging Face. (2024). Model Card Guidebook. huggingface.co/docs/hub/model-card-guidebook.
- Liang, P., et al. (2023). Holistic Evaluation of Language Models. Transactions on Machine Learning Research.