Search Results
3 results for “model evaluation”
Infrastructure
AI Benchmarking
The systematic evaluation of AI systems using standardised datasets, tasks, and metrics to measure capability, compare models, and track progress across research and deployment contexts.
6 min read · Updated June 2026
Companies & Tools
Labelbox
Labelbox is an American AI data labeling and model evaluation platform that enables organisations to annotate training datasets, manage labeling workflows, and curate high-quality data for machine learning development.
5 min read · Updated June 2026
Companies & Tools
Weights and Biases
Weights and Biases (W&B) is a machine learning developer platform for experiment tracking, model versioning, dataset management, and collaborative model evaluation used by over 200,000 ML practitioners worldwide.
5 min read · Updated May 2026