Scale AI

An American data labelling, evaluation, and AI infrastructure company that supplies training data and evaluation services to leading AI laboratories, autonomous vehicle developers, and government agencies.

5 min read · Last updated June 2026 · Companies & Tools

Scale AI is an American data infrastructure company that provides labelled training data, human evaluation, reinforcement learning from human feedback (RLHF), and AI deployment services to AI laboratories, autonomous vehicle developers, public sector clients, and large enterprises. Founded in 2016 by Alexandr Wang and Lucy Guo, the company is headquartered in San Francisco. Scale AI grew from labelling sensor data for self-driving cars into one of the most visible providers of training-data infrastructure for frontier large language models, and it plays an active role in the broader US AI policy and national security ecosystem.

History

Scale AI was founded in 2016 to provide annotation services for autonomous vehicle development, where labelled camera, LiDAR, and radar data underpins the training of perception models. Early customers included Cruise, Waymo, Lyft, and various original equipment manufacturers. As foundation models came to dominate AI development, Scale expanded into instruction-tuning datasets, preference data for RLHF, and human evaluations for large language models, becoming a key supplier to many leading AI laboratories. The company has raised several large funding rounds and is reported to be among the more highly valued private AI infrastructure firms in the United States. In 2024 and 2025, Scale also expanded into defence-oriented AI offerings, including its Donovan and Thunderforge programmes for the US Department of Defense and allied governments.

Services and platform

Scale's core offering is a labelling and annotation platform that combines a global crowd of contributors, expert specialists for technical domains, machine learning-assisted pre-labelling, quality control, and customer-facing dashboards. The platform supports labelling across images, video, LiDAR point clouds, audio, natural language, and code. For foundation model customers, Scale provides instruction data generation, preference and ranking data for RLHF, red-teaming, model evaluations, and specialised expert data in domains such as mathematics, programming, medicine, and law.

Adjacent products include Scale Nucleus for dataset curation and analysis, Scale Studio for labelling workflow management, and Scale GenAI Platform for enterprise deployment of generative AI. The defence-oriented Donovan platform integrates LLMs with classified and unclassified intelligence data sources for military analytics and decision support, while Thunderforge supports campaign planning for US Indo-Pacific Command and US European Command.

Role in frontier model development

Scale is widely cited as a major supplier of human feedback data for the post-training of frontier large language models, including the instruction tuning and RLHF stages of model development. Public reporting and academic literature have identified Scale among the providers of preference data used to align successive generations of leading proprietary and open models. The company also publishes leaderboards and evaluations for foundation models, including reasoning, agentic, and safety benchmarks, and contributes to the academic discussion of evaluation methodology.
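As a rough illustration of what preference data for RLHF looks like and how it is typically consumed, the sketch below pairs a hypothetical comparison record with the pairwise reward-model loss described by Ouyang et al. (2022). The record layout and the names `PreferencePair` and `pairwise_loss` are assumptions made for this example; they are not Scale's data schema or any laboratory's actual training code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One human comparison (hypothetical layout): the annotator
    preferred `chosen` over `rejected` for the given prompt."""
    prompt: str
    chosen: str
    rejected: str


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise reward-model loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    response well above the rejected one, and large when it disagrees.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    pair = PreferencePair(
        prompt="Explain LiDAR in one sentence.",
        chosen="LiDAR measures distance by timing reflected laser pulses.",
        rejected="LiDAR is a kind of camera.",
    )
    # Illustrative scores only; a real reward model would compute these.
    print(pair.prompt, pairwise_loss(reward_chosen=1.5, reward_rejected=-0.5))
```

In this framing, each labelled comparison contributes one loss term, which is why the volume and consistency of human preference judgements matter for the resulting reward model.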

Controversies and policy

Scale's role in data labelling has attracted scrutiny over contributor pay, working conditions, and the broader question of how human labour is structured in the AI supply chain, particularly for workers in developing countries. The company has also been a visible participant in US AI policy discussions, with co-founder Alexandr Wang testifying before Congress on AI competitiveness and national security. Scale's expanding defence portfolio and its participation in US export-control and AI-safety initiatives have placed it at the intersection of commercial AI and national security policy.

Competitive position

Scale operates alongside other annotation and evaluation providers including Labelbox, Snorkel AI, Surge AI, and Appen, as well as in-house data operations at major AI laboratories. It is distinguished by the size of its operations, its breadth of modality coverage, its depth of expert-domain data, and its prominent position with frontier model labs and the US government. Increasing demand for high-quality reasoning, mathematical, and coding data in post-training has driven growth in expert-data services across the industry, an area in which Scale has invested heavily.

References

  1. Scale AI, Inc. (2026). Company Overview and Product Documentation. scale.com.
  2. Wang, A. (2024). Testimony before the US Senate on AI Competitiveness. Congressional record.
  3. US Department of Defense. (2025). Thunderforge and Donovan Programme Announcements.
  4. Ouyang, L., et al. (2022). Training Language Models to Follow Instructions with Human Feedback. OpenAI / NeurIPS.