What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Who maintains AIWiki Malaysia?

AIWiki Malaysia is maintained by AITG Sdn Bhd (Registration: 202601016521 (1678618-W)), an AI company headquartered in George Town, Penang, Malaysia. The editorial team continuously updates and expands the knowledge base.

Scale AI

An American data labelling, evaluation, and AI infrastructure company that supplies training data and evaluation services to leading AI laboratories, autonomous vehicle developers, and government agencies.

5 min read | Last updated June 2026 | Companies & Tools

Scale AI is an American data infrastructure company that provides labelled training data, human evaluation, reinforcement learning from human feedback (RLHF), and AI deployment services to AI laboratories, autonomous vehicle developers, public sector clients, and large enterprises. Founded in 2016 by Alexandr Wang and Lucy Guo, the company is headquartered in San Francisco. Scale AI grew from labelling sensor data for self-driving cars into one of the most visible providers of training-data infrastructure for frontier large language models, and it plays an active role in the broader US AI policy and national security ecosystem.

History

Scale AI was founded in 2016 to provide annotation services for autonomous vehicle development, where labelled camera, LiDAR, and radar data underpin perception model training. Early customers included Cruise, Waymo, Lyft, and various original equipment manufacturers. As the foundation model era unfolded, Scale expanded into instruction-tuning datasets, preference data for RLHF, and human evaluations for large language models, becoming a key supplier to most leading AI laboratories. The company has raised several large funding rounds and has been reported to be one of the more highly valued private AI infrastructure firms in the United States. In 2024 and 2025, Scale also expanded its defence-oriented AI offerings, including the Donovan platform and the Thunderforge programme, for the US Department of Defense and allied governments.

Services and platform

Scale's core offering is a labelling and annotation platform that combines a global crowd of contributors, expert specialists for technical domains, machine learning-assisted pre-labelling, quality control, and customer-facing dashboards. The platform supports labelling across images, video, LiDAR point clouds, audio, natural language, and code. For foundation model customers, Scale provides instruction data generation, preference and ranking data for RLHF, red-teaming, model evaluations, and specialised expert data in domains such as mathematics, programming, medicine, and law.

Adjacent products include Scale Nucleus for dataset curation and analysis, Scale Studio for labelling workflow management, and Scale GenAI Platform for enterprise deployment of generative AI. The defence-oriented Donovan platform integrates LLMs with classified and unclassified intelligence data sources for military analytics and decision support, while Thunderforge supports campaign planning for US Indo-Pacific Command and US European Command.

Role in frontier model development

Scale is widely cited as a major supplier of human feedback data for the post-training of frontier large language models, including the instruction-tuning and RLHF stages of model development. Public reporting and academic literature have identified Scale among the providers of preference data used to align successive generations of leading proprietary and open models; for example, the InstructGPT paper (Ouyang et al., 2022) reports that its labelling contractors were hired through Upwork and Scale AI. The company also publishes leaderboards and evaluations for foundation models, including reasoning, agentic, and safety benchmarks, and contributes to the academic discussion of evaluation methodology.

Controversies and policy

Scale's role in data labelling has attracted scrutiny over contributor pay, working conditions, and the broader question of how human labour is structured in the AI supply chain, particularly for workers in developing countries. The company has also been a visible participant in US AI policy discussions, with its founder testifying before Congress on AI competitiveness and national security. Scale's expanding defence portfolio and its participation in US export-control and AI-safety initiatives have positioned it at the intersection of commercial AI and national security policy.

Competitive position

Scale operates alongside other annotation and evaluation providers including Labelbox, Snorkel AI, Surge AI, and Appen, as well as in-house data operations at major AI laboratories. It is distinguished by the size of its operations, its breadth of modality coverage, its depth of expert-domain data, and its prominent positioning with frontier model labs and the US government. Rising demand for high-quality reasoning, mathematical, and coding data in post-training has driven growth in expert-data services across the industry, an area in which Scale has invested heavily.

Malaysian Context — Data Labelling and the AI Supply Chain

While Scale AI has historically focused its sales operations on the United States and Western markets, the broader data labelling industry has significant operational footprints in South-East Asia, including the Philippines, Indonesia, Vietnam, and Malaysia. Malaysian business process outsourcing firms, particularly those operating from Cyberjaya, TechCity Kuala Lumpur, Penang, and Iskandar Malaysia, participate in the global annotation and evaluation supply chain either as direct contractors or through partnerships with international platforms. Workforce regulations under the Employment Act 1955 and the Minimum Wages Order govern formal employment relationships, while platform-based contractor arrangements occupy a more complex legal status that is the subject of ongoing labour policy debate.

The Malaysia AI Roadmap, MyDigital Blueprint, and the National AI Office articulate ambitions for Malaysia to participate in the higher-value layers of the AI supply chain, not only labelling but also model development and deployment. MDEC's Digital Hub programme and HRD Corp claimable training subsidies have supported upskilling of Malaysian workers toward roles in data science, prompt engineering, model evaluation, and quality assurance, including expert-data roles that command higher rates than basic annotation.

For Malaysian enterprises and government agencies that build AI systems — including Maybank, CIMB, Petronas, Tenaga Nasional Berhad, Telekom Malaysia, and various ministries — Scale AI is one of several possible suppliers of annotated training data and model evaluations, alongside in-house teams and regional providers. The Personal Data Protection Act 2010 (PDPA) imposes constraints on cross-border processing of personal data, which can influence sourcing decisions when annotation tasks involve sensitive Malaysian customer information. Bank Negara Malaysia's Risk Management in Technology framework and the Securities Commission Malaysia's guidelines for capital market institutions also influence how third-party data and evaluation services are governed.

References

Scale AI, Inc. (2026). Company Overview and Product Documentation. scale.com.
Wang, A. (2024). Testimony before the US Senate on AI Competitiveness. Congressional record.
US Department of Defense. (2025). Thunderforge and Donovan Programme Announcements.
Ouyang, L., et al. (2022). Training Language Models to Follow Instructions with Human Feedback. OpenAI / NeurIPS.

Tags: scale-ai, data-labelling, evaluation, rlhf, training-data

Type	Data labelling and AI infrastructure company
Founded	2016
Founders	Alexandr Wang, Lucy Guo
Headquarters	San Francisco, California, USA
Key services	Training data, RLHF, evaluations, defence AI
Related	RLHF, Data Labelling, OpenAI, Anthropic
