Cerebras Systems

An American AI hardware company that builds wafer-scale processors, using an entire silicon wafer as a single chip to accelerate deep learning training and to deliver very high-speed large language model inference.

4 min read | Last updated July 2026 | Companies & Tools

Cerebras Systems is an American semiconductor and artificial intelligence company headquartered in Sunnyvale, California, known for building the largest computer chips in the industry. Rather than dividing a silicon wafer into many small chips in the conventional way, Cerebras manufactures a single processor that occupies almost an entire wafer. This wafer-scale approach is designed to accelerate the training of deep learning models and, more recently, to deliver very high-speed inference for large language models.

Wafer-scale engine

The company's flagship product is the Wafer Scale Engine, or WSE. A standard chip fabrication process produces many separate dies from one circular silicon wafer, which are then cut apart and packaged individually. Cerebras instead keeps the wafer intact and treats it as one enormous processor, connecting its many cores with high-bandwidth on-chip links. The advantage is that data can move between cores without leaving the chip, avoiding the slower and more power-hungry communication that limits clusters of separate processors.

The third-generation WSE-3, built on the TSMC 5-nanometre process, integrates about 4 trillion transistors, roughly 900,000 AI-optimised cores, and 44 gigabytes of on-chip memory. It is rated at around 125 petaflops of peak performance and offers extremely high on-chip memory bandwidth. Keeping large amounts of memory on the same piece of silicon as the compute cores is central to the design, because memory bandwidth is often the true bottleneck in AI workloads.
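The bandwidth argument can be made concrete with a rough back-of-the-envelope sketch. It assumes a simplified model in which generating each output token requires reading every model weight once, so the token rate for a single stream cannot exceed memory bandwidth divided by model size. The function names, the 8-billion-parameter model, and both bandwidth figures are hypothetical placeholders chosen for illustration, not vendor specifications or measurements.

```python
def weight_bytes(param_count: float, bytes_per_param: float) -> float:
    """Total bytes needed to store the model weights."""
    return param_count * bytes_per_param


def decode_ceiling_tokens_per_s(bandwidth_bytes_per_s: float, model_bytes: float) -> float:
    """Upper bound on single-stream decode rate, assuming each token
    reads all weights once and nothing else limits throughput."""
    return bandwidth_bytes_per_s / model_bytes


# Hypothetical model: 8 billion parameters stored at 2 bytes each (16-bit).
model = weight_bytes(8e9, 2)

# Placeholder bandwidths (assumptions, not measured or published values):
# an off-chip memory system in the terabytes-per-second range, and an
# on-chip memory system three orders of magnitude faster.
scenarios = {
    "off-chip memory (assumed 3 TB/s)": 3e12,
    "on-chip memory (assumed 3 PB/s)": 3e15,
}

for label, bandwidth in scenarios.items():
    ceiling = decode_ceiling_tokens_per_s(bandwidth, model)
    print(f"{label}: at most {ceiling:,.1f} tokens/s per stream")
```

Real systems fall well short of these ceilings because compute, interconnect, batching and software overheads also matter, but the ratio between the two cases shows why moving weights closer to the cores raises the achievable token rate.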

From training to inference

Cerebras originally positioned wafer-scale hardware for training, where the ability to hold a large model on one chip simplifies programming compared with distributing it across many GPUs. More recently the company has emphasised inference, marketing what it describes as some of the fastest large language model serving available. By holding model weights in fast on-chip memory, Cerebras systems can generate output tokens at rates well above typical GPU-based systems. The company has reported serving large open-weight models at thousands of tokens per second per user. In its own published benchmarks, these rates exceed those of contemporary flagship GPU systems running the same models.

Each Wafer Scale Engine is packaged into a computer system called the CS-3, and multiple CS-3 units can be combined for larger workloads. Cerebras also offers access to its hardware through a cloud inference service, so customers can use the speed advantage without purchasing systems outright.
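As a rough illustration of why larger workloads call for several units, the sketch below estimates the minimum number of wafers needed to keep a model's weights entirely in on-chip memory, using the 44-gigabyte figure quoted for the WSE-3. It treats a gigabyte as 10^9 bytes and ignores activations, key-value caches and any software overhead, all of which are simplifying assumptions. The function name and the example model sizes are hypothetical, and actual deployments may partition models differently.

```python
import math

# 44 GB of on-chip memory per WSE-3, taken as decimal gigabytes (assumption).
WSE3_ON_CHIP_BYTES = 44e9


def wafers_needed(param_count: float, bytes_per_param: float,
                  on_chip_bytes: float = WSE3_ON_CHIP_BYTES) -> int:
    """Minimum wafers whose combined on-chip memory holds the weights alone."""
    return math.ceil(param_count * bytes_per_param / on_chip_bytes)


# Hypothetical model sizes, each stored at 2 bytes per parameter (16-bit).
for params in (8e9, 70e9, 405e9):
    count = wafers_needed(params, 2)
    print(f"{params / 1e9:.0f}B parameters: at least {count} wafer(s)")
```

Under these assumptions a 70-billion-parameter model at 16-bit precision needs 140 GB for weights, which exceeds the memory of three wafers and therefore requires at least four.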

Position in the market

Cerebras is one of several companies challenging the dominance of conventional GPU clusters in AI computing, competing with specialised inference providers and with the mainstream accelerator market. Its distinctive bet is that radical integration at the wafer level, rather than networking many smaller chips, is the more efficient path for certain AI workloads. The main trade-offs are manufacturing complexity, the cost of a wafer-scale device, and the need for software that maps models onto an unusual architecture. Despite these challenges, the company has attracted significant customers in research, supercomputing and enterprise inference.

Malaysian Context — Hardware Diversity and the Semiconductor Value Chain

Cerebras is relevant to Malaysia on two fronts. First, as a provider of high-speed inference, it is part of a diversifying market for AI compute that Malaysian enterprises and cloud operators can draw on. Domestic facilities such as the YTL AI Cloud are built on Nvidia Grace Blackwell systems, but awareness of alternative architectures still matters for organisations optimising the cost and latency of AI services. These include banks such as Maybank and CIMB and telecommunications firms such as Maxis and Telekom Malaysia.

Second, and more strategically, Cerebras depends on advanced semiconductor manufacturing and packaging, an industry in which Malaysia holds a globally important position. Penang and Kulim host a dense cluster of semiconductor assembly, testing and packaging operations, and Malaysia is a major hub for outsourced semiconductor assembly and test services. The National Semiconductor Strategy, announced by the government in May 2024, aims to move Malaysian firms further up the value chain into advanced packaging and design, the very capabilities that wafer-scale and chiplet-based AI processors require.

For Malaysian talent and companies in the semiconductor corridor around Penang, the rise of specialised AI silicon such as the Wafer Scale Engine represents both a market opportunity in advanced packaging and a reason to invest in the engineering skills that agencies including MIDA, MDEC and HRD Corp are working to develop.


Tags: ai hardware semiconductors inference accelerators

Type	AI hardware company
Founded	2015
Headquarters	Sunnyvale, California
Flagship product	Wafer Scale Engine (WSE)
Latest chip	WSE-3, built on TSMC 5nm
Related	GPU cluster, Inference