NVIDIA Blackwell

NVIDIA Blackwell is a GPU architecture introduced in 2024 for AI training and inference, featuring dual-die GPUs, FP4 precision, and the GB200 Grace Blackwell Superchip and NVL72 rack-scale systems for trillion-parameter models.

5 min read | Last updated June 2026 | Infrastructure

NVIDIA Blackwell is a graphics processing unit (GPU) architecture developed by NVIDIA and announced in 2024, designed to train and serve the largest artificial intelligence models. Named after the mathematician David Blackwell, the architecture succeeds NVIDIA's Hopper generation and targets the demands of generative AI and large language models, where model sizes reaching trillions of parameters require enormous compute, memory, and interconnect bandwidth. Blackwell-based systems became the dominant high-end AI hardware deployed in data centres through 2025 and 2026.

Architecture Highlights

The flagship Blackwell GPU, the B200, uses a dual-die design housing approximately 208 billion transistors, well beyond what the single-die designs of previous generations could hold (the Hopper H100 has about 80 billion). The two dies are connected by a high-bandwidth link so they operate effectively as one GPU. Each B200 carries HBM3e memory, on the order of 180 to 192 gigabytes depending on configuration, with around 8 terabytes per second of memory bandwidth, addressing the memory capacity and speed bottlenecks that constrain large-model training and inference.
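To make the capacity figures concrete, the following minimal sketch estimates how many GPUs are needed just to hold a model's weights at different precisions, using the 192 GB and 8 TB/s figures quoted above. The function names are illustrative, and the estimate ignores activations, optimizer state, and KV cache, so real deployments need considerably more memory.

```python
import math


def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def gpus_needed(footprint_gb: float, hbm_gb_per_gpu: float = 192.0) -> int:
    """Smallest GPU count whose combined HBM can hold the given footprint."""
    return math.ceil(footprint_gb / hbm_gb_per_gpu)


def full_sweep_ms(hbm_gb: float = 192.0, bandwidth_tb_s: float = 8.0) -> float:
    """Time to read all of HBM once: GB divided by TB/s gives milliseconds."""
    return hbm_gb / bandwidth_tb_s


if __name__ == "__main__":
    one_trillion = 1e12
    for bits in (16, 8, 4):
        size_gb = weight_footprint_gb(one_trillion, bits)
        print(f"{bits:>2}-bit weights: {size_gb:,.0f} GB -> {gpus_needed(size_gb)} GPUs")
    print(f"One full pass over HBM: {full_sweep_ms():.0f} ms")
```

Under these assumptions, a one-trillion-parameter model needs 2,000 GB for 16-bit weights (11 GPUs) but only 500 GB at 4 bits (3 GPUs), which is one reason low-precision formats matter as much for memory as for compute.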

A defining feature of Blackwell is support for lower-precision numerical formats, notably FP4 (four-bit floating point) in its fifth-generation Tensor Cores. Reduced precision allows far higher throughput and lower energy per operation, which is especially valuable for inference. NVIDIA's launch figures, quoted for GB200 NVL72 systems against the same number of Hopper H100 GPUs, claim roughly four times the training throughput and up to 30 times the real-time inference throughput on large transformer models, though exact gains depend on the task and precision used.
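The trade-off behind four-bit floating point can be illustrated with a small software sketch. It assumes the common E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) and a single shared scale per block of values; the article does not specify which encoding or scaling scheme Blackwell's Tensor Cores use, so treat the details as assumptions. The function names are hypothetical and this is not NVIDIA's implementation.

```python
import math

# Magnitudes representable in E2M1 (assumed layout: 1 sign, 2 exponent, 1 mantissa bit).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block(values):
    """Quantize a non-empty list of floats to E2M1 using one shared scale.

    Returns (scale, quantized), where each quantized entry is a signed E2M1 value.
    Ties resolve to the smaller magnitude.
    """
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude in the block onto the largest E2M1 value.
    scale = max_abs / E2M1_MAGNITUDES[-1] if max_abs > 0 else 1.0
    quantized = []
    for v in values:
        target = abs(v) / scale
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        quantized.append(math.copysign(nearest, v))
    return scale, quantized


def dequantize_block(scale, quantized):
    """Recover approximate original values from the scale and E2M1 values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    block = [0.1, -0.7, 1.2, 3.0]
    scale, q = quantize_block(block)
    print("scale:", scale)
    print("quantized:", q)
    print("restored:", dequantize_block(scale, q))
```

With only eight magnitudes available, small values in a block can round to zero (here 0.1 becomes 0.0), which is why accuracy at this precision depends heavily on how values are grouped and scaled.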

The Grace Blackwell Superchip

Blackwell is frequently deployed as part of the GB200 Grace Blackwell Superchip, which combines two B200 GPUs with an NVIDIA Grace CPU over a high-bandwidth NVLink chip-to-chip interconnect, reported at around 900 gigabytes per second. Pairing CPU and GPU tightly in one module reduces data-movement bottlenecks between processors and gives the GPUs coherent access to the CPU's memory in addition to their own HBM, an arrangement well suited to models and working data too large for GPU memory alone.

Rack-Scale Systems

To train and serve trillion-parameter models, individual GPUs must be linked into much larger systems. The GB200 NVL72 connects 36 Grace CPUs and 72 Blackwell GPUs in a single liquid-cooled rack, joined by fifth-generation NVLink so that the 72 GPUs function as one very large accelerator with a unified high-speed memory domain. NVIDIA describes this configuration as delivering exaflop-scale AI performance at low precision and tens of terabytes of fast memory, with large gains in real-time inference for the biggest language models. Liquid cooling is essential because racks this dense produce more heat than air cooling can remove efficiently. The NVL72 serves as a building block for larger DGX SuperPOD supercomputers.
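The rack composition follows directly from the superchip layout described earlier: 36 superchips, each with one Grace CPU and two GPUs, yield the 36 CPUs and 72 GPUs of an NVL72. The sketch below derives rack-level GPU memory totals from the per-GPU figures quoted in this article. The function and parameter names are illustrative, and the totals cover GPU HBM only; the "tens of terabytes" of fast memory NVIDIA cites presumably also counts CPU-attached memory, which is not modelled here.

```python
def rack_totals(
    superchips: int = 36,
    cpus_per_superchip: int = 1,
    gpus_per_superchip: int = 2,
    hbm_gb_per_gpu: float = 192.0,
    hbm_tb_s_per_gpu: float = 8.0,
) -> dict:
    """Aggregate CPU/GPU counts and GPU memory figures for one rack."""
    gpus = superchips * gpus_per_superchip
    return {
        "cpus": superchips * cpus_per_superchip,
        "gpus": gpus,
        "hbm_tb": gpus * hbm_gb_per_gpu / 1000,          # 1 TB = 1000 GB
        "hbm_bandwidth_tb_s": gpus * hbm_tb_s_per_gpu,   # sum of per-GPU bandwidth
    }


if __name__ == "__main__":
    # Evaluate both ends of the 180 to 192 GB per-GPU range quoted above.
    for hbm_gb in (180.0, 192.0):
        totals = rack_totals(hbm_gb_per_gpu=hbm_gb)
        print(
            f"{hbm_gb:.0f} GB/GPU: {totals['cpus']} CPUs, {totals['gpus']} GPUs, "
            f"{totals['hbm_tb']:.2f} TB HBM, "
            f"{totals['hbm_bandwidth_tb_s']:.0f} TB/s aggregate bandwidth"
        )
```

On these inputs the rack holds roughly 13 to 14 TB of HBM with 576 TB/s of aggregate memory bandwidth, which gives a sense of why the 72 GPUs can be treated as a single large accelerator for trillion-parameter models.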

NVIDIA subsequently introduced Blackwell Ultra products (such as the B300 generation) with increased memory and performance, continuing the architecture's roadmap.

Significance

Blackwell hardware sits at the centre of the global build-out of AI infrastructure. Its combination of large memory, low-precision throughput, tight CPU-GPU integration, and rack-scale interconnect made it the platform of choice for hyperscalers and AI labs training frontier models. Demand for Blackwell systems has shaped supply chains, data centre design (driving adoption of liquid cooling), and national AI strategies that depend on access to advanced accelerators.
