Blackwell Pod Brings Exascale to a Rack
Author: Anand Joshi
Nvidia’s next-generation pod, the DGX GB200 NVL72, announced at the 2024 GTC Summit, targets large AI model training. The pod houses eighteen 1U form-factor servers, each with four Blackwell GPUs and two Grace CPUs. The rack delivers 1.44 exaFLOPS using the lesser-known FP4 format and 720 petaFLOPS at FP8.
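A quick back-of-the-envelope check ties the rack-level figures to the per-server counts above. The per-GPU numbers below are derived from the article's rack totals, not quoted by Nvidia:

```python
# Sanity-check of the GB200 NVL72 rack figures cited above.
SERVERS_PER_RACK = 18
GPUS_PER_SERVER = 4   # Blackwell GPUs per 1U server
CPUS_PER_SERVER = 2   # Grace CPUs per 1U server

gpus = SERVERS_PER_RACK * GPUS_PER_SERVER   # 72 GPUs per rack
cpus = SERVERS_PER_RACK * CPUS_PER_SERVER   # 36 CPUs per rack

RACK_FP4_PFLOPS = 1440.0   # 1.44 exaFLOPS, expressed in petaFLOPS
RACK_FP8_PFLOPS = 720.0

# Implied per-GPU throughput (derived): FP4 is exactly twice FP8,
# as expected when halving the precision of the format.
fp4_per_gpu = RACK_FP4_PFLOPS / gpus   # 20 PFLOPS per GPU
fp8_per_gpu = RACK_FP8_PFLOPS / gpus   # 10 PFLOPS per GPU

print(gpus, cpus, fp4_per_gpu, fp8_per_gpu)
```

The 2:1 ratio between the FP4 and FP8 figures is consistent: each halving of precision doubles the peak tensor throughput.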
The servers connect via fifth-generation NVLink, which supports up to 1.8 terabytes per second (TBps) of bandwidth per GPU. The eighteen servers connect to NVLink switch trays within the rack, and the NVLink switches in turn connect to an InfiniBand top-of-rack (ToR) switch that links to the next rack. The NVLink connections use copper cabling instead of more expensive and power-hungry optical links.
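The aggregate intra-rack bandwidth follows directly from the per-GPU figure. A minimal sketch, assuming the quoted 1.8 TBps is the per-GPU NVLink bandwidth applied uniformly across all 72 GPUs:

```python
# Aggregate NVLink bandwidth across the rack (derived estimate).
GPUS = 72
NVLINK_TBPS_PER_GPU = 1.8  # "up to" figure from the article

# Total NVLink bandwidth available inside the rack, if every GPU
# sustains its full per-GPU rate simultaneously (an upper bound).
total_tbps = GPUS * NVLINK_TBPS_PER_GPU   # ~129.6 TBps

print(round(total_tbps, 1))
```

This upper bound explains the choice of copper: keeping all of this traffic inside one rack over short copper runs avoids the cost and power draw of optical transceivers.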
The pod houses 36 CPUs and 72 GPUs and consumes a whopping 120 kW, roughly five times the previous generation's draw. However, the company has not released performance numbers for FP32, the most popular data type for training, or for INT8, the most popular data format for inference. The system is currently being tested, and while the company did not provide a date for general availability, we expect it in early 2025.
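The power figure can be put in perspective with two derived ratios, per-GPU draw and FP4 energy efficiency. Both values below are implied by the article's numbers rather than published by Nvidia:

```python
# Power density implied by the article's figures (derived, not official).
RACK_KW = 120.0
GPUS = 72
PREV_GEN_FACTOR = 5  # article: roughly 5x the previous generation

kw_per_gpu = RACK_KW / GPUS              # ~1.67 kW per GPU (incl. share
                                         # of CPUs, switches, cooling)
prev_gen_kw = RACK_KW / PREV_GEN_FACTOR  # implies ~24 kW previously

# FP4 efficiency: 1.44 exaFLOPS / 120 kW, in teraFLOPS per watt.
fp4_tflops_per_watt = 1.44e18 / (RACK_KW * 1000) / 1e12  # ~12 TFLOPS/W

print(round(kw_per_gpu, 2), prev_gen_kw, fp4_tflops_per_watt)
```

At well over a kilowatt per GPU slot, a draw of this magnitude is one reason the design relies on liquid cooling and keeps the densest interconnect within a single rack.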