THE NEXT GENERATION OF RACK-SCALE AI

The NVIDIA Vera Rubin NVL72 is the successor to the GB300 NVL72, combining 72 Rubin Ultra GPUs and 36 Vera CPUs into a single, unified rack-scale compute domain. Connected by next-generation NVLink, the entire system operates as one massive AI accelerator designed for the scale and complexity of frontier AI workloads. NVIDIA has stated the platform is designed to deliver significant gains in performance per watt and lower cost per token compared to previous generations, particularly for large-scale and mixture-of-experts models.

NVIDIA Rubin Architecture

Platform Specs

NVIDIA VERA RUBIN NVL72 RACK-SCALE SYSTEM

The Vera Rubin NVL72 is a fully integrated, liquid-cooled rack designed to operate as a single unified AI system. It combines 72 Rubin Ultra GPUs with 36 Vera CPUs, connected by next-generation NVLink, and is built for the compute demands of frontier AI training, advanced reasoning, and hyperscale inference workloads.

Compute

72 RUBIN ULTRA GPUs

72 next-generation NVIDIA Rubin Ultra GPUs in a single rack, operating as one unified compute domain for frontier-scale AI workloads.

Interconnect

NEXT-GEN NVLINK

Next-generation NVLink interconnect providing high-bandwidth, low-latency connectivity across all 72 GPUs, enabling the full rack to operate as a single compute node.

Processors

36 VERA CPUs

36 NVIDIA Vera CPUs purpose-built for tight integration with the GPU fabric, designed to minimize bottlenecks in AI and data-intensive workloads.

Reliability

LIQUID-COOLED DESIGN

Full-rack liquid cooling designed for the thermal demands of 72 GPUs at sustained, high-density operation. Built for continuous AI factory workloads.

Use Cases

Target Workloads

Training

Next-Generation Model Training

Train frontier-scale foundation models that push beyond what current-generation platforms can efficiently support. The unified 72-GPU domain and next-generation interconnect are designed for the training runs that will define the next era of AI.

Agentic AI

AI Reasoning & Agentic Workloads

Run advanced reasoning systems, agentic AI workflows, and long-context inference at production scale. Designed for the compute and memory demands of complex, multi-step AI reasoning with high throughput and low latency.

AI Factory

Hyperscale AI Factory Infrastructure

Build dedicated AI compute environments designed to run continuously at extreme utilization. The rack-scale architecture and liquid cooling are built for the sustained, dense operation that next-generation AI factory workloads will demand.
