The NVIDIA Vera Rubin NVL72 is the successor to the GB300 NVL72, combining 72 Rubin GPUs and 36 Vera CPUs into a single, unified rack-scale compute domain. Connected by next-generation NVLink, the entire system operates as one massive AI accelerator designed for the scale and complexity of frontier AI workloads. NVIDIA has stated the platform is designed to deliver significant gains in performance per watt and lower cost per token compared to previous generations, particularly for large-scale and mixture-of-experts models.
Arc Compute is preparing to offer Vera Rubin NVL72 solutions from leading OEM partners. System configurations and availability will be announced as Rubin hardware becomes commercially available.
The Vera Rubin NVL72 is a fully integrated, liquid-cooled rack designed to operate as a single unified AI system. It combines 72 Rubin GPUs with 36 Vera CPUs, connected by next-generation NVLink, and is built for the compute demands of frontier AI training, advanced reasoning, and hyperscale inference workloads.

72 next-generation NVIDIA Rubin GPUs in a single rack, operating as one unified compute domain for frontier-scale AI workloads.
Next-generation NVLink interconnect providing high-bandwidth, low-latency connectivity across all 72 GPUs, enabling the full rack to operate as a single compute node.
36 NVIDIA Vera CPUs purpose-built for tight integration with the GPU fabric, designed to minimize bottlenecks in AI and data-intensive workloads.
Full-rack liquid cooling designed for the thermal demands of 72 GPUs at sustained, high-density operation. Built for continuous AI factory workloads.
Train frontier-scale foundation models that push beyond what current-generation platforms can efficiently support. The unified 72-GPU domain and next-generation interconnect are designed for the training runs that will define the next era of AI.
Run advanced reasoning systems, agentic AI workflows, and long-context inference at production scale. Designed for the compute and memory demands of complex, multi-step AI reasoning with high throughput and low latency.
Build dedicated AI compute environments designed to run continuously at extreme utilization. The rack-scale architecture and liquid cooling are built for the sustained, dense operation that next-generation AI factory workloads will demand.