As artificial intelligence and machine learning continue to gain popularity, companies across many industries are investing in the latest GPU servers to enhance their data science and AI capabilities. With NVIDIA leading the GPU market and offering multiple form factors, it can be challenging to determine which NVIDIA GPU server is suitable for your company. In this blog post, I will compare the PCIe and SXM5 form factors for NVIDIA H100 GPUs, the highest-performing GPUs currently available, and contrast performance and costs to help you make an informed decision.
NVIDIA H100 GPUs are available in two form factors: PCIe and SXM5. PCIe GPUs install into standard PCIe slots on a motherboard, while SXM5 GPUs require a specialized form factor that is incompatible with standard PCIe slots. PCIe GPUs are easier to install since they don't require specialized hardware or connectors. They are also easier to replace or upgrade, since you can simply remove the card, unlike SXM5 modules, which are mounted directly onto the server's GPU baseboard. Thanks to NVSwitch, SXM5 GPUs offer higher performance than PCIe GPUs, making them ideal for data-intensive multi-GPU and multi-node workloads.
NVLink is a direct GPU-to-GPU interconnect that scales multi-GPU input/output (I/O) within the server and is available in both form factors. Exclusive to SXM5 systems, NVSwitch connects multiple NVLinks to provide all-to-all GPU communication at full NVLink speed within a single node. InfiniBand extends NVSwitch connectivity across nodes to create a seamless, high-bandwidth, multi-node GPU cluster, effectively forming a data-center-sized GPU capable of tackling even the largest AI jobs rapidly.
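To make the interconnect gap concrete, here is a back-of-the-envelope sketch in Python. The bandwidth figures are the published peak numbers (900 GB/s total for fourth-generation NVLink on an H100 SXM5, roughly 128 GB/s bidirectional for a PCIe Gen 5 x16 slot); real-world throughput will land below these peaks:

```python
# Back-of-the-envelope comparison of GPU-to-GPU interconnect bandwidth.
# Published peak figures (real-world throughput will be lower):
NVLINK4_GBPS = 900.0    # H100 SXM5: 18 NVLink 4 links, 900 GB/s total bidirectional
PCIE5_X16_GBPS = 128.0  # PCIe 5.0 x16: 64 GB/s each direction

def transfer_seconds(gigabytes: float, bandwidth_gbps: float) -> float:
    """Idealized time to move `gigabytes` at `bandwidth_gbps` (no protocol overhead)."""
    return gigabytes / bandwidth_gbps

# Time to move an 80 GB payload (e.g. a full GPU's worth of model state)
# between two GPUs over each interconnect:
payload_gb = 80.0
print(f"NVLink: {transfer_seconds(payload_gb, NVLINK4_GBPS):.3f} s")
print(f"PCIe:   {transfer_seconds(payload_gb, PCIE5_X16_GBPS):.3f} s")
print(f"Peak-bandwidth ratio: {NVLINK4_GBPS / PCIE5_X16_GBPS:.1f}x")
```

At these peak figures, NVLink offers roughly 7x the GPU-to-GPU bandwidth of a PCIe Gen 5 slot, which is the gap NVSwitch-equipped SXM5 systems exploit for collective operations like all-reduce.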
The NVIDIA H100 GPU outperforms its predecessor, the A100, by up to 10x for AI workloads. The SXM5 GPU raises the bar considerably by supporting 80 GB of fast HBM3 memory and delivering over 3 TB/sec of memory bandwidth, nearly a 2x increase over the A100, which launched just two years prior. The PCIe H100 provides 80 GB of fast HBM2e memory with over 2 TB/sec of memory bandwidth.
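For memory-bound workloads, peak memory bandwidth sets a hard floor on runtime, so the HBM3 vs. HBM2e difference translates directly into kernel time. A quick sketch, assuming the published peak figures of 3.35 TB/s (SXM5, HBM3) and 2 TB/s (PCIe, HBM2e); real kernels achieve only some fraction of peak:

```python
# Lower-bound time for a memory-bound kernel that must touch all 80 GB
# of HBM exactly once, at published peak bandwidth.
HBM3_SXM5_TBPS = 3.35   # H100 SXM5 (HBM3), published peak
HBM2E_PCIE_TBPS = 2.0   # H100 PCIe (HBM2e), published peak
MEMORY_GB = 80

for name, tbps in [("SXM5", HBM3_SXM5_TBPS), ("PCIe", HBM2E_PCIE_TBPS)]:
    ms = MEMORY_GB / (tbps * 1000) * 1000  # GB / (GB/s) -> s -> ms
    print(f"{name}: {ms:.1f} ms minimum per full-memory sweep")
```

The SXM5 part sweeps its full memory in roughly 24 ms versus 40 ms for the PCIe part, which compounds over the millions of passes a long training run makes.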
In terms of costs, the PCIe H100 GPU server is more affordable compared to the SXM5 form factor. Here’s a breakdown of the H100 servers that Arc Compute has available:
2 x H100 GPUs (4U) - $106,000
4 x H100 GPUs (4U) - $172,000
8 x H100 GPUs (4U) - $279,000
4 x H100 GPUs (5U) - $197,000
8 x H100 GPUs (4U) - $310,000
These prices are for servers featuring 2 x Intel® Xeon® Platinum 8458P Processors, 1,024-2,048 GB of system memory, 2 x 4 TB SSD, and 3-year hardware coverage. All servers are customizable to meet your specific use case.
Choosing the correct NVIDIA H100 GPU server for your company comes down to balancing performance against cost. The PCIe form factor is easier to install and upgrade, while the SXM5 form factor delivers higher performance, with NVSwitch and InfiniBand enabling optimized multi-node clusters. Either way, the H100 offers a major leap over its predecessors, making it an excellent choice for data-intensive workloads. Ultimately, the specific requirements of your workloads will determine which form factor is right for you.