Google Cloud's new Compute Engine A3 supercomputers are built for the most demanding artificial intelligence (AI) and machine learning (ML) models, combining NVIDIA H100 Tensor Core GPUs with Google's networking advancements.
Compared to the company's A2 VMs, Google claims up to 10x more network bandwidth with low latencies and improved stability. How did it achieve this? The new A3 supercomputers are the first to use Google's custom-designed 200 Gbps infrastructure processing units (IPUs), which let GPU-to-GPU data transfers bypass the CPU host and travel over a separate interface from other VM network and data traffic.
As with most cutting-edge supercomputing and AI systems, scalability is the point: the design allows tens of thousands of interconnected GPUs to operate with a "workload bandwidth that is indistinguishable from more expensive off-the-shelf non-blocking network fabrics."
Impressive stuff. Here's a breakdown of the features of the new A3 supercomputers from Google.
- 8 H100 GPUs utilizing NVIDIA's Hopper architecture, delivering 3x the compute throughput of the A100-based A2 VMs
- 3.6 TB/s bisection bandwidth between A3's 8 GPUs via NVIDIA NVSwitch and NVLink 4.0
- 4th Gen Intel Xeon Scalable processors
- 2 TB of host memory via 4800 MHz DDR5 DIMMs
- 10x greater networking bandwidth than A2 VMs, powered by Google's hardware-enabled IPUs, a specialized inter-server GPU communication stack, and NCCL optimizations
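The 3.6 TB/s bisection figure in the list above can be sanity-checked with a bit of arithmetic. As a rough sketch (the per-link numbers are assumptions based on NVLink 4.0 on H100, not figures from this announcement): each GPU exposes 18 NVLink 4.0 links at 50 GB/s each, for 900 GB/s of bidirectional bandwidth per GPU.

```python
# Back-of-the-envelope check of the 3.6 TB/s bisection bandwidth claim.
# Assumed (not stated in the article): 18 NVLink 4.0 links per H100,
# 50 GB/s bidirectional per link, i.e. 900 GB/s per GPU.

LINKS_PER_GPU = 18   # NVLink 4.0 links per H100 (assumption)
GB_S_PER_LINK = 50   # bidirectional GB/s per link (assumption)
GPUS = 8

per_gpu_gb_s = LINKS_PER_GPU * GB_S_PER_LINK      # 900 GB/s per GPU

# Bisection bandwidth: cut the 8 GPUs into two halves of 4. With a
# non-blocking NVSwitch fabric, the traffic crossing the cut is limited
# by what the 4 GPUs on one side can drive.
bisection_tb_s = (GPUS // 2) * per_gpu_gb_s / 1000

print(bisection_tb_s)  # → 3.6
```

Under those assumed per-link numbers, 4 GPUs x 900 GB/s works out to exactly the quoted 3.6 TB/s.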
This announcement also bolsters the partnership between Google and NVIDIA regarding AI and cloud-based computing.
"Google Cloud's A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications," said Ian Buck, vice president of hyperscale and high-performance computing at NVIDIA. "On the heels of Google Cloud's recently launched G2 instances, we're proud to continue our work with Google Cloud to help transform enterprises around the world with purpose-built AI infrastructure."