NVIDIA Hopper H100 GPU detailed: TSMC 4nm, HBM3 VRAM, 80B transistors

NVIDIA's next-gen Hopper H100 GPU detailed at Hot Chips 34: TSMC 4nm process node, 80 billions transistors, PCIe 5.0, world's first HBM3 memory.

Anthony Garreffa

Gaming Editor

Published Aug 23, 2022 1:14 AM CDT
Updated Sep 15, 2022 12:53 AM CDT

4 minutes read time

NVIDIA has detailed its next-gen Hopper H100 GPU at the Hot Chips 34 event, with the new Hopper H100 GPU being made on TSMC's fresh 4nm process node.

NVIDIA Hopper H100 GPU detailed: TSMC 4nm, HBM3 VRAM, 80B transistors 02

VIEW GALLERY - 10 IMAGES

TSMC's new 4N process node was optimized and designed exclusively for NVIDIA and its new Hopper H100 GPU, where NVIDIA flexes its silicon muscle with its 4th Gen Tensor Core architecture, the world's first use of HBM3 memory, and so much more.

The new NVIDIA Hopper H100 GPU is an absolute behemoth: made on TSMC's new 4N process node exclusively for NVIDIA, with 80 billion transistors, and the world's first use of HBM3 memory technology. NVIDIA has built the Hopper H100 onto the PG520 PCB board, which packs 30+ power VRMs and a gigantic integral interposer that uses TSMC's new CoWoS technology: this combines the NVIDIA Hopper H100 GPU with a 6-stack HBM3 design for a very potent mix of GPU + VRAM dominance.

NVIDIA Hopper H100 GPU detailed: TSMC 4nm, HBM3 VRAM, 80B transistors 03

NVIDIA Hopper H100 GPU detailed: TSMC 4nm, HBM3 VRAM, 80B transistors 04

NVIDIA Hopper H100 GPU detailed: TSMC 4nm, HBM3 VRAM, 80B transistors 05

NVIDIA Hopper H100 GPU detailed: TSMC 4nm, HBM3 VRAM, 80B transistors 06

The world's first HBM3 memory is placed onto NVIDIA's new Hopper H100, with 80GB of HBM3 memory and an insane 3.2TB/sec of memory bandwidth. As for the Hopper H100, NVIDIA has a gigantic 144 SM chip layout that has 8 x GPCs in total. Each of the GPCs feature 9 x TCPs that each have 2 x SM units each. This provides the Hopper H100 GPU with 18 x SMs per GPC and 144 on the complete 8 x GPC confirmation. The individual SMs have up to 128 x FP32 units, for a grand total of 18,432 CUDA cores.

Read more: NVIDIA's next-gen Hopper GPU will be detailed at Hot Chips next week
Read more: NVIDIA H100 SXM in the flesh: 4nm Hopper GPU + 80GB HBM3 memory
Read more: You can buy NVIDIA's next-gen Hopper GPU + 80GB HBM2e for $36,550
Read more: NVIDIA is turning data centers into 'AI factories' with Hopper GPU

NVIDIA has different configurations of its Hopper H100 chip, where there'll be the GH100 GPU, and the H100 GPU with SXM5 board form-factor. The difference between the two is below.

VIEW GALLERY - 10 IMAGES

The full implementation of the GH100 GPU includes the following units:

8 GPCs, 72 TPCs (9 TPCs/GPC), 2 SMs/TPC, 144 SMs per full GPU
128 FP32 CUDA Cores per SM, 18432 FP32 CUDA Cores per full GPU
4 Fourth-Generation Tensor Cores per SM, 576 per full GPU
6 HBM3 or HBM2e stacks, 12 512-bit Memory Controllers
60 MB L2 Cache
Fourth-Generation NVLink and PCIe Gen 5

The NVIDIA H100 GPU with SXM5 board form-factor includes the following units:

8 GPCs, 66 TPCs, 2 SMs/TPC, 132 SMs per GPU
128 FP32 CUDA Cores per SM, 16896 FP32 CUDA Cores per GPU
4 Fourth-generation Tensor Cores per SM, 528 per GPU
80 GB HBM3, 5 HBM3 stacks, 10 512-bit Memory Controllers
50 MB L2 Cache
Fourth-Generation NVLink and PCIe Gen 5

NVIDIA is cramming quite a lot into Hopper H100 GPU, with some big gains over the Ampere A100 GPU that's in various supercomputers, HPC systems, AI systems, and more. NVIDIA's new Hopper H100 GPU is a huge 2.25x increase over the Ampere A100 GPU, with NVIDIA pushing more FP64, FP16 and Tensor Cores inside of its Hopper H100 GPU that does a lot more (and much faster) heavy lifting.

The company also adds that the 4th Gen Tensor Cores inside of the Hopper H100 GPU have 2x more performance at the same GPU clock speeds, which is impressive to see.