NVIDIA reveals next-gen Hopper GPU architecture, H100 GPU announced

NVIDIA's next-gen Hopper GPU architecture launched, powers the new H100 GPU with 80 billion transistors -- the best for AI and HPC.


The day is finally here: NVIDIA has unleashed its next-gen Hopper GPU architecture. NVIDIA announced the first Hopper-based GPU -- the new NVIDIA H100 GPU -- at its GPU Technology Conference (GTC 2022).


NVIDIA's new Hopper GPU architecture was named after pioneering US computer scientist, Grace Hopper, and succeeds the Ampere GPU architecture. The new NVIDIA H100 GPU has 80 billion transistors, and now becomes the world's largest and most powerful accelerator.

NVIDIA and TSMC (Taiwan Semiconductor Manufacturing Company) worked together on the TSMC 4N process node, tweaking it for the Hopper-based H100 GPU. There's PCIe 5.0 support, plus next-gen ultra-fast HBM3 memory technology with an absolutely blistering 3TB/sec (3,000GB/sec) of bandwidth. Insanity.

The skinny on NVIDIA's next-gen Hopper GPU:

  • TSMC 4N process node: The new Hopper GPU architecture is the first to be built on TSMC's cutting-edge 4N process node, which NVIDIA says was "designed for NVIDIA's accelerated compute needs". The new H100 GPU architecture "features major advances to accelerate AI, HPC, memory bandwidth, interconnect and communication, including nearly 5 terabytes per second of external connectivity. H100 is the first GPU to support PCIe Gen5 and the first to utilize HBM3, enabling 3TB/s of memory bandwidth".
  • 80 freaking billion transistors: There's a huge 80 billion transistors on the NVIDIA H100 GPU, comparing that to the 54 billion transistors of the Ampere-based NVIDIA A100 GPU... or the 21 billion transistors of the Volta-based NVIDIA Tesla V100... or the 15 billion transistors on the Pascal-based NVIDIA Tesla P100 GPU. NVIDIA has come a long way, getting more transistors onto a continuously shrinking chip.
  • NVIDIA NVLink technology goes 4th-Gen: The new Hopper-based NVIDIA H100 GPU has the very latest 4th-Generation NVLink technology, which super-speeds the largest AI models in the world. NVLink can be combined with a new external NVLink Switch, which extends NVLink as a scale-up network beyond the server. You'll be able to connect an insane 256 x H100 GPUs at 9x higher bandwidth versus the previous generation using NVIDIA HDR Quantum InfiniBand.
  • Move over Michael Bay, what the hell is that Transformer Engine: NVIDIA's new Hopper GPU architecture also features a new "Transformer Engine". NVIDIA explains that, "now the standard model choice for natural language processing, the Transformer is one of the most important deep learning models ever invented. The H100 accelerator's Transformer Engine is built to speed up these networks as much as 6x versus the previous generation without losing accuracy".
  • CIA Who? NVIDIA H100 has Confidential Computing tech: Alright, something else new is the Confidential Computing side of the H100... which NVIDIA says makes the H100 the world's first accelerator with confidential computing capabilities that can "protect AI models and customer data while they are being processed. Customers can also apply confidential computing to federated learning for privacy-sensitive industries like healthcare and financial services, as well as on shared cloud infrastructures".
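The generational transistor-count jumps quoted above are easy to sanity-check with a few lines of Python. The numbers come straight from this article; the dictionary labels are just for illustration:

```python
# Transistor counts cited in the article, in billions, by GPU generation.
counts = {
    "P100 (Pascal)": 15,
    "V100 (Volta)": 21,
    "A100 (Ampere)": 54,
    "H100 (Hopper)": 80,
}

def scaling_factor(newer: str, older: str) -> float:
    """Ratio of transistor counts between two generations."""
    return counts[newer] / counts[older]

# Print the generation-over-generation growth.
gens = list(counts)
for prev, nxt in zip(gens, gens[1:]):
    print(f"{nxt} vs {prev}: {scaling_factor(nxt, prev):.2f}x")
```

Running this shows the H100's 80 billion transistors work out to roughly a 1.5x jump over the A100, on top of the A100's ~2.6x jump over the V100.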

NVIDIA founder and CEO Jensen Huang said: "Data centers are becoming AI factories -- processing and defining mountains of data to produce intelligence. NVIDIA H100 is the engine of the world's AI infrastructure that enterprises use to accelerate their AI-driven business".


NVIDIA H100 Technology Breakthroughs

The NVIDIA H100 GPU sets a new standard in accelerating large-scale AI and HPC, delivering six breakthrough innovations:

  • World's Most Advanced Chip - Built with 80 billion transistors using a cutting-edge TSMC 4N process designed for NVIDIA's accelerated compute needs, H100 features major advances to accelerate AI, HPC, memory bandwidth, interconnect and communication, including nearly 5 terabytes per second of external connectivity. H100 is the first GPU to support PCIe Gen5 and the first to utilize HBM3, enabling 3TB/s of memory bandwidth. Twenty H100 GPUs can sustain the equivalent of the entire world's internet traffic, making it possible for customers to deliver advanced recommender systems and large language models running inference on data in real-time.
  • New Transformer Engine - Now the standard model choice for natural language processing, the Transformer is one of the most important deep learning models ever invented. The H100 accelerator's Transformer Engine is built to speed up these networks as much as 6x versus the previous generation without losing accuracy.
  • 2nd-Generation Secure Multi-Instance GPU - MIG technology allows a single GPU to be partitioned into seven smaller, fully isolated instances to handle different types of jobs. The Hopper architecture extends MIG capabilities by up to 7x over the previous generation by offering secure multitenant configurations in cloud environments across each GPU instance.
  • Confidential Computing - H100 is the world's first accelerator with confidential computing capabilities to protect AI models and customer data while they are being processed. Customers can also apply confidential computing to federated learning for privacy-sensitive industries like healthcare and financial services, as well as on shared cloud infrastructures.
  • 4th-Generation NVIDIA NVLink - To accelerate the largest AI models, NVLink combines with a new external NVLink Switch to extend NVLink as a scale-up network beyond the server, connecting up to 256 H100 GPUs at 9x higher bandwidth versus the previous generation using NVIDIA HDR Quantum InfiniBand.
  • DPX Instructions - New DPX instructions accelerate dynamic programming - used in a broad range of algorithms, including route optimization and genomics - by up to 40x compared with CPUs and up to 7x compared with previous-generation GPUs. This includes the Floyd-Warshall algorithm to find optimal routes for autonomous robot fleets in dynamic warehouse environments, and the Smith-Waterman algorithm used in sequence alignment for DNA and protein classification and folding.
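To make the DPX bullet concrete, here is a minimal pure-Python reference of the Floyd-Warshall all-pairs shortest-path algorithm NVIDIA names, showing the dynamic-programming pattern the new instructions accelerate. This is a CPU sketch for illustration only; the 40x/7x figures in the article refer to the hardware, not this code, and the example graph is made up:

```python
INF = float("inf")

def floyd_warshall(dist):
    """Floyd-Warshall dynamic programming: dist is a square matrix of
    edge weights (INF = no edge). Updates dist in place to hold the
    all-pairs shortest-path distances, and returns it."""
    n = len(dist)
    for k in range(n):              # allow paths routed through vertex k
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist

# Tiny warehouse-style routing example: 4 waypoints, directed edges.
graph = [
    [0,   5,   INF, 10],
    [INF, 0,   3,   INF],
    [INF, INF, 0,   1],
    [INF, INF, INF, 0],
]
shortest = floyd_warshall(graph)
# shortest[0][3] is now 9: the route 0 -> 1 -> 2 -> 3 beats the direct
# edge of weight 10.
```

The triple-nested relaxation loop is exactly the kind of per-cell min/add recurrence that DPX instructions are designed to speed up on the GPU.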

Anthony joined the TweakTown team in 2010 and has since reviewed hundreds of graphics cards. Anthony is a long-time PC enthusiast with a passionate hate for games built around consoles. An FPS gamer since the pre-Quake days, when you were insulted if you used a mouse to aim, he has been addicted to gaming and hardware ever since. Working in IT retail for 10 years gave him great experience with custom-built PCs. His addiction to GPU tech is unwavering, and he has recently taken a keen interest in artificial intelligence (AI) hardware.
