NVIDIA Blackwell AI GPUs up to 2.2x faster than Hopper in MLPerf v4.1 AI training benchmarks

NVIDIA's new Blackwell AI GPUs offer up to 2.2x the performance of Hopper in new MLPerf v4.1 AI training benchmarks released by the company.

Anthony Garreffa

Gaming Editor

Published Nov 15, 2024 2:56 AM CST

TL;DR: NVIDIA's new Blackwell AI GPUs outperform Hopper by up to 2.2x in MLPerf v4.1 AI training benchmarks. Running on NVIDIA's Nyx AI supercomputer, Blackwell set new records and was 2.2x faster than Hopper H100 in Llama 2 70B fine-tuning and 2x faster in GPT-3 175B pre-training.

NVIDIA has just published some juicy benchmarks of its new Blackwell AI GPUs in MLPerf v4.1 AI training workloads, where the new Blackwell chips are up to 2.2x faster than Hopper.

The new Blackwell AI GPUs have set all 7 per-accelerator records on NVIDIA's Nyx AI supercomputer, which packs DGX B200 systems. Against Hopper H100, Blackwell on Nyx is 2.2x faster in Llama 2 70B (Fine-Tuning) and 2x faster in GPT-3 175B (Pre-Training), and it also demolished Hopper across the entire set of workloads inside the MLPerf Training 4.1 suite.

NVIDIA explains: "The first Blackwell training submission to the MLCommons Consortium - which creates standardized, unbiased and rigorously peer-reviewed testing for industry participants - highlights how the architecture is advancing generative AI training performance. For instance, the architecture includes new kernels that make more efficient use of Tensor Cores. Kernels are optimized, purpose-built math operations like matrix-multiplies that are at the heart of many deep learning algorithms".

"Blackwell's higher per-GPU compute throughput and significantly larger and faster high bandwidth memory allows it to run the GPT-3 175B benchmark on fewer GPUs while achieving excellent per-GPU performance. Taking advantage of higher-bandwidth HBM3e memory, just 64 Blackwell GPUs were run in the GPT-3 LLM benchmark without compromising per-GPU performance. The same benchmark run using Hopper needed 256 GPUs to achieve the same performance".

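To put those figures in perspective, here is a minimal Python sketch of the arithmetic. It assumes that "2.2x faster" means a per-GPU throughput ratio, so the same job would take roughly 1/2.2 of the Hopper time; that reading, along with the function names `relative_time` and `gpu_reduction`, is an illustration and not something NVIDIA published. The only inputs are the numbers quoted above.

```python
def relative_time(speedup: float) -> float:
    """Fraction of the baseline time needed at a given speedup factor.

    Assumption: the speedup is a simple throughput ratio.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def gpu_reduction(baseline_gpus: int, new_gpus: int) -> float:
    """How many times fewer GPUs the new submission used."""
    if baseline_gpus <= 0 or new_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return baseline_gpus / new_gpus


# Speedups versus Hopper H100, as quoted in the article.
speedups = {
    "Llama 2 70B (Fine-Tuning)": 2.2,
    "GPT-3 175B (Pre-Training)": 2.0,
}

for workload, factor in speedups.items():
    share = relative_time(factor)
    print(f"{workload}: {factor}x faster, about {share:.0%} of the Hopper time")

# GPT-3 175B submission sizes quoted by NVIDIA: 256 Hopper GPUs vs 64 Blackwell GPUs.
print(f"GPU count reduction: {gpu_reduction(256, 64):.0f}x fewer GPUs")
```

Under that assumption, a 2.2x speedup works out to roughly 45% of the Hopper time, and the GPT-3 175B run used a quarter as many GPUs (64 instead of 256).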