NVIDIA Blackwell AI GPUs up to 2.2x faster than Hopper in MLPerf v4.1 AI training benchmarks

NVIDIA's new Blackwell AI GPUs offer up to 2.2x the performance of Hopper in new MLPerf v4.1 AI training benchmarks released by the company.

TL;DR: NVIDIA's new Blackwell AI GPUs outperform Hopper by up to 2.2x in MLPerf v4.1 AI training benchmarks. The Blackwell GPUs set records using the Nyx AI supercomputer, showing significant speed improvements in tasks like Llama 2 70B and GPT-3 175B.

NVIDIA has just published benchmarks of its new Blackwell AI GPUs in MLPerf v4.1 AI training workloads, where the new Blackwell chips are up to 2.2x faster than Hopper.

The new Blackwell AI GPUs have set all 7 per-accelerator records using NVIDIA's Nyx AI supercomputer, which packs DGX B200 systems. Compared with Hopper H100, Blackwell on Nyx is 2.2x faster in Llama 2 70B (Fine-Tuning) and 2x faster in GPT-3 175B (Pre-Training), and it also came out ahead across the entire set of workloads in the MLPerf Training 4.1 suite.

NVIDIA explains: "The first Blackwell training submission to the MLCommons Consortium - which creates standardized, unbiased and rigorously peer-reviewed testing for industry participants - highlights how the architecture is advancing generative AI training performance. For instance, the architecture includes new kernels that make more efficient use of Tensor Cores. Kernels are optimized, purpose-built math operations like matrix-multiplies that are at the heart of many deep learning algorithms".

"Blackwell's higher per-GPU compute throughput and significantly larger and faster high bandwidth memory allows it to run the GPT-3 175B benchmark on fewer GPUs while achieving excellent per-GPU performance. Taking advantage of higher-bandwidth HBM3e memory, just 64 Blackwell GPUs were run in the GPT-3 LLM benchmark without compromising per-GPU performance. The same benchmark run using Hopper needed 256 GPUs to achieve the same performance".

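To put the GPU counts from NVIDIA's statement in perspective, here is a minimal Python sketch of the arithmetic. The function name `gpu_reduction` is made up for illustration; only the figures 64 and 256 come from the quote above.

```python
def gpu_reduction(baseline_gpus: int, new_gpus: int) -> float:
    """Return how many times fewer GPUs the new system needs than the baseline."""
    if baseline_gpus <= 0 or new_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return baseline_gpus / new_gpus


# GPU counts quoted by NVIDIA for the GPT-3 175B benchmark.
HOPPER_GPUS = 256
BLACKWELL_GPUS = 64

reduction = gpu_reduction(HOPPER_GPUS, BLACKWELL_GPUS)
print(f"Blackwell ran GPT-3 175B on {reduction:.0f}x fewer GPUs than Hopper")
```

This compares GPU counts only. It says nothing about total training time, power draw, or cost, none of which are given in the quote.
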
News Source: wccftech.com

Anthony joined TweakTown in 2010 and has since reviewed hundreds of tech products. Anthony is a long-time PC enthusiast with a passion of hate for games built around consoles. An FPS gamer since the pre-Quake days, when you were insulted if you used a mouse to aim, he has been addicted to gaming and hardware ever since. Working in IT retail for 10 years gave him great experience with custom-built PCs. His addiction to GPU tech is unwavering, and he has recently taken a keen interest in artificial intelligence (AI) hardware.
