NVIDIA Blackwell AI GPUs up to 2.2x faster than Hopper in MLPerf v4.1 AI training benchmarks

NVIDIA's new Blackwell AI GPUs offer up to 2.2x the performance of Hopper in new MLPerf v4.1 AI training benchmarks released by the company.

Gaming Editor
TL;DR: NVIDIA's new Blackwell AI GPUs outperform Hopper by up to 2.2x in MLPerf v4.1 AI training benchmarks. The Blackwell GPUs set records using the Nyx AI supercomputer, showing significant speed improvements in tasks like Llama 2 70B and GPT-3 175B.

NVIDIA has just published some juicy benchmarks of its new Blackwell AI GPUs in MLPerf v4.1 AI training workloads, where the new Blackwell chips are up to 2.2x faster than Hopper.

The new Blackwell AI GPUs set all 7 per-accelerator records on NVIDIA's Nyx AI supercomputer, which packs DGX B200 systems. Compared with Hopper H100, Nyx is 2.2x faster in Llama 2 70B (Fine-Tuning) and 2x faster in GPT-3 175B (Pre-Training), and it also demolished the rest of the workloads in the MLPerf Training 4.1 suite.

NVIDIA explains: "The first Blackwell training submission to the MLCommons Consortium - which creates standardized, unbiased and rigorously peer-reviewed testing for industry participants - highlights how the architecture is advancing generative AI training performance. For instance, the architecture includes new kernels that make more efficient use of Tensor Cores. Kernels are optimized, purpose-built math operations like matrix-multiplies that are at the heart of many deep learning algorithms".

"Blackwell's higher per-GPU compute throughput and significantly larger and faster high bandwidth memory allows it to run the GPT-3 175B benchmark on fewer GPUs while achieving excellent per-GPU performance. Taking advantage of higher-bandwidth HBM3e memory, just 64 Blackwell GPUs were run in the GPT-3 LLM benchmark without compromising per-GPU performance. The same benchmark run using Hopper needed 256 GPUs to achieve the same performance".
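To put the GPU counts in that quote into perspective, here is a minimal Python sketch of the arithmetic. The 64 and 256 GPU counts come from NVIDIA's statement above; the function names and the time-to-train figures passed to `per_gpu_speedup` are hypothetical and purely illustrative, and normalizing by GPU-minutes is an assumption about how a per-GPU comparison might be made, not NVIDIA's or MLCommons' published method.

```python
def gpu_reduction_factor(baseline_gpus: int, new_gpus: int) -> float:
    """Return how many times fewer GPUs the new system uses than the baseline."""
    if baseline_gpus <= 0 or new_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return baseline_gpus / new_gpus


def per_gpu_speedup(baseline_minutes: float, baseline_gpus: int,
                    new_minutes: float, new_gpus: int) -> float:
    """Compare total GPU-minutes spent by each system (assumed normalization)."""
    baseline_gpu_minutes = baseline_minutes * baseline_gpus
    new_gpu_minutes = new_minutes * new_gpus
    return baseline_gpu_minutes / new_gpu_minutes


# GPU counts quoted by NVIDIA for the GPT-3 175B benchmark.
HOPPER_GPUS = 256
BLACKWELL_GPUS = 64

reduction = gpu_reduction_factor(HOPPER_GPUS, BLACKWELL_GPUS)
print(f"Blackwell used {reduction:.0f}x fewer GPUs than Hopper")

# Hypothetical times in minutes, NOT taken from the MLPerf results.
example = per_gpu_speedup(100.0, HOPPER_GPUS, 200.0, BLACKWELL_GPUS)
print(f"Illustrative per-GPU speedup: {example:.1f}x")
```

With the quoted counts, the Blackwell run used a quarter of the GPUs that the Hopper run needed.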

NEWS SOURCE: wccftech.com

Anthony joined the TweakTown team in 2010 and has since reviewed hundreds of graphics cards. Anthony is a longtime PC enthusiast with a passionate hatred of games built around consoles. An FPS gamer since the pre-Quake days, when you were insulted if you used a mouse to aim, he has been addicted to gaming and hardware ever since. Working in IT retail for 10 years gave him great experience with custom-built PCs. His addiction to GPU tech is unwavering, and he has recently taken a keen interest in artificial intelligence (AI) hardware.
