China is now deploying Huawei's Ascend 910C AI chips in a huge cluster called "CloudMatrix," which reportedly delivers more performance than NVIDIA's leading GB200 NVL72 AI server, but at the cost of far higher power consumption.

In a new report from SemiAnalysis, we're learning about Huawei's rack-scale architecture and how the Ascend 910C will power China's new CloudMatrix 384 AI cluster, which delivers performance that rivals NVIDIA's most powerful AI server, the GB200 NVL72 rack.
Huawei's new CloudMatrix 384 "CM384" AI cluster is powered by 384 Huawei Ascend 910C AI chips connected in an "all-to-all topology" configuration. Huawei is offsetting the architectural shortcomings of its AI chips by using over 5x as many of them as NVIDIA uses GB200 GPUs inside NVL72 servers. It's a brute-force approach: the company is willing to trade away cost, per-chip performance efficiency, scalability ratios, and more.
One of the new CloudMatrix AI clusters should feature around 300 PetaFLOPS of BF16 performance, close to 2x the compute of the NVIDIA GB200 NVL72 AI server, and with far more HBM memory on board: CloudMatrix and its 384 Ascend 910C AI GPUs have 49.2TB of HBM compared to just 13.8TB of HBM on GB200 NVL72.
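The raw ratios behind those figures are easy to sanity-check. A quick sketch, using the chip counts and HBM capacities quoted above; the ~180 PFLOPS dense BF16 figure for GB200 NVL72 is an assumption based on NVIDIA's published numbers, not something stated in this article:

```python
# Back-of-envelope ratios from the spec figures quoted above.
cm_chips, nvl72_chips = 384, 72
cm_hbm_tb, nvl72_hbm_tb = 49.2, 13.8
cm_bf16_pflops, nvl72_bf16_pflops = 300.0, 180.0  # NVL72 figure assumed

chip_ratio = cm_chips / nvl72_chips                  # ~5.3x more accelerators
hbm_ratio = cm_hbm_tb / nvl72_hbm_tb                 # ~3.6x more HBM capacity
compute_ratio = cm_bf16_pflops / nvl72_bf16_pflops   # ~1.7x the BF16 compute
```

Notice that CloudMatrix needs roughly 5.3x the chips to get only ~1.7x the compute, which is the architectural gap Huawei is papering over with scale.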
There's also far more memory bandwidth on CloudMatrix, with up to 1,229TB/sec (over 1.2 petabytes per second, which is insane) compared to "just" 576TB/sec from NVIDIA's GB200 NVL72. CloudMatrix might have far bigger pools of super-fast HBM, but across multiple AI workloads, the Ascend 910C-powered CloudMatrix AI server draws around 3.9x MORE power than NVIDIA's bleeding-edge GB200 NVL72 AI server... but China doesn't need to worry about power costs or infrastructure, it can just GO.
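That 3.9x power figure can be turned into a performance-per-watt comparison using only the numbers already quoted in this piece. A rough sketch; the ~180 PFLOPS dense BF16 value for NVL72 is assumed from NVIDIA's published specs:

```python
# Rough perf-per-watt penalty for CloudMatrix vs GB200 NVL72,
# derived from the ~3.9x total power draw reported by SemiAnalysis
# and the BF16 compute figures quoted in this article.
cm_pflops = 300.0        # CloudMatrix 384, BF16
nvl72_pflops = 180.0     # GB200 NVL72 dense BF16 (assumed)
power_ratio = 3.9        # CloudMatrix draws ~3.9x the power

perf_ratio = cm_pflops / nvl72_pflops   # ~1.67x the compute
ppw_penalty = power_ratio / perf_ratio  # ~2.3x worse perf-per-watt
```

So even though CloudMatrix delivers roughly 1.7x the compute, it pays for it with about 2.3x worse performance per watt, on these assumed figures.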
We don't know how many Huawei Ascend 910C-powered CloudMatrix AI servers will be created, but I would estimate NVIDIA and its partners will make far, far more GB200 NVL72 AI servers than China will make CloudMatrix systems. But, China has a new AI server enforcer on its soil, and that's great to see.
Huawei's current Ascend 910C is a far worse solution than NVIDIA's B200 and GB200 AI GPUs and servers, respectively, but it's the only homegrown solution China has access to right now, and it skirts around US export restrictions (which otherwise limit China to NVIDIA's less powerful H20 AI GPUs).
The power consumption numbers are scary, with CloudMatrix using 3.9x more power... I can't imagine the cost of running this in a country where electricity is 3x, 5x, or even 10x more expensive (like here in Australia, which is partly why we aren't seeing AI server infrastructure roll-outs, electricity is just so expensive here).