Chinese AI firm DeepSeek is cooking up its next-gen R2 AI model, which is rumored to be around 97% cheaper per token than GPT-4o, and to have been fully trained on Huawei AI GPUs.
A new post on X by @deedydas has the hype train for DeepSeek R2 rocking and rolling, claiming that the new R2 model will adopt a hybrid MoE (Mixture of Experts) architecture: an advanced version of the existing MoE implementation that pairs more sophisticated gating mechanisms, or a combination of MoE and dense layers, to optimize high-end AI workloads.
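For readers wondering what a "hybrid MoE" layer could look like in practice, here is a minimal, illustrative PyTorch sketch that combines a shared dense feed-forward path with a small pool of top-k gated experts. All names, dimensions, and the routing scheme are assumptions made purely for illustration; none of it comes from DeepSeek.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridMoELayer(nn.Module):
    """Toy hybrid MoE layer: every token goes through a shared dense FFN,
    plus a sparse pool of experts selected per token by a learned gate."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        # Dense path shared by all tokens (the "hybrid" part of the rumor).
        self.dense = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        # Sparse expert pool; only top_k experts are active per token.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                          # x: (batch, seq, d_model)
        dense_out = self.dense(x)
        # Gate scores -> pick top-k experts per token, renormalize weights.
        scores = F.softmax(self.gate(x), dim=-1)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        # Naive dispatch loop for clarity; real implementations batch tokens per expert.
        sparse_out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e).unsqueeze(-1)   # tokens routed to expert e
                sparse_out = sparse_out + mask * weights[..., k:k+1] * expert(x)
        return dense_out + sparse_out

# Quick shape check
layer = HybridMoELayer()
print(layer(torch.randn(2, 16, 512)).shape)        # torch.Size([2, 16, 512])
```

The appeal of this kind of design is that total parameter count can balloon (the rumored 1.2 trillion) while only a fraction of parameters (the rumored 78 billion "active") are exercised for any given token.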
DeepSeek R2 is set to nearly double the parameter count of R1, packing 1.2 trillion parameters, and its unit cost per token is reportedly a whopping 97.3% lower than GPT-4o's, at $0.07 per million input tokens and $0.27 per million output tokens. This means DeepSeek R2 is going to be uber-cheap for enterprise use, as it would be the most cost-efficient AI model on the market.
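To put those rumored rates in perspective, here is a quick back-of-the-envelope calculation using only the figures from the post; the monthly token volumes are hypothetical numbers chosen for illustration.

```python
# Rumored DeepSeek R2 API pricing from the viral post:
# $0.07 per 1M input tokens, $0.27 per 1M output tokens.
R2_IN, R2_OUT = 0.07, 0.27                  # USD per million tokens

# Hypothetical enterprise workload (made-up volumes).
inp_tokens, out_tokens = 500e6, 100e6       # 500M in, 100M out per month

r2_cost = inp_tokens / 1e6 * R2_IN + out_tokens / 1e6 * R2_OUT
print(f"R2 cost at rumored rates: ${r2_cost:,.2f}")     # $62.00

# If R2 really is 97.3% cheaper per token, the same job on GPT-4o
# would imply roughly:
gpt4o_cost = r2_cost / (1 - 0.973)
print(f"Implied GPT-4o cost:      ${gpt4o_cost:,.2f}")  # ~$2,296.30
```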
Not only that, but DeepSeek R2 is said to achieve 82% utilization of Huawei's Ascend 910B AI chip cluster, with computing power measured at 512 PetaFLOPS at FP16 precision, showing that DeepSeek is using in-house resources for its new mainstream R2 AI model. R2 was reportedly trained on Huawei AI chips using in-house equipment, as the Chinese firm has "vertically integrated" the AI supply chain behind its model.
DeepSeek R2 "viral rumors":
- 1.2T param, 78B active, hybrid MoE
- 97.3% cheaper than GPT 4o ($0.07/M in, $0.27/M out)
- 5.2PB training data. 89.7% on C-Eval2.0
- Better vision. 92.4% on COCO
- 82% utilization on Huawei Ascend 910B