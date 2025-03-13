Apple's new M3 Ultra processor slices and dices DeepSeek R1 models: uses 448GB of unified RAM, only 200W of power... no multi-GPU setup necessary.

Apple launched its new M3 Ultra processor inside its new Mac Studio earlier this week, packing a 32-core CPU and 80-core GPU, and boy-oh-boy does it perform well with DeepSeek R1 models.

In a new video, YouTuber Dave2D compares Apple's new M3 Ultra against the M2 Ultra, M4 Max, and M3 Air processors, and the M3 Ultra absolutely destroys the competition in DeepSeek R1 performance. The YouTuber starts with DeepSeek R1 and its 7B, 14B, and 32B parameter models, where the M3 Ultra takes the crown (by far) in all tests.

Apple's new M3 Ultra (which is pretty much just 2 x M3 Max chips acting as one) supports up to 512GB of unified RAM, compared to the 128GB on the M2 Ultra and M4 Max, and just 24GB on the M3 Air. The Apple M3 Ultra pumps out 89.07 T/s on DeepSeek R1 with 7B parameters, compared to 54.22 T/s on the M2 Ultra, 53.18 T/s on the M4 Max, and just 17.7 T/s on the M3 Air.

Moving on to DeepSeek R1 with 14B parameters, the M3 Ultra destroys the field again with 49.14 T/s, ahead of the M2 Ultra at 28.72 T/s, the M4 Max at 22.43 T/s, and the M3 Air at just 8.55 T/s.

Now... let's move to the larger 70 billion parameter model, with the M3 Ultra-powered Mac Studio performing admirably here: 13.68 T/s on the M3 Ultra compared to 7.99 T/s on the M2 Ultra and 8.78 T/s on the M4 Max. You can see from the charts that once things get real serious, with DeepSeek R1 running its gigantic 671 billion parameter model, the M3 Ultra's 512GB of unified RAM is the key: 16.08 T/s on the M3 Ultra versus... nothing on the M2 Ultra and M4 Max, as their 128GB of unified RAM simply isn't enough to run the 671B parameter R1 model.
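To put those numbers side by side, here is a minimal Python sketch that works out how many times faster the M3 Ultra is than each other chip. The tokens-per-second figures are the ones quoted above; the `RESULTS` table and the `speedups` helper are names made up for this illustration, and chips that could not run a model are simply left out of that model's row.

```python
# Tokens per second as quoted in the article. A chip missing from a row
# means no result was reported for it on that model.
RESULTS = {
    "7B":   {"M3 Ultra": 89.07, "M2 Ultra": 54.22, "M4 Max": 53.18, "M3 Air": 17.7},
    "14B":  {"M3 Ultra": 49.14, "M2 Ultra": 28.72, "M4 Max": 22.43, "M3 Air": 8.55},
    "70B":  {"M3 Ultra": 13.68, "M2 Ultra": 7.99, "M4 Max": 8.78},
    "671B": {"M3 Ultra": 16.08},
}


def speedups(results, baseline="M3 Ultra"):
    """Return {model: {chip: baseline_tps / chip_tps}}, rounded to 2 places.

    Hypothetical helper for illustration; the baseline chip itself is
    omitted from each inner dict.
    """
    out = {}
    for model, by_chip in results.items():
        base = by_chip[baseline]
        out[model] = {
            chip: round(base / tps, 2)
            for chip, tps in by_chip.items()
            if chip != baseline
        }
    return out


if __name__ == "__main__":
    for model, ratios in speedups(RESULTS).items():
        if not ratios:
            print(f"{model}: only the M3 Ultra produced a result")
            continue
        line = ", ".join(f"{r}x vs {chip}" for chip, r in ratios.items())
        print(f"{model}: M3 Ultra is {line}")
```

Run as a script, this shows the M3 Ultra at roughly 1.6x to 1.7x the M2 Ultra across the 7B, 14B, and 70B models, and better than 5x the M3 Air on the two models the Air ran.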

What makes this so impressive? Apple's new M3 Ultra has 512GB of unified RAM, with the entire M3 Ultra-powered Mac Studio using under 200W of power... you'd need a full-on multi-GPU setup drawing 10x (or more) the power of the sub-200W M3 Ultra to run DeepSeek R1.
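The power claim can be turned into rough energy-per-token arithmetic. The sketch below assumes the Mac Studio draws the full 200W (the article only says "under 200W", so this is an upper bound) and takes the "10x" multi-GPU figure at face value; the function names are hypothetical, and no multi-GPU throughput is quoted in the article, so none is assumed here.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: watts are joules per second, so W / (T/s) = J/T."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


def break_even_tps(tokens_per_second: float, power_factor: float) -> float:
    """Throughput a rig drawing `power_factor` times the power would need
    in order to match the same energy per token."""
    return tokens_per_second * power_factor


# Figures from the article; 200W is treated as a ceiling, not a measurement.
M3_ULTRA_POWER_W = 200.0
M3_ULTRA_671B_TPS = 16.08
MULTI_GPU_POWER_FACTOR = 10  # "10x (or more)" per the article

if __name__ == "__main__":
    jpt = joules_per_token(M3_ULTRA_POWER_W, M3_ULTRA_671B_TPS)
    needed = break_even_tps(M3_ULTRA_671B_TPS, MULTI_GPU_POWER_FACTOR)
    print(f"M3 Ultra, 671B model: at most {jpt:.2f} J per token")
    print(f"A rig at {MULTI_GPU_POWER_FACTOR}x the power needs {needed:.1f} T/s to match that")
```

Under those assumptions the M3 Ultra spends at most about 12.4 joules per token on the 671B model, and a setup drawing 10x the power would have to reach around 160.8 T/s to be equally efficient per token.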

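Apple's new M3 Ultra has some architectural efficiencies that also see it running the DeepSeek R1 model with its 671 billion parameters BETTER than the 70 billion parameter version (16.08 T/s versus 13.68 T/s), showing Apple's silicon prowess. The model's design likely helps too: the 671B R1 is a mixture-of-experts model that activates only around 37 billion parameters per token, while the 70B version is a dense model. It's an impressive thing to see, in a world dominated by huge, power-hungry AI GPUs, that's for sure.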

At under 200W, the Mac Studio also won't run hot, so you're not sitting in a toasty room next to high-end AI hardware that heats up as AI models with billions of parameters are pumped through it. Impressive to see, Apple... very impressive.