AMD's new Ryzen AI Max+ 395 "Strix Halo" APU has been tested in DeepSeek R1 with some impressive results, with the AI benchmark running up to 3x faster than NVIDIA's discrete GeForce RTX 5080 graphics card.

The major difference between the two is memory: the RTX 5080 has 16GB of VRAM, while the Strix Halo APU has up to 128GB of unified memory at its disposal, of which up to 96GB can be assigned as VRAM. The APU also combines a 16-core, 32-thread Zen 5-based processor with an XDNA 2-based NPU rated at 50 TOPS of AI performance.
AMD ran benchmarks using consumer AI workloads, including LM Studio, a llama.cpp-powered application that is shaping up to be the go-to tool for client LLM workloads, letting users run the latest language models locally without any technical knowledge.


As soon as an LLM needs more than 16GB of VRAM, the Ryzen AI Max+ 395 "Strix Halo" APU and its much larger memory pool come into play: up to 3.05x the performance of a discrete GPU limited to 16GB of VRAM. Even NVIDIA's more expensive GeForce RTX 5090 and its 32GB of VRAM fall well short of the pool of up to 128GB of memory available to the Strix Halo APU.
Key AMD Ryzen AI Max+ 395 Advantages Over Copilot+ Competitors:
Performance
- Up to 2.2x better token throughput compared to Intel Arc 140V
- Up to 4x faster in time to first token for smaller models like Llama 3.2 3b Instruct
- Up to 9.1x faster for 7-8B parameter models
- Up to 12.2x faster than Intel Core Ultra 258V for 14B parameter models
Vision Model Performance
- Up to 7x faster in IBM Granite Vision 3.2 3b
- Up to 4.6x faster in Google Gemma 3 4b
- Up to 6x faster in Google Gemma 3 12b
Memory
- Offers up to 128GB of unified memory vs. the competition's 32GB max
- Can convert up to 96GB to VRAM through Variable Graphics Memory
- Can run larger models like Google Gemma 3 27B Vision that other APUs cannot handle
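
The memory figures above come down to simple arithmetic, which can be sketched roughly as follows. This is only a back-of-the-envelope estimate, not AMD's methodology: it assumes the weights dominate memory use and ignores the KV cache, activations and runtime overhead. The function names and the bits-per-weight values are illustrative assumptions; the article does not say which quantization AMD used.

```python
# Rough estimate of whether a model's weights fit in a given memory pool.
# Assumption: footprint = parameters x bits per weight / 8, nothing else counted.

def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, pool_gb: float) -> bool:
    """True if the weights alone fit in a pool of pool_gb gigabytes."""
    return weight_footprint_gb(params_billion, bits_per_weight) <= pool_gb


# Memory pools quoted in the article, in GB.
POOLS_GB = {
    "GeForce RTX 5080": 16,
    "GeForce RTX 5090": 32,
    "Ryzen AI Max+ 395 (96GB as VRAM)": 96,
}

if __name__ == "__main__":
    # Hypothetical case: a 27B-parameter model at 16 bits per weight.
    size = weight_footprint_gb(27, 16)
    for name, pool in POOLS_GB.items():
        verdict = "fits" if fits(27, 16, pool) else "does not fit"
        print(f"{name}: {size:.1f}GB of weights {verdict} in {pool}GB")
```

Under these assumptions, a 27B-parameter model at 16 bits per weight needs about 54GB for weights alone, which exceeds both 16GB and 32GB but fits in 96GB. At 4 bits per weight the same model drops to about 13.5GB, which is why quantization level matters so much when comparing against a 16GB card.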