NVIDIA's Rubin Ultra reportedly sticking to a dual-die design instead of a four-die plan

Reports suggest NVIDIA is scaling back its ambitious four-die Rubin Ultra design due to packaging limits, sticking with the dual-die approach.

TL;DR: NVIDIA faces manufacturing challenges with its Rubin Ultra GPU due to packaging limits in its planned four-die design. To resolve warping and thermal issues, it may revert to a dual-die setup with a 2+2 board-level arrangement, maintaining performance and memory capacity while easing production and scalability.

NVIDIA is reportedly running into manufacturing challenges with its next-generation Rubin Ultra GPU and is now considering a significant architectural revision. The standard Rubin, built for large-scale AI training, is on track to begin mass shipments this summer. Even before that rollout, reports about the more advanced Rubin Ultra are starting to surface, and its current roadmap may be hitting some real technological walls.

To understand the problem, it helps to know what Rubin Ultra was originally planned to be. NVIDIA introduced a dual-die architecture with the standard Rubin, meaning two silicon chips packaged together into one unit. Rubin Ultra was meant to take that further with a four-die setup, essentially doubling the base design into a much larger package. However, those ambitions may have pushed TSMC's advanced packaging technology past its practical limits.

Reports from Taiwan's Commercial Times suggest that NVIDIA may scale back the Rubin Ultra to a dual-die design, similar to the standard Rubin. The original design reportedly included 16 HBM4 memory stacks, around 1 TB of memory capacity, and CoWoS-L packaging.

In a typical CoWoS package, TSMC combines multiple dies and HBM memory stacks into one unified structure. For Rubin Ultra, NVIDIA planned to use CoWoS-L, but scaling it up to a four-die configuration is reportedly causing warping issues, with thermal and structural stresses bending the package in multiple directions. This prevents the compute dies from maintaining proper contact with the underlying substrate.

To work around this, NVIDIA is expected to shift back to a dual-die configuration and preserve overall compute performance through board-level design instead. Rather than cramming four dies into a single package, Rubin Ultra would use a 2+2 arrangement spread across a rack-level board. The practical upside is that, on paper, performance, HBM capacity, and compute output would remain the same, while the system would be significantly easier to manufacture and scale in a data center environment.

There is another option on the table. NVIDIA could move toward TSMC's CoPoS packaging, which stands for Chip-on-Panel-on-Substrate, a newer approach designed to support larger AI accelerator designs. The catch is that CoPoS isn't expected to reach mass production until late 2028 at the earliest, making it unlikely to have any impact on Rubin Ultra's 2027 target.

Despite the design revision, Rubin Ultra's final performance specifications are expected to remain unchanged. Questions around thermal management and physical rack-level integration remain open as NVIDIA continues pushing the boundaries of what AI GPU hardware can look like.

News Sources: ctee.com.tw and wccftech.com

Tech Reporter

Hassam is a veteran tech journalist and editor with over eight years of experience embedded in the consumer electronics industry. His obsession with hardware began with childhood experiments involving semiconductors, a curiosity that evolved into a career dedicated to deconstructing the complex silicon that powers our world. From benchmarking PC internals to stress-testing flagship CPUs and GPUs, Hassam specializes in translating high-level engineering into deep, unbiased insights for the enthusiast community.
