Micron has unveiled a memory world first: a monolithic 32Gb LPDDR5X die powering a high-capacity 256GB LPDRAM SOCAMM2 module that offers one-third of the power consumption and a smaller footprint than traditional server memory. And with 2TB of LPDRAM per 8-channel CPU, Micron says you're looking at a 2.3x speedup in the all-important 'time to first token' metric.
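
For a sense of how those capacity figures fit together, here is a quick back-of-the-envelope sketch. It assumes one 256GB module per memory channel and ignores any extra dies a module might carry for ECC or sparing; neither detail is confirmed by Micron's announcement, and the function names are purely illustrative.

```python
# Back-of-the-envelope capacity arithmetic for the figures quoted above.
# Assumptions (not from Micron): one module per channel, no spare/ECC dies.

DIE_GBIT = 32      # monolithic LPDDR5X die density, in gigabits
MODULE_GB = 256    # SOCAMM2 module capacity, in gigabytes
CHANNELS = 8       # memory channels per CPU


def dies_per_module(module_gb: int, die_gbit: int) -> int:
    """Number of dies needed to reach the module capacity."""
    die_gb = die_gbit // 8          # 8 bits per byte: 32Gb -> 4GB
    return module_gb // die_gb


def capacity_per_cpu_tb(module_gb: int, modules: int) -> float:
    """Total capacity in terabytes (1TB = 1024GB)."""
    return module_gb * modules / 1024


print(dies_per_module(MODULE_GB, DIE_GBIT))      # 64
print(capacity_per_cpu_tb(MODULE_GB, CHANNELS))  # 2.0
```

Under those assumptions, each module works out to 64 of the new 32Gb dies, and eight modules add up to the 2TB per CPU that Micron quotes.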

Yes, this impressive memory module is designed to serve the AI and data center markets, targeting LLM inference and other workloads where memory capacity, bandwidth, efficiency, and latency influence performance and scalability. SOCAMM2 presents an ideal solution, as its smaller footprint and lower power consumption make it more attractive than traditional server memory like RDIMMs.

Micron says it has been collaborating with NVIDIA on the development of sophisticated memory for AI, which has led to the world's first 256GB LPDRAM SOCAMM2 module. Although GPU VRAM is critical for AI, fast system or server memory is nearly as important, because KV cache offloading moves key/value data out of GPU memory and into lower-cost memory like this.
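
To see why offloading calls for so much capacity, here is a rough sketch of how large a KV cache can get. It uses the standard per-token estimate for a transformer (one key and one value vector per layer). The model dimensions are hypothetical and chosen purely for illustration; they do not come from Micron or NVIDIA.

```python
# Rough KV cache sizing. The model dimensions used below are hypothetical
# and only serve to illustrate the order of magnitude.

def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, bytes_per_value: int = 2) -> int:
    """Bytes of KV cache one token occupies across all layers.

    The leading factor of 2 covers the key and the value vector;
    bytes_per_value=2 corresponds to 16-bit storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


def kv_cache_gib(tokens: int, **model_dims: int) -> float:
    """KV cache size in GiB for a given number of cached tokens."""
    return tokens * kv_cache_bytes_per_token(**model_dims) / 2**30


# Hypothetical model: 80 layers, 8 KV heads, head dimension 128, 16-bit values.
dims = dict(num_layers=80, num_kv_heads=8, head_dim=128)
print(kv_cache_bytes_per_token(**dims))  # 327680 bytes per token
print(kv_cache_gib(131072, **dims))      # 40.0 GiB for one 128K-token context
```

With these illustrative numbers, a single 128K-token context takes about 40GiB, so a handful of concurrent long-context sessions can quickly outgrow GPU memory and spill into system memory.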

"Micron's 256GB SOCAMM2 offering enables the most power-efficient CPU-attached memory solution for both AI and HPC," said Raj Narasimhan, senior vice president and general manager of Micron's Cloud Memory Business Unit. "Today's announcement highlights Micron's technology and packaging advancements to deliver the highest-capacity, lowest-power modular memory solution with the smallest footprint in the industry. Our continued leadership in low-power memory solutions for data center applications has uniquely positioned us to be the first to deliver a 32Gb monolithic LPDRAM die, helping drive industry adoption of more power-efficient, high-capacity system architectures."