Meta has just teased its next-generation MTIA AI chip, an upgrade over the current MTIA v1. The new chip is made on TSMC's newer 5nm process node, while the original MTIA was built on 7nm.
The new Meta Training and Inference Accelerator (MTIA) chip is "fundamentally focused on providing the right balance of compute, memory bandwidth, and memory capacity" for Meta's unique requirements. The best AI GPUs on the planet use HBM memory, with HBM3 found on NVIDIA's Hopper H100 and AMD's Instinct MI300 series AI chips, while Meta instead opts for low-power LPDDR5 DRAM rather than HBM or server-grade DRAM.
The social networking giant's original MTIA chip was its first-generation AI inference accelerator, designed in-house with Meta's AI workloads in mind. The company says its deep learning recommendation models are "improving a variety of experiences across our products".
Meta's long-term goal on its AI inference processor journey is to provide the most efficient architecture for its unique workloads. The company adds that as AI workloads become increasingly important to its products and services, the efficiency of its MTIA chips will improve its ability to provide the best experiences for users across the planet.
Meta explains on its website for MTIA: "This chip's architecture is fundamentally focused on providing the right balance of compute, memory bandwidth, and memory capacity for serving ranking and recommendation models. In inference we need to be able to provide relatively high utilization, even when our batch sizes are relatively low. By focusing on providing outsized SRAM capacity, relative to typical GPUs, we can provide high utilization in cases where batch sizes are limited and provide enough compute when we experience larger amounts of potential concurrent work".
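To make that batch-size point concrete, here is a minimal roofline-style sketch of the trade-off Meta describes. This is not Meta's methodology, and every figure below is an illustrative placeholder: the idea is simply that when a model's weights fit in on-chip SRAM, the chip stays compute-bound even at tiny batch sizes, whereas streaming weights from DRAM per batch starves the compute units until the batch is large enough to amortize the traffic.

```python
# Back-of-the-envelope roofline sketch of the SRAM vs. batch-size
# trade-off described above. All numbers are illustrative placeholders,
# not MTIA specs.

def utilization(batch, weights_mb, act_mb_per_sample, flops_per_sample,
                peak_flops, dram_gbs, sram_mb):
    """Fraction of peak compute achieved on one batch.

    If the model's weights fit in on-chip SRAM, they never touch DRAM;
    otherwise they stream in once per batch, which dominates traffic
    when the batch (and thus per-weight reuse) is small.
    """
    compute_s = batch * flops_per_sample / peak_flops
    dram_mb = batch * act_mb_per_sample        # activations always move
    if weights_mb > sram_mb:                   # weights spill off-chip
        dram_mb += weights_mb
    memory_s = dram_mb * 1e6 / (dram_gbs * 1e9)
    return compute_s / max(compute_s, memory_s)

# Hypothetical 100 MB recommendation model on a 100-TFLOPS part
# with 200 GB/s of off-chip bandwidth.
for sram_mb in (16, 256):                      # modest vs. outsized SRAM
    for batch in (1, 16, 256):
        u = utilization(batch, weights_mb=100, act_mb_per_sample=0.1,
                        flops_per_sample=2e8, peak_flops=100e12,
                        dram_gbs=200, sram_mb=sram_mb)
        print(f"SRAM {sram_mb:>3} MB, batch {batch:>3}: {u:6.1%} utilization")
```

With these placeholder numbers, the small-SRAM configuration only approaches high utilization at large batches, while the outsized-SRAM configuration stays at full utilization even at a batch size of one, which is exactly the low-batch inference regime the quote is concerned with.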
"This accelerator consists of an 8x8 grid of processing elements (PEs). These PEs provide significantly increased dense compute performance (3.5x over MTIA v1) and sparse compute performance (7x improvement). This comes partly from improvements in the architecture associated with pipelining of sparse compute. It also comes from how we feed the PE grid: We have tripled the size of the local PE storage, doubled the on-chip SRAM and increased its bandwidth by 3.5X, and doubled the capacity of LPDDR5".