Facebook, I mean Horizons, I mean LifeLog, I mean Meta have announced that the social networking -- and VR giant -- is putting the finishing touches on the world's fastest AI supercomputer.
The new AI Research SuperCluster (RSC) is already being built and expected to be online by mid-2022, where it is already being used to train huge models in natural language processing (NLP) and computer vision for research -- Meta's aim with RSC -- is that it will one-day train models with trillions and trillions of parameters.
RSC will help super-fuel Meta's AI researchers that will be capable of being fed trillions of examples, where it will use AI to work through hundreds of languages across the planet, seamlessly analyze text, images, and videos together, develop new augmented reality (AR) tools, and so much more.
But man... inside, RSC is a freaking silicon beast.
Meta's first-gen RSC infrastructure was designed in 2017 with 22,000 x NVIDIA V100 Tensor Core GPUs in a single cluster, capable of 35,000 training jobs per day. Meta AI researchers have been using this as a benchmark for performance, reliability, and productivity.
But in early 2020, then-Facebook decided that the "best way to accelerate progress was to design a new computing infrastructure from a clean slate to take advantage of new GPU and network fabric technology. We wanted this infrastructure to be able to train models with more than a trillion parameters on data sets as large as an exabyte - which, to provide a sense of scale, is the equivalent of 36,000 years of high-quality video".
Meta RSC AI Supercomputer specs
- 760 x NVIDIA DGX A100 systems (as compute nodes)
- 6080 x GPUs in total
- 175PB (petabytes) of Pure Storage FlashArray
- 46PB (petabytes) of cache storage in Penguin Computing Altus systems
- 10PB (petabytes) of Pure Storage FlashBlade
(each DGX communicates over NVIDIA Quantum 1600 Gb/s InfiniBand two-level Clos fabric with no oversubscription)
Meta has run some benchmarks on its new RSC -- and no, unfortunately it's not Crysis -- where the new AI supercomputer was compared to Meta's legacy production and research infrastructure. Here, Meta's new RSC was super-speeding computer vision workflows by up to 20x -- running the NVIDIA Collective Communication Library (NCCL) over 9x faster, and RSC also trains large-scale NLP models up to 3x faster.
A model with tens of billions of parameters can now finish in just 3 weeks, compared to 9 weeks with the previous-gen AI supercomputer.
Meta CEO Mark Zuckerberg said: "Meta has developed what we believe is the world's fastest AI supercomputer. We're calling it RSC for AI Research SuperCluster. The experiences we're building for the metaverse require enormous compute power (quintillions of operations / second!) and RSC will enable new AI models that can learn from trillions of examples, understand hundreds of languages, and more. Congrats to the team on building RSC!".
- > NEXT STORY: WarnerMedia re-iterates Gotham Knights, Hogwarts Legacy 2022 launch
- < PREVIOUS STORY: Dying Light 2 will be supported until 2027 with new DLC, expansions