NVIDIA Ampere A100 specs: 54 billion transistors, 40GB HBM2, 7nm TSMC

NVIDIA makes its next-gen A100 GPU official: the world's largest 7nm processor.


GTC 2020: NVIDIA has officially unveiled its first GPU based on the new Ampere architecture, the NVIDIA A100, which is in full production and already shipping to customers worldwide.


NVIDIA's new A100 GPU packs an absolutely insane 54 billion transistors (that's 54,000,000,000), 3rd Gen Tensor Cores, 3rd Gen NVLink and NVSwitch, and much more. The GPU die itself measures a massive 826mm² on TSMC's 7nm node, and it packs 40GB of HBM2 memory from Samsung as well as up to 600GB/sec of bandwidth through NVLink.
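For a sense of scale, those two headline figures imply a transistor density of roughly 65 million transistors per square millimeter, a quick back-of-envelope check from the numbers above:

```python
# Rough transistor-density check for the A100 die, using only the
# two figures NVIDIA quotes: 54 billion transistors and an 826mm² die.
transistors = 54e9
die_area_mm2 = 826
density_m_per_mm2 = transistors / die_area_mm2 / 1e6
print(f"~{density_m_per_mm2:.1f} M transistors/mm^2")  # ~65.4 M/mm^2
```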

The new A100 is being compared against the Tesla V100, which is based on the previous-gen Volta GPU architecture. NVIDIA's current-gen V100 comes in 16GB and 32GB HBM2 options, and it was built on TSMC's 12nm node with just 21.1 billion transistors in comparison.


There are some incredible things going on under the Ampere hood, with NVIDIA claiming its largest generational performance leap yet: up to 20x. The new A100 delivers peak training performance of 312 TFLOPs (FP32-class workloads via the new TF32 format), peak INT8 inference of 1248 TOPs, and 19.5 TFLOPs of FP64 compute for HPC.
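That FP32-class training figure comes from the new TF32 tensor format, which keeps FP32's full 8-bit exponent (so numeric range is unchanged) but carries only 10 mantissa bits instead of 23. A minimal sketch of that precision reduction, using truncation for simplicity (the hardware's actual rounding behavior may differ):

```python
import struct

def to_tf32(x: float) -> float:
    """Reduce an FP32 value to TF32 precision: keep the sign and
    8-bit exponent, truncate the mantissa from 23 bits to 10."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # zero the low 13 of the 23 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(to_tf32(1.0))  # exactly representable, unchanged
print(to_tf32(0.1))  # lands slightly below 0.1 at TF32 precision
```

Because the exponent is untouched, TF32 accepts the same value range as FP32, which is why NVIDIA says existing FP32 training code needs no changes.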


NVIDIA founder and CEO Jensen Huang explains: "The powerful trends of cloud computing and AI are driving a tectonic shift in data center designs so that what was once a sea of CPU-only servers is now GPU-accelerated computing. NVIDIA A100 GPU is a 20x AI performance leap and an end-to-end machine learning accelerator -- from data analytics to training to inference. For the first time, scale-up and scale-out workloads can be accelerated on one platform. NVIDIA A100 will simultaneously boost throughput and drive down the cost of data centers".

NVIDIA describes its new Ampere A100 GPU as a "technical design breakthrough fueled by five key innovations". These innovations include:

  • Ampere​ ​architecture​: At the heart of A100 is the NVIDIA Ampere GPU architecture, which contains more than 54 billion transistors, making it the world's largest 7-nanometer processor.
  • Third-generation Tensor Cores with TF32​: NVIDIA's widely adopted Tensor Cores are now more flexible, faster and easier to use. Their expanded capabilities include new TF32 for AI​, which allows for up to 20x the AI performance of FP32 precision, without any code changes. In addition, ​Tensor Cores​ now support FP64, delivering up to 2.5x more compute than the previous generation for HPC applications.
  • Multi-instance GPU​: MIG, a new technical feature, enables a single A100 GPU to be partitioned into as many as seven separate GPUs so it can deliver varying degrees of compute for jobs of different sizes, providing optimal utilization and maximizing return on investment.
  • Third-generation NVIDIA NVLink: Doubles the high-speed connectivity between GPUs to provide efficient performance scaling in a server.
  • Structural sparsity: This new efficiency technique harnesses the inherently sparse nature of AI math to double performance.
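The structural sparsity item refers to Ampere's 2:4 fine-grained pattern: in every group of four weights, at most two are non-zero, which lets the tensor cores skip the zeros and double math throughput. A minimal NumPy sketch of pruning a weight vector into that pattern (illustrative only, not NVIDIA's actual tooling):

```python
import numpy as np

def prune_2_to_4(w: np.ndarray) -> np.ndarray:
    """Enforce the 2:4 structured-sparsity pattern: in every
    contiguous group of 4 weights, keep the 2 largest magnitudes
    and zero the other 2. Length must be a multiple of 4."""
    w = w.reshape(-1, 4).copy()
    # indices of the two smallest-magnitude entries in each group
    drop = np.argsort(np.abs(w), axis=1)[:, :2]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(-1)

w = np.array([0.9, -0.1, 0.05, 0.7, 0.2, -0.8, 0.3, 0.0])
print(prune_2_to_4(w))  # exactly two zeros in each group of four
```

In practice networks are pruned to this pattern and then fine-tuned; NVIDIA's claim is that the pattern is regular enough for the hardware to exploit while preserving accuracy.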

NVIDIA Ampere A100 specs:

  • Transistors: 54 billion
  • CUDA cores: 6912
  • Double-precision performance: 9.7 TFLOPs (19.5 TFLOPs via Tensor Cores)
  • Single-precision performance: 19.5 TFLOPs
  • Tensor performance (TF32): 156 TFLOPs (312 TFLOPs with sparsity)
  • Node: 7nm TSMC
  • Memory: 40GB HBM2
  • Memory bus: 5120-bit
  • Memory bandwidth: 1.6TB/sec
  • Tensor Cores: 432 (3rd Gen)
  • Interface: SXM4 (PCIe 4.0)
  • TDP: 400W
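The bandwidth figure follows directly from the bus width: the 40GB A100 runs five active HBM2 stacks (a 5120-bit effective bus) at roughly 2.43 Gb/s per pin, my assumption based on the published 1215MHz double-data-rate memory clock:

```python
# Back-of-envelope check of the A100's quoted memory bandwidth.
# Assumptions: 5120-bit bus (five active HBM2 stacks) and a
# 2.43 Gb/s-per-pin data rate (1215 MHz, double data rate).
bus_bits = 5120
pin_rate_gbps = 2.43
bandwidth_gbs = bus_bits * pin_rate_gbps / 8
print(f"~{bandwidth_gbs:.0f} GB/s")  # ~1555 GB/s, i.e. the ~1.6TB/sec spec
```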

NVIDIA Volta V100 (SXM2) specs:

  • Transistors: 21.1 billion
  • CUDA cores: 5120
  • Double-precision performance: 7.8 TFLOPs
  • Single-precision performance: 15.7 TFLOPs
  • Tensor Performance: 125 TFLOPs
  • Node: 12nm TSMC
  • Memory: 16/32GB HBM2
  • Memory bus: 4096-bit
  • Memory bandwidth: 900GB/sec
  • Tensor Cores: 640 (1st Gen)
  • Interface: PCIe 3.0 x16
  • TDP: 250-300W

Anthony joined the TweakTown team in 2010 and has since reviewed hundreds of graphics cards. A long-time PC enthusiast with a passionate hatred for games built around consoles, he has been playing FPS titles since the pre-Quake days, when you were insulted if you used a mouse to aim, and has been addicted to gaming and hardware ever since. Ten years working in IT retail gave him great experience with custom-built PCs. His addiction to GPU tech is unwavering, and he has recently taken a keen interest in artificial intelligence (AI) hardware.
