At GTC 2026, NVIDIA expanded its server lineup with the single-slot RTX PRO 4500 Blackwell Server Edition GPU. NVIDIA describes it as an "energy-efficient multi-workload accelerator designed to deliver breakthrough performance across a broad range of enterprise workloads," which includes AI inference as well as other demanding computing tasks such as video processing.

The card has 10,496 CUDA Cores, slightly fewer than the gaming-class GeForce RTX 5080, which uses the same GB203 chip. It pairs them with 32GB of GDDR7 memory on a 256-bit interface for 800 GB/s of memory bandwidth. On power, it lives up to its efficient label: the single-slot, passively cooled GPU draws up to 165W.

Naturally, the single-slot form factor is designed for servers and data center racks with multi-GPU setups, where a dozen GPUs could be installed in a single system. The RTX PRO 4500 Blackwell Server Edition GPU is similar, spec-wise, to the dual-slot, actively cooled RTX PRO 4500 Blackwell, albeit with a lower power rating and slightly reduced memory bandwidth.
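
To put the memory and multi-GPU numbers in context, here is a minimal back-of-the-envelope sketch using only the figures quoted above. The per-pin data rate is not stated by NVIDIA in this announcement; it is derived from the bandwidth and bus width. The 12-GPU chassis is the hypothetical "dozen GPUs" case, and the power total covers the GPUs only, not the rest of the server. The function names are illustrative.

```python
# Figures quoted in the article for the RTX PRO 4500 Blackwell Server Edition.
BUS_WIDTH_BITS = 256
BANDWIDTH_GB_S = 800
BOARD_POWER_W = 165
MEMORY_GB = 32


def implied_pin_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbps) implied by total bandwidth and bus width.

    bandwidth (GB/s) = bus width (bits) / 8 * per-pin rate (Gbps)
    """
    bytes_per_transfer = bus_width_bits / 8
    return bandwidth_gb_s / bytes_per_transfer


def chassis_totals(gpu_count: int) -> dict:
    """Aggregate GPU memory and GPU-only power for a multi-GPU server."""
    return {
        "memory_gb": gpu_count * MEMORY_GB,
        "gpu_power_w": gpu_count * BOARD_POWER_W,
    }


if __name__ == "__main__":
    print(implied_pin_rate_gbps(BANDWIDTH_GB_S, BUS_WIDTH_BITS))  # 25.0
    print(chassis_totals(12))  # {'memory_gb': 384, 'gpu_power_w': 1980}
```

On these assumptions, a fully populated 12-GPU system would offer 384 GB of GDDR7 for just under 2 kW of GPU power.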

On performance, NVIDIA compares the card against CPU-only systems, claiming up to 50X higher performance for vector databases that use NVIDIA's cuVS vector search library. And with fifth-generation Tensor Cores and NVIDIA's advanced video processing, the company says vision-based applications will see an even larger gain over CPU-only systems.

Here's a look at the full specs, including various FP4, FP8, and other AI performance metrics.

NVIDIA RTX PRO 4500 Blackwell Server Edition Specs

| Item | Details |
|---|---|
| GPU Architecture | NVIDIA Blackwell Architecture |
| CUDA parallel processing cores | 10,496 |
| NVIDIA RT Cores | 82 |
| FP4 Tensor Core | 1.6 PFLOPS |
| FP8 Tensor Core | 811 TFLOPS |
| FP16 / BF16 Tensor Core | 406 TFLOPS |
| TF32 Tensor Core | 203 TFLOPS |
| Single‑precision performance (FP32) | 51 TFLOPS |
| Peak RT Core performance | 154 TFLOPS |
| GPU memory | 32 GB GDDR7 |
| Memory interface | 256‑bit |
| Memory bandwidth | 800 GB/s |
| Power consumption | 165 W |
| Multi‑Instance GPU | Up to 2 instances at 16 GB each |
| NVENC / NVDEC | 3x, 3x |
| Confidential compute | Supported |
| Interconnect | PCI Express 5.0 x16 |
| Form factor | Single‑slot, FHFL (4.4" H x 10.5" L) |
| Thermal solution | Passive |
| Power connector | 1x PCIe CEM5 16‑pin |
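
The Tensor Core rows in the table follow a simple pattern that is easy to check. The sketch below uses only the peak figures listed above, with FP4's 1.6 PFLOPS written as 1,600 TFLOPS, and computes how throughput scales at each precision step and how many peak TFLOPS the card offers per watt. These are ratios of vendor peak numbers, not measured results, and the function names are illustrative.

```python
# Peak figures from the spec table, in TFLOPS (FP4 is listed as 1.6 PFLOPS).
PEAK_TFLOPS = {
    "FP4": 1600,
    "FP8": 811,
    "FP16/BF16": 406,
    "TF32": 203,
    "FP32": 51,
}
BOARD_POWER_W = 165

# Tensor Core formats ordered from widest to narrowest.
TENSOR_FORMATS = ["TF32", "FP16/BF16", "FP8", "FP4"]


def scaling_ratios(peaks: dict) -> dict:
    """Throughput gain at each step down in Tensor Core precision."""
    return {
        f"{wide} -> {narrow}": round(peaks[narrow] / peaks[wide], 2)
        for wide, narrow in zip(TENSOR_FORMATS, TENSOR_FORMATS[1:])
    }


def tflops_per_watt(peaks: dict, power_w: float) -> dict:
    """Peak TFLOPS per watt of rated board power for each format."""
    return {name: round(value / power_w, 2) for name, value in peaks.items()}


if __name__ == "__main__":
    for step, ratio in scaling_ratios(PEAK_TFLOPS).items():
        print(f"{step}: {ratio}x")
    for name, value in tflops_per_watt(PEAK_TFLOPS, BOARD_POWER_W).items():
        print(f"{name}: {value} TFLOPS/W")
```

Each step down in precision roughly doubles the listed peak throughput, so at the rated 165 W the FP4 figure works out to about 9.7 peak TFLOPS per watt versus about 0.31 for FP32.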




