NVIDIA has just announced the Quadro GP100 which, as the name implies, is a workstation variant of the powerful GP100 chip, complete with double-rate FP16 and half-rate FP64 relative to FP32. The new GPU targets the very high end of the workstation crowd: users who depend on the absolute maximum compute capability available.
Quadro GP100 rounds out the very top-end
Like the server-oriented Tesla P100, this card is meant for compute-heavy tasks. NVIDIA intends it for individual users with a single large problem to solve, such as running CAD simulations as quickly as possible and, ideally, entirely in memory, though it’s limited to 16GB of HBM2. Unlike the server variant, the Quadro GP100 actually has graphics capability, meaning it can render images and send them to a monitor.
NVIDIA Tesla P100 Family

Specification | Quadro GP100 (16GB) | Tesla P100 (16GB) | Tesla P100 (DGX-1) | Quadro P6000 |
---|---|---|---|---|
CUDA Cores | 3584 | 3584 | 3584 | 3840 |
Texture Units | 224 | 0 | 0 | 240 |
ROPs | ~128 | 0 | 0 | 96 |
Boost Clock | ~1430MHz | 1300MHz | 1480MHz | ~1560MHz |
Memory Bus Width | 4096-bit | 4096-bit | 4096-bit | 384-bit |
Memory Clock | 1.4Gbps HBM2 | 1.4Gbps HBM2 | 1.4Gbps HBM2 | 9Gbps GDDR5X |
VRAM | 16GB | 16GB | 16GB | 24GB |
Double Precision Rate | 1/2 FP32 | 1/2 FP32 | 1/2 FP32 | 1/32 FP32 |
Half Precision | ~20 TFLOPS | 18.7 TFLOPS | 21.2 TFLOPS | ~11.5 TFLOPS |
Single Precision | ~9.8 TFLOPS | 9.3 TFLOPS | 10.6 TFLOPS | 11.5 TFLOPS |
Double Precision | ~5 TFLOPS | 4.7 TFLOPS | 5.3 TFLOPS | 375 GFLOPS |
TDP | 235W | 250W | 300W | 250W |
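
The throughput rows in the table follow from the core counts and boost clocks. As a rough sketch, assuming the usual convention that each CUDA core retires one fused multiply-add (counted as two floating-point operations) per clock, and using the half-rate FP64 and double-rate FP16 ratios quoted above, the Tesla P100 (16GB) figures can be approximated like this. The function name `peak_tflops` is illustrative only.

```python
def peak_tflops(cuda_cores: int, boost_mhz: float, rate: float = 1.0) -> float:
    """Theoretical peak throughput in TFLOPS.

    `rate` is the throughput relative to FP32 (0.5 for half-rate FP64,
    2.0 for double-rate FP16). Assumes one FMA per core per clock.
    """
    flops_per_core_per_clock = 2  # a fused multiply-add counts as two operations
    return cuda_cores * flops_per_core_per_clock * boost_mhz * 1e6 * rate / 1e12


# Tesla P100 (16GB) from the table: 3584 CUDA cores at a 1300MHz boost clock.
fp32 = peak_tflops(3584, 1300)
fp64 = peak_tflops(3584, 1300, rate=0.5)
fp16 = peak_tflops(3584, 1300, rate=2.0)
print(f"FP32 {fp32:.1f}  FP64 {fp64:.1f}  FP16 {fp16:.1f} TFLOPS")
```

These are theoretical peaks at the boost clock; the approximate ("~") entries in the table may have been derived from slightly different clocks, so small differences are to be expected.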
The VRAM, though immensely fast HBM2, is limited in how much can physically fit on board. That said, NVIDIA is also introducing an NVLink connector with the Quadro GP100 so that its ultrafast fabric is available on the workstation. GPUs connected over the fabric share their VRAM in one large pool. So far only a 2-way NVLink configuration is supported. Though expensive, that can help alleviate VRAM size and bandwidth issues. NVIDIA hasn’t said exactly how many ROPs the Quadro GP100 has. The workloads this card targets are sometimes rather reliant on ROPs, so we think 128 is a reasonable guess. If so, the Quadro GP100 would be around 20% faster than the P6000 (and Titan X Pascal) at graphics-related duties, games included.
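
One way to arrive at a figure of roughly 20% is to compare theoretical pixel fill rates. The sketch below assumes the speculative 128 ROP count, the approximate boost clocks from the table, and one pixel written per ROP per clock; none of these are confirmed by NVIDIA, and `pixel_fill_gpix` is a hypothetical helper name.

```python
def pixel_fill_gpix(rops: int, boost_mhz: float) -> float:
    """Theoretical pixel fill rate in gigapixels per second.

    Assumes each ROP writes one pixel per clock at the boost clock.
    """
    return rops * boost_mhz / 1000.0


gp100 = pixel_fill_gpix(128, 1430)  # Quadro GP100, assuming 128 ROPs
p6000 = pixel_fill_gpix(96, 1560)   # Quadro P6000
advantage = gp100 / p6000 - 1.0

print(f"GP100 {gp100:.1f} GP/s, P6000 {p6000:.1f} GP/s, advantage {advantage:.0%}")
```

Fill rate is only one factor in graphics performance, so this is an upper-level estimate and not a benchmark prediction.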
The Quadro GP100 is due for release in March 2017 with an expected price somewhere in the neighborhood of $5000. So don’t expect to see it actually used for gaming, except by the very hardcore with more money than sense.