AMD today announced a new set of GPU accelerators designed especially for deep learning. Radeon Instinct is essentially their answer to NVIDIA's DGX line of GPU compute servers. This officially brings Polaris into the server world. Perhaps more importantly, it gives us our first glimpse at AMD's upcoming Vega, which is the star of this show.
Vega makes a debut with AMD’s Radeon Instinct
Deep learning is quickly becoming a large and important field. AMD hasn't paid full attention to the segment, or at least hasn't appeared to be doing as much here as they could. As a smaller slice of their business, that does make sense. Radeon Instinct aims to sharpen their focus on the deep learning and neural networking market.
AMD has been very open about ensuring GPU support for the latest algorithms and backend software. They've created a large presence in the open-source community and have solutions to integrate into the latest open-source DNNs, or to help you create your own. With the software side sorted and well supported, they're now focusing more effort on ensuring the proper hardware is available to compete with their largest competitors.
It's taken some time for AMD to truly realize the potential that DNNs hold, and Radeon Instinct is their answer. So what is it? What it isn't is a cobbling together and rebranding of already familiar GPUs into a different package. This is, in effect, a complete, optimized hardware and software stack to enable more efficient deep learning. They're leveraging their newly updated ROCm (Radeon Open Compute Platform) in a whole-package approach.
The GPUs themselves are indeed a series of repackaged FirePro S cards, so you'll be familiar with two of the new cards being announced today. The third is new and very exciting. The MI6 and MI8 are Polaris 10 and Fiji based, respectively. The MI25 is based on Vega, and seems to be quite the powerhouse of a GPU. Three different architectures give three different options depending on your needs and budget.
The Polaris-based card, the MI6, will pair 16GB of GDDR5 with a full-fledged Polaris 10 die, good for 5.7 TFLOPS of both single- and half-precision compute. The Fiji-based MI8 will be limited to 4GB of HBM1, though it will deliver 8.2 TFLOPS of single- and half-precision compute power, the same as a full-fledged Fiji die.
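Those peak figures follow from the usual GCN arithmetic: each stream processor executes one fused multiply-add (two FLOPs) per clock. A quick sketch, using the known full-die shader counts; the clock speeds here are back-solved estimates on our part, not AMD-confirmed specs:

```python
def peak_tflops(shaders, clock_ghz, flops_per_clock=2):
    """Theoretical peak throughput: shaders x FLOPs-per-clock x clock speed."""
    return shaders * flops_per_clock * clock_ghz / 1000.0

# Full Polaris 10 (2304 shaders) at an assumed ~1237 MHz
mi6 = peak_tflops(2304, 1.237)
# Full Fiji (4096 shaders) at its familiar 1000 MHz
mi8 = peak_tflops(4096, 1.0)

print(f"MI6: {mi6:.1f} TFLOPS, MI8: {mi8:.1f} TFLOPS")
```

Neither architecture has double-rate half precision, which is why the FP16 and FP32 numbers are identical on these two cards.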
The Vega-based MI25 is certainly the head-turner of the group. We're unsure how large the VRAM will be, though it's said to be high-bandwidth and large, with a very fast controller, which points towards HBM2. It's estimated to deliver around 12.5 TFLOPS of FP32 and 25 TFLOPS of FP16 compute power, which is quite high. This could mean that, similar to NVIDIA's Tesla P100, this particular incarnation of Vega is not representative of the consumer variant and is built specifically to be compute-oriented. We likely won't see something quite like this at CES for consumers and gamers, though the PlayStation 4 Pro does support FP16 operations, so double-rate FP16 may yet show up in consumer parts after all.
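The 2:1 ratio between those two estimates is the interesting part. It suggests packed math: two FP16 operations executed in each FP32 lane per clock, doubling half-precision throughput, rather than the 1:1 ratio of Polaris and Fiji. A minimal sketch of that relationship, treating both figures as the estimates they are:

```python
# Estimated MI25 peaks from the announcement; neither is a confirmed AMD spec.
fp32_tflops = 12.5
fp16_ratio = 2  # packed math: two FP16 ops per FP32 lane per clock

fp16_tflops = fp32_tflops * fp16_ratio
print(f"Estimated FP16 peak: {fp16_tflops:.1f} TFLOPS")
```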
Instinct is an apt name for what they're marketing here. These cards, the MI25 in particular, will be used to train programs to have a sort of instinct for the requests that we humans make.
To that end, the entire software package will be available to help once these are released. MIOpen, integrated into Caffe, TensorFlow and the like, should offer a significant advantage over their previous offerings. They'll also have the Instinct software stack, which takes better advantage of the hardware at a more fine-grained level.
AMD is also looking to combine their Instinct GPUs with Zen's Naples platform to allow for a more cohesive heterogeneous platform. Remember, Zen offers 64 PCIe lanes per CPU, a significant advantage over the 40 lanes from Intel's current top Xeon offering. For the moment, a few OEMs are offering complete Instinct packages. SuperMicro and Inventec are offering chassis capable of holding up to 120 Instinct GPUs, which could amount to around 3 PFLOPS of half-precision compute power.
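That rack-level figure lines up as half-precision throughput. A back-of-envelope check, assuming every one of the 120 slots holds an MI25 at its estimated 25 TFLOPS FP16 peak:

```python
gpus_per_rack = 120          # quoted SuperMicro/Inventec chassis capacity
fp16_tflops_per_gpu = 25.0   # estimated MI25 half-precision peak

rack_pflops = gpus_per_rack * fp16_tflops_per_gpu / 1000.0
print(f"{rack_pflops:.0f} PFLOPS FP16 per rack")
```

With the 12.5 TFLOPS FP32 estimate instead, the same rack would land around 1.5 PFLOPS of single-precision compute.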
What does this mean for the future? Hopefully it introduces more competition to fuel innovation in GPU technologies, and injects more enthusiasm into the open-source software stacks that power these machines. A nice side effect of competition could also be lower prices on both sides, or all three, as Intel makes its Xeon Phi accelerators more widely available. AMD has a lot of ground to cover to be competitive in the DNN market, though they've made great strides on the software front. Now Vega seems able to pick up the slack on the hardware side.
Radeon Instinct could, if developers are willing to switch from whatever platform they're currently using, be a very big boon to both AMD and us: better applications, better AI and better DNNs, plus better technology as a result of growth in this burgeoning field.