This is certainly the age of AI, and Google is now turning to custom silicon to accelerate its machine learning workloads. Pascal was just revealed to be an AI powerhouse, but even it doesn't look very competitive next to the custom chip Google has created: the Tensor Processing Unit.
Google’s TPU AI co-processor could potentially be many hundreds of times faster than the modern competition
And that’s precisely what they want. Google is taking a targeted approach to AI that could make machine learning an almost real-time activity. The Tensor Processing Unit is built on an undisclosed process and tailored specifically to how the TensorFlow machine learning library works. The chip is small and focused, so the pure FP16 workloads it handles can run much faster. Google says it’s already improving the relevance of search results as well as Street View. Apparently the AlphaGo AI that beat the human world champion was powered entirely by these TPU processors.
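Google hasn't published the TPU's internals, but the appeal of FP16 itself is easy to demonstrate: half-precision values trade numeric precision and range for half the memory footprint of FP32, which doubles how many values fit in each memory transfer. A minimal sketch using NumPy's `float16` type (NumPy here is just an illustration; it is not how TensorFlow or the TPU actually execute):

```python
import numpy as np

# FP16 keeps only ~11 bits of mantissa, so small increments near 1.0
# are rounded away, while FP32 still resolves them.
a32 = np.float32(1.0) + np.float32(0.0001)   # representable in FP32
a16 = np.float16(1.0) + np.float16(0.0001)   # below FP16's spacing near 1.0

print(a32)  # 1.0001
print(a16)  # 1.0 -- the increment is lost to rounding

# The payoff: each FP16 value is 2 bytes instead of 4, so twice as many
# values move per memory transaction -- the bandwidth win a focused
# low-precision chip is built around.
x = np.ones(1024, dtype=np.float16)
print(x.nbytes)  # 2048 bytes, vs 4096 for the same array in float32
```

Neural network inference tolerates this rounding well, which is why purpose-built inference hardware leans on low precision.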
Developing an ASIC for machine learning is the logical next step in the search for faster and better AI. With the right low-latency interconnect, be it 10 GigE, InfiniBand, or a PCIe-based connection, it could potentially deliver throughput that far exceeds even NVIDIA’s DGX-1 deep learning platform, at far lower power consumption.
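The interconnect choice matters because the nominal bandwidths of those options differ by an order of magnitude. A back-of-envelope comparison (nominal peak figures for the link types named above; real-world throughput is lower, and nothing here reflects measured TPU behaviour):

```python
GB = 1e9  # bytes

# Nominal one-direction bandwidth per link, in GB/s.
links_gb_per_s = {
    "10 GigE":        10e9 / 8 / GB,    # 10 Gbit/s  -> 1.25 GB/s
    "InfiniBand EDR": 100e9 / 8 / GB,   # 100 Gbit/s -> 12.5 GB/s
    "PCIe 3.0 x16":   16 * 985e6 / GB,  # ~985 MB/s usable per lane -> ~15.8 GB/s
}

for name, bw in sorted(links_gb_per_s.items(), key=lambda kv: kv[1]):
    print(f"{name:>15}: ~{bw:5.2f} GB/s")
```

If a co-processor's working set fits on-chip, even the slowest of these links may suffice; if not, the host link quickly becomes the bottleneck, which is one reason the "right" connector depends entirely on the workload.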
That said, this could pose a grave threat to NVIDIA’s plans to saturate the deep learning market. The Tesla P100 is a very large chip, with 15.3 billion transistors in a 610 mm² die. Those transistors, while they do provide the FP16 capability needed for competent AI calculations, are not nearly so targeted. They remain general-purpose in design, and so are unlikely to be as capable as silicon built specifically for AI. Yes, NVLink, HBM2, and the broader improvements in Pascal make it a very competitive and impressive design, but it still cannot match a device that is literally purpose-built rather than general-purpose. NVIDIA may have misstepped here if Google and its partners are already looking elsewhere for better performance per watt and per dollar.