Intel's new Deep Learning Instruction Set
Many deep learning workloads are dominated by matrix multiplication, and Intel has found a way to use INT8 arithmetic to deliver more power-efficient and higher-throughput deep learning capabilities. Intel has been trying to break into the AI market as NVIDIA expands into it as well, so these technologies are foundational to Intel's future in the datacenter and AI markets.
Vector-based matrix multiplication is traditionally done with FMA floating-point operations, but with AVX-512 VNNI Intel is able to use INT8 and deliver performance on par with single/double-precision floating-point operations.
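To make the idea concrete, here is a minimal sketch (our own illustration, not Intel's implementation) of how an INT8 dot product with a 32-bit accumulator can approximate an FP32 result once the inputs are quantized. The `scale` value and the helper names are assumptions chosen for this toy example.

```python
# Illustrative sketch: an INT8 dot product approximating an FP32 one.

def quantize(values, scale):
    # Map floats to signed 8-bit integers, clamping to [-128, 127].
    return [max(-128, min(127, round(v / scale))) for v in values]

def int8_dot(a_q, b_q):
    # Each int8 * int8 product fits in 16 bits; the running sum is
    # kept in a 32-bit-sized accumulator, as VNNI hardware does.
    acc = 0
    for x, y in zip(a_q, b_q):
        acc += x * y
    return acc

a = [0.5, -1.25, 2.0, 0.75]
b = [1.0, 0.5, -0.25, 2.0]
scale = 0.02  # assumed per-tensor scale picked for this toy value range

a_q, b_q = quantize(a, scale), quantize(b, scale)
approx = int8_dot(a_q, b_q) * scale * scale  # dequantize the result
exact = sum(x * y for x, y in zip(a, b))
print(approx, exact)
```

The quantization error is small relative to the exact FP32 dot product, which is why INT8 inference can trade a little precision for much higher throughput.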
We should mention that VNNI is actually implemented in hardware; it's not a software trick, but rather dedicated execution hardware was added, just as Intel does for AVX. This hardware collapses what previously took three instructions (and three cycles) into a single instruction taking just one cycle.
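As a toy model of what that single fused instruction computes, the sketch below mimics one 32-bit lane of VNNI's VPDPBUSD, which replaces the older three-instruction sequence (VPMADDUBSW, VPMADDWD, VPADDD): four unsigned 8-bit values are multiplied by four signed 8-bit values, and the products are accumulated into a wrapping 32-bit register. The function name and the wrap behavior shown are our simplification, not production code.

```python
# Toy model of one 32-bit lane of VPDPBUSD: 4 x (u8 * s8) products
# summed into an int32 accumulator, replacing a 3-instruction sequence.

def vpdpbusd_lane(acc, a_bytes, b_bytes):
    """acc: int32 accumulator; a_bytes: four unsigned 8-bit values;
    b_bytes: four signed 8-bit values."""
    for a, b in zip(a_bytes, b_bytes):
        assert 0 <= a <= 255 and -128 <= b <= 127
        acc += a * b
    # Wrap to signed 32-bit, as a hardware register would.
    acc &= 0xFFFFFFFF
    return acc - 0x100000000 if acc >= 0x80000000 else acc

# One lane: dot product of four byte pairs added to an existing accumulator.
result = vpdpbusd_lane(10, [1, 2, 3, 4], [5, 6, 7, -8])
print(result)  # 10 + (5 + 12 + 21 - 32) = 16
```

Doing all of this in one instruction is where the three-cycles-to-one reduction comes from.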
Here we can see how many MACs per cycle are achieved in different scenarios, and Intel also took note of power usage. VNNI delivers lower power consumption at higher performance than FP32, and a larger performance boost than FP32 at the same power level.
Other scenarios show that L2 cache miss rates decrease in certain cases, which is great for performance.
We also see that pressure on memory bandwidth drops in certain bandwidth-constrained cases.