Intel's new Deep Learning Instruction Set
Many deep learning applications lean heavily on matrix multiplication, and Intel has found a way to use INT8 arithmetic to deliver more power-efficient and higher-throughput deep learning performance. Intel has been trying to break into the AI market while NVIDIA expands its own footprint there, so these technologies are foundational to Intel's future in the datacenter and AI markets.
Vector-based matrix multiplication is traditionally done with floating-point FMA operations, but with AVX-512 VNNI, Intel is also able to use INT8 and deliver throughput on par with single- and double-precision floating-point operations.
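As a rough illustration (this is our own sketch, not Intel's code), here is what an INT8 dot product looks like using the VNNI compiler intrinsic. The function name and loop structure are hypothetical; it assumes a CPU with AVX512_VNNI and compilation with flags like -mavx512f -mavx512bw -mavx512vnni:

```c
#include <immintrin.h>
#include <stdint.h>
#include <stddef.h>

/* Minimal sketch: dot product of unsigned-by-signed INT8 vectors using
 * the AVX-512 VNNI VPDPBUSD instruction via its intrinsic.
 * Tail elements beyond a multiple of 64 are ignored in this sketch. */
int32_t dot_int8_vnni(const uint8_t *a, const int8_t *b, size_t n)
{
    __m512i acc = _mm512_setzero_si512();
    for (size_t i = 0; i + 64 <= n; i += 64) {
        __m512i va = _mm512_loadu_si512((const void *)(a + i));
        __m512i vb = _mm512_loadu_si512((const void *)(b + i));
        /* VPDPBUSD: multiplies 64 u8*s8 pairs, sums each group of four
         * products into a 32-bit lane, and adds into the accumulator. */
        acc = _mm512_dpbusd_epi32(acc, va, vb);
    }
    return _mm512_reduce_add_epi32(acc); /* horizontal sum of 16 lanes */
}
```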
We should mention that VNNI is implemented in hardware; it's not some software trick, but dedicated silicon, just as Intel adds for AVX. This hardware fuses what used to take three instructions, and three cycles, into a single-cycle operation.
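To make the fusion concrete, here is a hedged, illustrative comparison (function names are ours) of the pre-VNNI three-instruction AVX-512 sequence against the single fused instruction:

```c
#include <immintrin.h>

/* Pre-VNNI: an INT8 multiply-accumulate takes three AVX-512 instructions.
 * Note: VPMADDUBSW saturates the intermediate 16-bit sums, so results
 * can differ from the fused path at extreme input values. */
__m512i mac_int8_legacy(__m512i acc, __m512i a_u8, __m512i b_s8)
{
    const __m512i ones = _mm512_set1_epi16(1);
    __m512i t16 = _mm512_maddubs_epi16(a_u8, b_s8); /* VPMADDUBSW: u8*s8 -> s16 pairs */
    __m512i t32 = _mm512_madd_epi16(t16, ones);     /* VPMADDWD: widen s16 pairs -> s32 */
    return _mm512_add_epi32(acc, t32);              /* VPADDD: accumulate */
}

/* With VNNI: the same multiply-accumulate is one instruction. */
__m512i mac_int8_vnni(__m512i acc, __m512i a_u8, __m512i b_s8)
{
    return _mm512_dpbusd_epi32(acc, a_u8, b_s8);    /* VPDPBUSD: all three fused */
}
```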
Here we can see how many MACs per cycle each scenario achieves, and Intel also took note of power usage. INT8 shows lower power consumption while outperforming FP32, and a larger performance uplift than FP32 at the same power.
Other scenarios show that L2 cache miss rates decrease in certain cases, which is great for performance, and memory bandwidth pressure drops in certain cases as well. That follows from the data type itself: an INT8 value occupies one byte versus four for FP32, so the same model moves a quarter of the data through the caches and memory bus.
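A quick back-of-the-envelope sketch makes the footprint difference clear. The tensor size here is illustrative, not Intel's benchmark configuration:

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Illustrative only: the same element count costs 4x the bytes in FP32
 * as in INT8, which is why cache and bandwidth pressure eases. */
int main(void)
{
    size_t elems = 1024 * 1024;                   /* 1M-element tensor, for example */
    size_t fp32_bytes = elems * sizeof(float);    /* 4 MiB */
    size_t int8_bytes = elems * sizeof(int8_t);   /* 1 MiB */
    printf("FP32: %zu bytes, INT8: %zu bytes (%.0fx smaller)\n",
           fp32_bytes, int8_bytes, (double)fp32_bytes / int8_bytes);
    return 0;
}
```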