Intel Skylake Microarchitecture - High Level Info from IDF 2015
We attended IDF 2015 last week and learned more details about Intel's latest microarchitecture named Skylake. Here are the details for you.
By: Steven Bassiri | Intel CPUs in CPUs, Chipsets & SoCs | Posted: Aug 24, 2015 1:15 pm

Overall Core Microarchitecture Improvements

 

At IDF 2015 in San Francisco last week, Intel unveiled some high-level details about its latest microarchitecture, Skylake. This article covers many of the architectural improvements, but there are many more that cannot be disclosed at this time. I will go over the Intel presentation deck from the Skylake technical sessions on the core microarchitecture and on the improvements to power delivery and power savings. Some high-level changes to the eDRAM are also covered, but for the most part the article focuses on the core rather than the graphics.

 

While Haswell and Broadwell covered a very broad range of configurations across different types of SKUs, Skylake takes it even further with a broader TDP range and a wide range of die sizes. Intel also made major power improvements through architectural tweaks rather than by simply throttling back frequency and performance. Intel made a point of mentioning that the details given here apply to the client side only; the server side could be totally different, and no information was disclosed about the server microarchitecture. The information in this article spans everything from mobile to desktop, so many of the performance improvement vectors also target battery life improvements and form factor reduction.

 

There are some things worth pointing out in this high-level diagram. For starters, GT3 and GT4 graphics packages (Iris graphics) will both have eDRAM, so we should see eDRAM expand across more SKUs. Some of the SoC versions (most likely mobile-bound chips) will provide an integrated camera ISP (image signal processor) to help camera performance on those devices. The audio DSP (digital signal processor) has also been improved, but so far we haven't been given many details on that. At the top level, Intel has produced a wider core with improved IPC and greatly improved power efficiency. To support all of these improvements, the architects also reworked the ring bus and LLC for higher throughput. Intel also made some security enhancements with new extensions.

 

Intel focused heavily on the front end of the CPU. With Skylake, Intel has improved branch prediction, increased the number of execution units, widened instruction windows, improved load and store bandwidth, improved page miss handling, and improved buffers. The branch predictor now has a higher capacity, and both the prefetcher and the branch predictor are smarter than before (improved accuracy). There were even improvements to Hyper-Threading, resulting in wider retirement. Intel also made improvements to encryption, speeding up AES-GCM and AES-CBC by 17% and 33%, respectively. I am not sure what those figures are compared to, but I would assume the baseline is Haswell/Broadwell, although a lot of the other Skylake numbers thrown our way were compared against Sandy Bridge. To further expand cache abilities, new extensions were added for cache management, and miss bandwidth was also improved.
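To put the AES figures in perspective, here is a minimal Python sketch of what a quoted speedup means for the time spent on a fixed workload. It assumes the percentages are throughput gains over an unspecified baseline (Intel did not state one); the function name `relative_time` is purely illustrative.

```python
def relative_time(speedup_pct: float) -> float:
    """Fraction of the baseline run time that remains if throughput
    rises by speedup_pct percent (assumption: the quoted numbers are
    throughput gains, which the presentation did not spell out)."""
    return 1.0 / (1.0 + speedup_pct / 100.0)

# Speedups as quoted in the article; the baseline generation is unknown.
quoted_speedups = {"AES-GCM": 17, "AES-CBC": 33}

for mode, pct in quoted_speedups.items():
    print(f"{mode}: {relative_time(pct):.3f}x the baseline time")
```

Under that reading, a 33% throughput gain trims roughly a quarter off the time for the same amount of data, not a third.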

 

With Skylake, Intel increased the out-of-order window from 192 uops in Haswell to 224, a big generational improvement of roughly 17%; it should expose more parallelism and so lead to better single-threaded performance. The allocation queue (called the instruction decode queue in other Haswell diagrams) has also changed from 56 uops to 64 uops per thread (I assume 2 threads). This is quite a large expansion, and interestingly enough, we are seeing a shift back to a per-thread allocation queue. While Intel increased the integer physical register file from 168 to 180 registers, there is no increase to the floating point register file. There don't seem to be any improvements to the number of entries for in-flight loads, but there is a sizable improvement to in-flight stores; both refer to the load and store buffers used for memory access.
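For a quick sense of scale, the buffer changes above can be turned into percentages with a few lines of Python. This is only a sketch using the figures quoted in this article; the two-threads-per-core value is my assumption from the paragraph above, not something Intel confirmed in the slide.

```python
def pct_increase(old: int, new: int) -> float:
    """Percentage growth going from old to new."""
    return (new - old) / old * 100.0

# (Haswell, Skylake) entry counts as quoted in the article.
structures = {
    "out-of-order window (uops)": (192, 224),
    "integer physical register file": (168, 180),
}

growth = {
    name: round(pct_increase(old, new), 1)
    for name, (old, new) in structures.items()
}

# Assumption: the 64-entry allocation queue is per thread, 2 threads per core.
THREADS_PER_CORE = 2
alloc_queue_total = 64 * THREADS_PER_CORE

for name, pct in growth.items():
    print(f"{name}: +{pct}%")
print(f"allocation queue entries across both threads: {alloc_queue_total}")
```

The out-of-order window grows by about 16.7% and the integer register file by about 7.1%, which shows the window expanding faster than the registers backing it.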
