Features of the Northwood
The Intel Pentium 4 Northwood CPU has stepped up the L2 cache, known as the Advanced Transfer Cache or ATC, from 256 KB to 512 KB, still running at the same speed as the CPU core. This gives the Northwood a clear advantage in memory-intensive workloads, especially on SDRAM and DDR SDRAM boards. While the L2 cache has doubled over the Pentium 4 Willamette processor, the L1 cache has remained the same size.
Finally, the end of the P6 AGTL+ bus
For most of the past 3 years, Intel has been relying on the P6 bus used by the current P3 and Celeron range. While this bus has been easy to overclock and very stable, it doesn't have the scalability required for future processors. Intel has finally decided to step away from the P6 bus and introduce the new P4 400MHz QDR FSB.
The well-known FSB of the Pentium 3 is clocked at 133 MHz and transfers 64 bits of data per clock, offering a data bandwidth of 8 bytes x 133 million/s = 1,066 MB/s. The Pentium 4's system bus is clocked at only 100 MHz and is also 64 bits wide, but it is "Quad Data Rate", using the same principle as AGP 4x: four transfers per clock. The new bus can therefore transfer 8 bytes x 100 million/s x 4 = 3,200 MB/s. This is obviously a tremendous improvement that leaves even AMD's EV6 bus far behind. The bus of the most recent Athlons is clocked at 133 MHz, 64 bits wide and "Double Data Rate", offering 8 bytes x 133 million/s x 2 = 2,133 MB/s. Intel's Pentium 4 CPU is paired with the i850 chipset, a Dual Channel RDRAM solution.
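The bus arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the peak-rate formula (clock x width x transfers per clock); the function and dictionary names are made up for illustration, and it assumes the nominal "133 MHz" is really 133.33 MHz, which is what the quoted 1,066 and 2,133 MB/s figures imply.

```python
def peak_bandwidth_mb_s(clock_mhz, width_bits, transfers_per_clock):
    """Peak bus bandwidth in MB/s, with 1 MB taken as 10**6 bytes."""
    bytes_per_transfer = width_bits / 8
    return clock_mhz * bytes_per_transfer * transfers_per_clock


# (clock in MHz, width in bits, transfers per clock), as quoted in the text.
# Assumption: "133 MHz" is treated as 400/3 = 133.33 MHz, which is why the
# quoted results are 1,066 and 2,133 rather than 1,064 and 2,128.
BUSES = {
    "Pentium 3 FSB (single data rate)": (400 / 3, 64, 1),
    "Athlon EV6 bus (double data rate)": (400 / 3, 64, 2),
    "Pentium 4 FSB (quad data rate)": (100, 64, 4),
}

for name, (clock, width, pumps) in BUSES.items():
    # int() truncates, matching the way the figures are quoted in the text.
    print(f"{name}: {int(peak_bandwidth_mb_s(clock, width, pumps)):,} MB/s")
```

These are theoretical peak rates only; sustained throughput depends on the chipset and memory behind the bus.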
The i850 has two independent RDRAM channels which together can deliver up to 3.2 GB/s of memory bandwidth when used with four RIMM modules, matching the bandwidth of the Pentium 4's system bus. While RDRAM is able to produce such high bandwidth, its memory latency problems and high prices make it practically a dead issue for the home consumer. To this end, Intel and other third-party vendors have started to produce SDRAM and DDR SDRAM solutions to provide the Pentium 4 with lots of memory bandwidth goodness.
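As a rough cross-check of the 3.2 GB/s figure, the sketch below adds up the two RDRAM channels and compares the result with single-channel SDRAM. The per-channel numbers are assumptions not stated in this article: PC800 RDRAM with a 16-bit data path at 800 million transfers per second, and PC133 SDRAM and DDR266 on a single 64-bit channel. The function and variable names are illustrative only.

```python
P4_FSB_MB_S = 3200  # Pentium 4 system bus: 8 bytes x 100 MHz x 4


def memory_bandwidth_mb_s(mega_transfers_per_s, width_bits, channels=1):
    """Peak memory bandwidth in MB/s, with 1 MB taken as 10**6 bytes."""
    return mega_transfers_per_s * (width_bits / 8) * channels


# name: (million transfers/s, channel width in bits, number of channels)
# Assumed configurations, see the paragraph above.
CONFIGS = {
    "Dual-channel PC800 RDRAM (i850)": (800, 16, 2),
    "Single-channel PC133 SDRAM": (400 / 3, 64, 1),
    "Single-channel DDR266 SDRAM": (800 / 3, 64, 1),
}

for name, (rate, width, channels) in CONFIGS.items():
    bandwidth = memory_bandwidth_mb_s(rate, width, channels)
    share = bandwidth / P4_FSB_MB_S
    print(f"{name}: {int(bandwidth):,} MB/s ({share:.0%} of the P4 bus)")
```

Under these assumptions only the dual-channel RDRAM setup can keep the 3,200 MB/s bus fully fed at peak, which is the trade-off against RDRAM's latency and price described above.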
Rapid Execution Engine
Another feature of the Pentium 4 which is unique to Intel is the Rapid Execution Engine, or REE for short. The REE works on the principle of two double-pumped ALUs and two double-pumped AGUs, each running at twice the core clock. For the simple instructions these units handle, this allows the engine to process twice as many operations per clock as the equivalent units of a P3 or Athlon CPU.
The story looks a lot different for the instructions that cannot be processed by the rapid execution units. Those instructions, or µOPs, need to use the one and only slow ALU, which is not double pumped. Most instruction types need to use this path, which obviously sounds scary. However, most code actually consists of the simplest instructions, such as 'AND', 'OR', 'XOR' and 'ADD', making Intel's "Rapid Execution Engine" design sensible though not particularly amazing. This feature has remained unchanged from the Willamette to the Northwood.
SSE2 or NetBurst... Whatever you want to call it
Intel's name for the Pentium 4's new design is "NetBurst". As with the Intel Pentium III and its SSE instructions, Intel is trying its hardest to push the idea that its new processor will make your web pages load quicker. Unfortunately, Internet speed is mostly limited by your modem's maximum speed and the speed of your ISP. The average consumer, however, is not going to know this straight off, which makes it a perfect way to market the Pentium 4.
Another big issue with the Pentium 4's "NetBurst Micro Architecture" is its obvious focus on delivering the highest clock rates. Again, 'NetBurst' shows its roots in Intel's marketing department. While Intel has said in the past that "MHz isn't everything", it seems that Intel is now ringing the very bell it tried to silence in the days of the Cyrix 6x86 CPUs. As many of you may know by now, the Intel Pentium 4 loses to an AMD Athlon at the same clock speed in just about every benchmark today. Since these benchmark programs aren't SSE2 optimized (yet), this shows that Intel is focusing more on the future than on the present. This could be a very big marketing mistake, with most of the hardware community staying away from expensive Pentium 4/RDRAM solutions at the moment. However, if you are one of the hardware junkies like myself who have to have the fastest thing with the highest numbers on it, Intel has taken this crown and continues to hold it. At the time of this article, NetBurst has allowed Intel to reach 2.2GHz well before AMD.
The big change... The Die
Intel's Pentium 4 Willamette is available in two packages, Socket 423 and Socket 478, while the Northwood is Socket 478 only. While the 478-pin Pentium 4 may sound like it would be a larger CPU, it is actually smaller: about 1/3 the size of a Socket 423 Pentium 4. Its mPGA pins are about the size of a pin head and spaced less than 1mm apart. The Willamette was built on the same 0.18 micron process as Intel's Coppermine P3 and Celeron CPUs. For the Northwood, Intel has dropped the process size to that of the new Celeron Tualatin core, 0.13 micron. While the Northwood looks physically identical to the Socket 478 Willamette, under the heatspreader lies a tiny die. It runs at only 1.4 to 1.5 V rather than the 1.7 V that the Willamette core used, which allows greater clock speeds for current and future processors.
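To put rough numbers on the shrink and the voltage drop, here is a back-of-the-envelope sketch. It rests on two textbook first-order assumptions that are not from Intel's specifications: die area for the same design scales with the square of the feature size, and dynamic power at a fixed clock scales with the square of the core voltage. The function names are made up for illustration.

```python
def ideal_area_ratio(old_feature_um, new_feature_um):
    """Relative die area of an unchanged design after an ideal shrink."""
    return (new_feature_um / old_feature_um) ** 2


def dynamic_power_ratio(old_volts, new_volts):
    """Relative dynamic power at the same clock, assuming power ~ V**2."""
    return (new_volts / old_volts) ** 2


# Feature sizes and voltages quoted in the text above.
area = ideal_area_ratio(0.18, 0.13)
power_at_1v5 = dynamic_power_ratio(1.7, 1.5)
power_at_1v4 = dynamic_power_ratio(1.7, 1.4)

print(f"Ideal area after 0.18 -> 0.13 micron shrink: {area:.0%} of original")
print(f"Dynamic power at 1.5 V vs 1.7 V: {power_at_1v5:.0%}")
print(f"Dynamic power at 1.4 V vs 1.7 V: {power_at_1v4:.0%}")
```

These ratios are idealised. The Northwood also carries twice the L2 cache of the Willamette, so its real die does not shrink by the ideal factor, and actual power draw depends on clock speed and leakage as well as voltage.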