AMD gets back in the development game with OpenCL tools and drivers
Recently I wrote a pair of articles that got more than a few people upset. Both were concerned with the way NVIDIA and AMD do business and how this affects gaming, and both touched on the differences in architecture between the two. Neither was an attempt to say one company is better than the other, but many readers seemed to feel that because I think NVIDIA is using a smarter architecture model, and that their 'The Way It's Meant to Be Played' program is good for gaming, I must be pro-NVIDIA.
I am sure the people at NV would be interested to hear this, as their PR team has often accused me of "bashing" them, but that is for another article. This article goes back to those same two pieces, where I talked about how AMD and NV approach the same thing: gaming. In both I argued that AMD's most current GPUs (the 58xx series), while very fast, are still held back by fundamental design issues. These issues will not affect many games, but they do hinder the GPU when it comes to processing anything other than graphical information.
A perfect example of this is DX11 DirectCompute, or any other standard GPGPU workload (the same type that would be used by a physics or AI engine). If you look at the way the 5870 handles the DirectCompute bench from NGHQ, you will see the DX10 GTX 285 walk all over it. Recently, however, AMD began working very closely with SiSoft to produce an OpenCL extension for their GPUs. This work helped AMD produce OpenCL results that simply embarrass the GTX 285 and even the GTX 295.
We had the chance to talk to AMD about this and get a little peek inside to see how and why we see this in one bench, but not in others.
But before we get too far, let's take a look at how things work. According to AMD, the 5xxx series is built around the Vec5D shader unit. This provides a cluster of five shaders: one is able to handle "fat" or complex code, while the other four can handle "lite" or simple code. On the 5870, this means that overall you have 320 "fat" shaders and 1280 "lite" ones, 1600 in total. I know most of you are wondering what this has to do with anything and why it matters to gamers.
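If you want to check that arithmetic, here is a quick Python sketch. The figure of 320 clusters is back-calculated from the shader counts quoted above rather than taken from an AMD document, and the names in the code are purely illustrative.

```python
# Shader counts for the Vec5D layout as described in the article.
# ASSUMPTION: 320 clusters, inferred from the "320 fat / 1280 lite" figures.
CLUSTERS = 320
FAT_PER_CLUSTER = 1   # one shader per cluster that can run any code
LITE_PER_CLUSTER = 4  # four shaders per cluster limited to simpler code


def shader_counts(clusters=CLUSTERS):
    """Return the fat, lite and total shader counts for a given cluster count."""
    fat = clusters * FAT_PER_CLUSTER
    lite = clusters * LITE_PER_CLUSTER
    return {"fat": fat, "lite": lite, "total": fat + lite}


print(shader_counts())  # {'fat': 320, 'lite': 1280, 'total': 1600}
```

Only one shader in five can take the complex code, which is the ratio that matters for everything that follows.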
The issue at hand is something I have talked about before: the architecture is the key point here. Imagine it like this: you have a board with holes in it, but the holes are not all the same size. One fifth of them are large, while the rest are small. Then you drop nothing but large balls onto that board. It is going to take a while to get them all through, as they will not fit through the smaller holes and you will have to jiggle the board around to get them to the right spots where they can fit and pass through. Conversely, if you drop nothing but small balls, they go through just fine and very quickly, no jiggling needed.
This is a very simplistic way of looking at the AMD Vec5D architecture. Yes, there is more to it, but in principle this is the effect you get. In the Vec5D you have one shader that is able to execute any type of code: fat, thin, complex, simple, it can do them all. It is like the leader of the cluster. The other four are less powerful; they can only execute a limited number of instruction types and are constrained by the complexity of the code they can handle. AMD did make sure they can execute most types of graphical instructions, but even then there are limits to what they can do.
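The board-and-balls picture can be turned into a few lines of Python. This is a toy model only, not how the hardware or AMD's shader compiler actually schedules work: it assumes every instruction is independent, that the one "fat" shader issues one instruction of either kind per cycle, and that each of the four "lite" shaders issues one simple instruction per cycle. The function name and parameters are made up for the illustration.

```python
def cycles_to_drain(fat, lite, lite_lanes=4):
    """Count cycles for one hypothetical cluster to issue every instruction.

    Toy rule (an assumption, not real hardware behaviour): each cycle the
    single fat lane issues one instruction of either kind, and each lite
    lane issues one lite instruction.
    """
    cycles = 0
    while fat > 0 or lite > 0:
        # The fat lane takes fat work first, since no other lane can run it.
        if fat > 0:
            fat -= 1
        elif lite > 0:
            lite -= 1
        # The lite lanes can only take lite work, up to one each.
        lite -= min(lite, lite_lanes)
        cycles += 1
    return cycles


print(cycles_to_drain(fat=100, lite=0))  # all "large balls": 100
print(cycles_to_drain(fat=0, lite=100))  # all "small balls": 20
```

The same 100 instructions take five times as long when every one of them has to squeeze through the single large hole.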
So again you have to ask: if this is true, why is AMD stomping NV in gaming performance? That one is easy to answer. As I stated above, AMD made sure that the four smaller shaders could handle some of the complex graphical instructions that come through, and in the majority of games the code arrives in smaller "blocks" anyway. As long as that holds, AMD will enjoy a healthy boost over NVIDIA in games that follow this general practice. Where the monkey wrench gets thrown in is when games are not coded like this. Let's say that a game is coded and compiled in large and complex blocks.
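To get a rough feel for how much the instruction mix matters, here is a back-of-the-envelope estimate in Python. It is a sketch under loud assumptions: one cluster with one "fat" and four "lite" lanes as described above, fully independent instructions, and a made-up batch of 1000 instructions. Real shader code has dependencies and compiler packing that this ignores, so treat the percentages as illustrative and not as measurements.

```python
import math

LANES = 5  # one "fat" lane plus four "lite" lanes, per the article


def lane_utilization(fat_fraction, total=1000):
    """Estimate the share of lane slots doing useful work for a given mix.

    ASSUMPTION: fat instructions can only use the single fat lane, so a
    batch needs at least `fat` cycles, and at least total / LANES cycles.
    """
    fat = round(total * fat_fraction)
    cycles = max(fat, math.ceil(total / LANES))
    return total / (LANES * cycles)


for fraction in (0.0, 0.2, 0.5, 1.0):
    print(f"{fraction:.0%} fat -> {lane_utilization(fraction):.0%} of lanes busy")
```

In this toy model the cluster stays fully busy until more than one instruction in five is complex, and after that the four smaller shaders increasingly sit idle.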
Take a game like Dead Space. This one is heavily optimized to run on NV GPUs, and even against the massively powerful and fast HD 5970, an NV GPU can keep up and outperform it once you turn on the AA and other effects. This is because it was compiled differently from most games. As an end user you will not notice anything different; the game looks the same on both, but the performance gap will show when you run something like FRAPS. This is also evident in other games, especially titles that are part of NVIDIA's 'The Way It's Meant to Be Played' program. It is also the reason that many people feel this program is anti-competitive.