Wednesday, December 15, 2010

Opening a can of...wrist slapping

I put off writing another rambling in anticipation of the upcoming Cayman GPU release, figuring nothing else happening in the technology world was as worth writing about. Well, guess what? It released.

And it's about damn time, too. About a month delayed (plus an NDA lift postponed a few days), we have the 6900 family, a sub-group of the 6000 series tailored exclusively to the high-end portion of the market. AMD can hardly be faulted for a comparatively minor delay next to NVIDIA's pitfalls, but when all is said and done, this amounts more to a swift kick to the shin than an ass whoopin'.

When the GTX 580 came out about a month ago, many reviews concluded with a teaser about what to expect with AMD's response. HardOCP stated in their review:

If it were possible, we would have compared the GeForce GTX 580 to the new 6900 series GPUs from AMD. Will NVIDIA be able to stay ahead? AMD has been at NVIDIA’s heels very closely this year and certainly has excelled in the performance per watt landscape for a good while. The GeForce GTX 580 surpasses the Radeon HD 5870, but the GTX 580's real competition is going to be the Radeon HD 6900 series, and it is not far off.
Reading statements like that might make you feel like the 6970 could end up being something really special. Certainly a lot of people would love to see AMD compete with NVIDIA on their own turf, especially since NVIDIA's left plenty of strategic openings in recent years while AMD's been on a roll with successful launches. But the 6970 simply doesn't compete with the GTX 580. Many will say it was never meant to; that's the job of the 6990. I'd counter that AMD's Crossfire generally sucks, and for the most part so does depending on multi-GPU solutions in general. If you aren't stuck waiting for drivers with updated profiles for new game releases, you're contending with microstuttering that makes an impressive framerate on paper feel half as good in practice (which, like it or not, is more a by-product of the technology itself and isn't likely to ever go away).

The 6970 does manage to improve upon its 5870 predecessor by around 20% on average, which is a fair gain on the same manufacturing process. That puts it a little ahead of the GTX 480 using a smaller die, which might as well be the final nail in the GF100's coffin. But the GTX 580 still bests it by around 10% (often more in really GPU-bound scenarios), which, to be fair, will cost you ~$150 more, if you can even find one in stock (a challenge right now).

The 6970 is at least interesting, though, from the perspective of a technology enthusiast such as myself who's been following the industry for some years and holds a fascination with the inner workings of how a piece of computer hardware gets to be what it is. The rumors about the 6800 series that I addressed a couple of ramblings back turned out to be at least half true with regard to the 6900 series, where things suddenly start to diverge in terms of architecture. While not a ground-up reworking of the engine, things have been tweaked at a fundamental level for the first time since the Xbox 360's Xenos GPU was developed, in order to make them more efficient (and in many ways, closer to NVIDIA). It would have been nice to have a separate shader clock, but that may be asking too much anyway. What we do have is the VLIW4 layout, which narrows the width of the shaders in favor of more threads and symmetrical SP units. The actual gains of such a change aren't going to be night and day: the 5870 issues across 320 VLIW units, while the 6970 issues across 384. Not a huge increase, but addressing the abysmal utilization of the R600 shader architecture to any degree is desperately important.
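The arithmetic behind that trade can be sketched in a few lines. The unit counts fall out of the published shader totals (1600 ALUs for Cypress, 1536 for Cayman); the "average ops per bundle" figure below is purely an illustrative assumption of mine, meant only to show why a narrower VLIW wastes fewer issue slots:

```python
# Back-of-the-envelope look at the VLIW5 -> VLIW4 change.
# ALU totals are the published shader counts for the 5870 (Cypress)
# and 6970 (Cayman); avg_ops is a hypothetical compiler-packing
# figure, not a measured number.

cypress_alus, cypress_width = 1600, 5   # 5870: VLIW5
cayman_alus,  cayman_width  = 1536, 4   # 6970: VLIW4

cypress_units = cypress_alus // cypress_width   # 320 co-issue units
cayman_units  = cayman_alus  // cayman_width    # 384 co-issue units

# If the compiler only finds ~3.2 independent ops per bundle on
# average (assumed), the narrower bundle leaves fewer slots idle:
avg_ops = 3.2
cypress_util = min(avg_ops, cypress_width) / cypress_width  # 0.64
cayman_util  = min(avg_ops, cayman_width)  / cayman_width   # 0.80

print(cypress_units, cayman_units)              # 320 384
print(f"{cypress_util:.0%} {cayman_util:.0%}")  # 64% 80%
```

Fewer, narrower units means slightly less peak throughput on paper, but a higher fraction of that peak actually gets used on real shader code.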

On top of that, and this comes as a complete surprise, the triangle setup rate has been increased à la GF100, breaking the rendering engine up into separate front-ends, though not to the extent of Fermi. Hell, even multiple kernel dispatch was added, and done better than NVIDIA's implementation, which is a real slap to them, because they could have really benefited from asynchronous dispatch with their PhysX push (but who cares, because PhysX is as good as dead anyway). Essentially, this might mean AMD can pull off things like morphological AA in a compute shader at the same time as pixel, vertex, and every other kind of shader work, without losing cycles to a context switch each time. As game engines become more multithreaded and the features of DX11 see more use, all the different shader types and compute instructions will grow in their need for parallel execution. It's pretty cool that AMD is addressing this, and cooler still with regard to GPGPU, an area AMD hadn't really given much attention before.
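A toy timing model makes the appeal concrete. Every kernel name and duration here is made up, as is the context-switch cost; the point is only that serialized dispatch pays for each kernel (plus a switch) in sequence, while concurrent dispatch is, in the ideal case, bounded by the longest one:

```python
# Toy model of serialized vs. concurrent kernel dispatch.
# All names, timings, and the switch cost are hypothetical,
# chosen only to illustrate the scheduling difference.

kernels = {"pixel": 4.0, "vertex": 1.5, "mlaa_compute": 2.0}  # ms
switch_cost = 0.5  # ms per context switch (assumed)

# One kernel at a time, with a context switch between each:
serial_ms = sum(kernels.values()) + switch_cost * (len(kernels) - 1)

# Kernels running concurrently (idealized: enough SIMDs for all):
concurrent_ms = max(kernels.values())

print(serial_ms)      # 8.5
print(concurrent_ms)  # 4.0
```

Real hardware won't hit the idealized number, since the kernels contend for the same execution units, but avoiding the flush-and-switch overhead between graphics and compute work is where the win comes from.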

The release of the 6970 further highlights the silliness of this generation's naming convention. The 6870 was already poorly named insofar as its performance and design related to previous generations, but the 6970 actually has a new architecture, and here it shares its nomenclature with the Barts GPUs, which essentially use a last-gen architecture. The last time I remember something like this happening was when NVIDIA released the GeForce 4 series, with the excellent Ti cards alongside the ludicrous mainstream MX cards, the latter using a slightly tweaked GeForce 2 architecture. While not as bad as that situation (these are all still DX11-capable), it raises the question of what the 6800 series brought to the table to warrant inclusion in the new generation, when the 6900 series was always going to bear the real fruit of the product line.

All in all, I'm rather impressed with what AMD was able to pull off on such short notice, given how quickly the reins were pulled on the 32nm process without a viable alternative in a reasonable timeframe. NVIDIA managed to take a failed design and tweak it into something that works, but it was still basically their old GPU made new again. The 6970 is completely new, yet it definitely feels like a prelude to what was meant to be the real deal: a major birth cut premature by strenuous circumstances. We won't see what the architecture is really capable of until we get a smaller process, but for now, what we have is respectable, if a little underwhelming in some regards, as graphics card releases have tended to be lately. Performance has indeed been stagnating some, and with the market in need of reinvigoration, 28nm can't come soon enough.