On being embarrassed

Posted: 05.22.06

If you write for a living, expressing your opinion and making forecasts, you can be expected to make a few mistakes. When that happens, it can be embarrassing. In the extreme it can be damaging to whomever you’ve inadvertently, and ignorantly (but hopefully never maliciously), misrepresented. Everybody makes comments and statements they wish they hadn’t; just ask the president of the U.S. and the CEOs of some of our leading companies.

In computer graphics we have adopted an expression that I first heard from Nvidia, “Graphics is/are embarrassingly parallel.” Graphics does employ, or perhaps I should say can employ, parallel operations to great benefit. That parallelism is often expressed as “pipes,” a common term derived from pipeline and not really descriptive of how a GPU operates these days, but it is nonetheless a convenient and popular way to think about these operations. And so, in discussing, or bragging about, the “pipes” a GPU has, the bragger may use the metaphor of graphics being embarrassingly parallel as a way of justifying and simultaneously extolling the “pipi”ness of their latest design.

Well, I certainly agree. A “pipe” usually consists of at least one 32-bit floating-point processor (often inaccurately expressed as “32-bit IEEE floating point,” when in fact it is merely representative of the IEEE floating-point functions called for in DirectX 9 and, soon, 10, not true IEEE floating-point functionality). Many GPU designs have multiple floating-point processors (FPPs) in one pipe, and some even have a scalar or vector processor (SIMD) as well.

Some of these FPPs are used first to process the coordinate data known as vertices and are strangely called vertex shaders, although the shading functionality at that stage is an obscure and esoteric concept. The other FPPs are used to process pixels and do what Pixar described originally as “shading.” And we are entering the debate about universal FPPs (commonly known as “shaders”) and the notion of load balancing, wherein if you are doing pixel operations, due to the nature of pipeline processing, you probably are not doing vertex processing, and therefore if you have dedicated vertex processors they are just using up silicon and watts idling. So, the theory goes, if you can dynamically allocate your FPPs they can perform either task on an as-needed basis, and therefore you can theoretically apply more FPPs to the arduous task of pixel processing (where real shading goes on).

A company therefore might be embarrassed if it didn’t have universal shaders, and all the more so since Microsoft will, in DirectX 10, demand there be a universal shader model (I should note the term model really has nothing to do with the hardware implementation, and you can count on it being misused by marketing folks in GPU land).

But what of the emperor’s clothes? What if we had all those FPPs allegedly built for the embarrassing parallelism of graphics and there wasn’t any call for them? Now that would really be embarrassing. What if the ISVs, primarily the game developers, didn’t take advantage of all those FPPs, those “pipes?” Absurd, you say; why, they can’t get enough of them. Indeed, that is the common wisdom. But if it were true, why do you suppose GPU providers would go looking for other things to do with those FPPs? Things like, oh, I don’t know, say, physics? Hmmm, now that would be embarrassing, wouldn’t it?

I can hear some of you saying now, Oh, come on, Jon, you know we live in the land of “we can do it therefore we will do it.” We don’t do or build things just because there’s a need for them. That is true, and I will confess I have pointed this out several times. But isn’t that also a little embarrassing? Think of how many products we’ve seen that could do something, but no one really cared to have that done and the product died.

I worry about the pixels, I hear their cries.

What if the pixels are being underserved due to the ISVs? What then will we do with all those FPPs? Oh, how embarrassing. Someone told me, well, physics is an embarrassingly parallel operation, too, full of MACs and other computations that GPUs do extraordinarily well.

Hmmm, well, so are ray-tracing and voxel processing, and many other problems. And it is true that GPUs are being employed in massively parallel processing problems. And so what? Because we can, we should?

If I’m playing a game, like say “F.E.A.R.” (which, BTW, I am doing for the second time, it’s that good), I want all the pixel-processing power I can get to make my Dell 30-inch 2560 x 1600 display (which “F.E.A.R.” sadly doesn’t fully utilize) light up and delight me. And I want all the bodies to fall and bounce when I terminate them. I don’t have any GPU to spare, and I’ve got two of them in my FX-60–based system.

Some of my friends (yes, I have more than one, thank you) asked me if I felt embarrassed about my comments about Ageia and said, “Just put the Nvidia/Havok demo side by side against Ageia at the same cost point.” Well, if I was down to my last $299 and had to choose between buying a second AIB or a physics accelerator, it would be a tough decision. If I wanted the scaling of GPUs that a second AIB offered, would I buy it (the second AIB) and then give it up for a little background physics work? Or would I use one AIB and a physics AIB? Most likely I’d go into debt and get all three.

I think we have an embarrassment of riches now in the PC gaming market. We’ve got powerful GPUs, high-performance chipsets that allow doubling up of GPUs, new dual-core CPUs (that can do a fine job on physics and AI), and ASIC physics engines. Some people want to make this a war between Ageia and the GPU suppliers. I find that embarrassing.