If you write for a living, expressing your opinion and
making forecasts, you can be expected to make a few mistakes. When that
happens, it can be embarrassing. In the extreme it can be damaging to whomever
you’ve inadvertently, and ignorantly (but hopefully never maliciously),
misrepresented. Everybody makes comments and statements they wish they
hadn’t—ask the president of the U.S., and the CEOs of some of our leading
In computer graphics we have adopted an expression that I
first heard from Nvidia, “Graphics is/are embarrassingly parallel.” Graphics
does employ, or perhaps I should say can employ, parallel operations to great
benefit. That is often expressed as “pipes,” a common term
derived from pipeline and not really descriptive of how a GPU operates these
days, but it is nonetheless a convenient and popular way to think about these
operations. And so, in discussing, or bragging about, the “pipes” a GPU has,
the bragger may use the metaphor of graphics being embarrassingly parallel as a
way of justifying and simultaneously extolling the “pipi”ness of their latest
Well, I certainly agree. A “pipe” usually consists of at
least one 32-bit floating-point processor (often inaccurately expressed as
“32-bit IEEE floating point,” when in fact it is merely representative of the
IEEE floating-point functions called for in DirectX 9 and soon 10, not true IEEE
floating-point functionality). Many GPU designs have multiple floating-point
processors (FPPs) in one pipe, and some even have a scalar or vector processor
(SIMD) as well.
These FPPs are used first to process the coordinate data
known as vertices and are strangely called vertex shaders, although the shading
functionality at that stage is an obscure and esoteric concept. The other FPPs
are used to process pixels and do what Pixar described originally as “shading.”
And we are entering the debate about universal FPPs (commonly known as
“shaders”) and the notion of load balancing, wherein if you are doing pixel
operations, due to the nature of pipeline processing, you probably are not doing
vertex processing, and therefore if you have dedicated vertex processors they
are just using up silicon and watts idling. So, the theory goes, if you can
dynamically allocate your FPPs they can perform either task on an as-needed
basis, and therefore you can theoretically apply more FPPs to the arduous task
of pixel processing (where real shading goes on).
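The load-balancing argument can be put into rough numbers. What follows is only a back-of-the-envelope sketch, not a model of any real GPU: the function names, the unit counts, and the workload figures are all hypothetical, and it assumes work divides perfectly across units with no scheduling overhead.

```python
def dedicated_frame_time(vertex_work, pixel_work, vertex_units, pixel_units):
    """Fixed split: each pool only handles its own kind of work.

    The two stages overlap, so the slower pool sets the pace while
    the other pool sits partly idle.
    """
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def unified_frame_time(vertex_work, pixel_work, total_units):
    """Universal FPPs: any unit can take either kind of work (idealized)."""
    return (vertex_work + pixel_work) / total_units


def utilization(vertex_work, pixel_work, total_units, frame_time):
    """Fraction of available unit-time that did useful work."""
    return (vertex_work + pixel_work) / (total_units * frame_time)


# Hypothetical pixel-heavy frame: 100 units of vertex work, 900 of pixel work,
# on a part with 8 vertex FPPs and 16 pixel FPPs (24 in total).
fixed = dedicated_frame_time(100, 900, vertex_units=8, pixel_units=16)
pooled = unified_frame_time(100, 900, total_units=24)

print("dedicated:", fixed, "utilization:", utilization(100, 900, 24, fixed))
print("unified:  ", pooled, "utilization:", utilization(100, 900, 24, pooled))
```

Under these made-up numbers the dedicated vertex units spend most of the frame waiting on the pixel units, which is exactly the idle silicon the theory complains about. Whether a real scheduler gets anywhere near the idealized pooled figure is a separate question.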
A company therefore might be embarrassed if it didn’t have
universal shaders, and all the more so since Microsoft will, in DirectX 10,
demand there be a universal shader model (I should note the term model really
has nothing to do with the hardware implementation, and you can count on it
being misused by marketing folks in GPU land).
But what of the emperor’s clothes? What if we had all those
FPPs allegedly built for the embarrassing parallelism of graphics and there
wasn’t any call for them? Now that would really be embarrassing. What if the
ISVs, primarily the game developers, didn’t take advantage of all those FPPs,
those “pipes?” Absurd, you say; why, they can’t get enough of them. Indeed,
that is the common wisdom. But if it were true, why do you suppose GPU
providers would go looking for other things to do with those FPPs? Things like,
oh, I don’t know, say, physics? Hmmm, now that would be embarrassing, wouldn’t it?
I can hear some of you saying now, Oh, come on, Jon, you
know we live in the land of “we can do it therefore we will do it.” We don’t do
or build things just because there’s a need for them. That is true, and I will
confess I have pointed this out several times. But isn’t that also a little
embarrassing? Think of how many products we’ve seen that could do something, but no one really cared to have that
done and the product died.
I worry about the pixels, I hear their cries.
What if the pixels are being underserved due to the ISVs?
What then will we do with all those FPPs? Oh, how embarrassing. Someone told
me, well, physics is an embarrassingly parallel operation, too, full of MACs and
other computations that GPUs do extraordinarily well.
Hmmm, well, so are ray-tracing and voxel processing, and
many other problems. And it is true that GPUs are being employed in massively
parallel processing problems (www.gpgpu.com). And so what? Because we
can, we should?
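For readers who want to see what "embarrassingly parallel" and "full of MACs" mean in the physics case, here is a minimal sketch. It is an illustration under stated assumptions, not anyone's actual physics engine: the particle layout, the `step` function, and the gravity constant are hypothetical, and it deliberately ignores collisions, which is where particles stop being independent. The thread pool is there only to show that the order of execution does not matter; it is not a claim about speed.

```python
from concurrent.futures import ThreadPoolExecutor

GRAVITY = -9.8  # assumed constant acceleration on the y axis, in m/s^2


def step(particle, dt):
    """Advance one (y, vy) particle by dt using two multiply-accumulates."""
    y, vy = particle
    vy = vy + GRAVITY * dt  # MAC: velocity += acceleration * dt
    y = y + vy * dt         # MAC: position += velocity * dt
    return (y, vy)


def step_all_serial(particles, dt):
    """Update every particle one after another."""
    return [step(p, dt) for p in particles]


def step_all_parallel(particles, dt, workers=4):
    """Update particles on a pool of workers.

    No particle reads another particle's state, so the work can be
    handed out in any order; pool.map returns results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: step(p, dt), particles))


particles = [(float(i), 0.0) for i in range(100)]
assert step_all_serial(particles, 0.5) == step_all_parallel(particles, 0.5)
```

The same shape, many independent items each needing a handful of multiply-accumulates, is what describes pixel work too, which is why the hardware suits both.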
If I’m playing a game, like say “F.E.A.R.” (which, BTW, I am
doing for the second time, it’s that good), I want all the pixel-processing
power I can get to make my Dell 30-inch 2560 x 1600 display (which “F.E.A.R.”
sadly doesn’t fully utilize) light up and delight me. And I want all the bodies
to fall and bounce when I terminate them. I don’t have any GPU to spare, and
I’ve got two of them in my FX-60–based system.
Some of my friends (yes, I have more than one, thank you)
asked me if I felt embarrassed about my comments about Ageia and said, “Just
put the Nvidia/Havok demo side by side against Ageia at the same cost point.”
Well, if I were down to my last $299 and had to choose between buying a second
AIB or a physics accelerator, it would be a tough decision. If I wanted the
scaling of GPUs that a second AIB offered, would I buy it (the second AIB) and
then give it up for a little background physics work? Or would I use one AIB
and a physics AIB? Most likely I’d go into debt and get all three.
I think we have an embarrassment of riches now in the PC
gaming market. We’ve got powerful GPUs, high-performance chipsets that allow
doubling up of GPUs, new dual-core CPUs (that can do a fine job on physics and
AI), and ASIC physics engines. Some people want to make this a war between
Ageia and the GPU suppliers. I find that embarrassing.