Computing with GPUs and Cells

Posted: 11.01.06

They’re not “Gs,” they’re “Ss”

It’s pretty exciting to think of a supercomputer in a chip, but then I guess that all depends on what you’re calling a supercomputer, and I doubt we’d get much consensus. The Computer Desktop Encyclopedia (published by the Computer Language Company) says a supercomputer is:

The fastest computer available. It is typically used for simulations in petroleum exploration and production, structural analysis, computational fluid dynamics, physics and chemistry, electronic design, nuclear energy research and meteorology. It is also used for real-time animated graphics.

And Wikipedia says:

A supercomputer is a computer that leads the world in terms of processing capacity, particularly speed of calculation, at the time of its introduction. The term “Super Computing” was first used by New York World newspaper in 1920 to refer to large custom-built tabulators IBM made for Columbia University.

So, like art, what a supercomputer is seems to be in the eye of the beholder. However, there is a group of people who like to rank supercomputers by floating-point operations per second (FLOPS), and when a machine delivers a lot of them, a prefix is applied, as in GFLOPS. There is also a benchmark known as the Linpack Benchmark, which was introduced by Jack Dongarra in 1978 or 1979; it’s a collection of Fortran subroutines for solving various systems of linear equations.
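To make the FLOPS idea concrete, here is a minimal sketch of a Linpack-style measurement: solve a dense system of linear equations, time it, and divide a nominal operation count by the elapsed time. It is not the Linpack code itself (that is Fortran). The function names `solve_dense` and `linpack_style_flops` are made up for illustration, and the operation count 2/3·n³ + 2·n² is the figure conventionally quoted for this kind of solve, assumed here rather than taken from this column.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on copies so the caller's data is left untouched.
    a = [row[:] for row in a]
    x = b[:]
    for k in range(n):
        # Partial pivoting: use the row with the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        x[k], x[p] = x[p], x[k]
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            x[i] -= m * x[k]
    # Back substitution on the upper-triangular system.
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (x[i] - s) / a[i][i]
    return x


def linpack_style_flops(n, seed=0):
    """Time one n x n solve; return (estimated FLOPS, max error in x)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    # Choose b as the row sums so the true solution is (nearly) all ones.
    b = [sum(row) for row in a]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    # Nominal operation count for a dense solve (an assumed convention).
    ops = (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2
    max_err = max(abs(xi - 1.0) for xi in x)
    return ops / elapsed, max_err


if __name__ == "__main__":
    rate, err = linpack_style_flops(100)
    print(f"about {rate / 1e6:.1f} MFLOPS, max error {err:.2e}")
```

Interpreted Python will report a rate far below what the hardware can do, so the number is only useful for seeing how the measurement is put together, not for ranking anything.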

Supercomputers come in various classes, too; there are shared memory systems, SIMD and MIMD systems, distributed memory, ccNUMA, and cluster systems. And, as you might imagine, not all of the top supercomputers are commercially available. The top five supercomputers right now, according to the Top 500 website, are shown in the table on the preceding page.

The No. 1 machine is the BlueGene/L System, a joint development of IBM and DOE’s National Nuclear Security Administration (NNSA), installed at DOE’s Lawrence Livermore National Laboratory in Livermore, CA. BlueGene/L has occupied the No. 1 position on the last three Top 500 lists. It has reached a Linpack benchmark performance of 280.6 TFLOPS and is the only system ever to exceed the level of 100 TFLOPS. This system is expected to remain No. 1 for the next few editions of the Top 500 list.

However, as mentioned, to get a Linpack measurement you have to run Fortran code, and GPUs by themselves can’t run Fortran. They can, though, be hooked up to a general-purpose processor that can. The Cell, with its Power CPU front end, could run Fortran. The PS3 has been announced as having a theoretical 2.18 TFLOPS, while an Nvidia 7800 GTX 512 is capable of around 200 GFLOPS, and ATI’s X1900 architecture has a claimed performance of 554 GFLOPS. And given that the GF8800 is at least 2X the 7800, we can assume it will come in around half a TFLOPS. Since it can’t be measured with Linpack, it has to be calculated.
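For the curious, that calculation is usually just multiplication: number of floating-point units, times clock rate, times floating-point operations each unit can issue per clock. The sketch below applies it to the G80. The 128 units is the figure this column gives for the G80; the 1.35 GHz shader clock and the per-clock issue rates (2 for a multiply-add, 3 if an extra multiply is counted) are assumptions that are not from this column, and `peak_gflops` is a made-up helper name.

```python
def peak_gflops(units, clock_ghz, flops_per_unit_per_cycle):
    """Theoretical peak in GFLOPS: units x clock (GHz) x flops per unit per cycle."""
    return units * clock_ghz * flops_per_unit_per_cycle


# 128 floating-point processors is the G80 figure quoted in this column.
G80_UNITS = 128
# ASSUMPTION: shader clock in GHz, not stated in the column.
ASSUMED_CLOCK_GHZ = 1.35

# ASSUMPTION: one multiply-add per clock counts as 2 floating-point operations.
mad_only = peak_gflops(G80_UNITS, ASSUMED_CLOCK_GHZ, 2)
# ASSUMPTION: counting an extra multiply per clock gives 3 operations.
mad_plus_mul = peak_gflops(G80_UNITS, ASSUMED_CLOCK_GHZ, 3)

if __name__ == "__main__":
    print(f"multiply-add only:     {mad_only:.1f} GFLOPS")
    print(f"multiply-add plus mul: {mad_plus_mul:.1f} GFLOPS")
```

Under those assumptions the result is roughly 346 or 518 GFLOPS depending on how the operations are counted, which is how a part can land "around half a TFLOPS" on paper. It is a peak figure and says nothing about what a real Linpack run would sustain.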

ATI with PeakStream, and Nvidia with its CUDA, are going to push GPUs into the scientific computing space (see CUDA article, beginning on p. 1 of this issue). And there will be a lot of marketing spin associated with it.

One of my peeves is the nomenclature of GP-GPU: General-Purpose Computing on Graphics Processing Units. These are not general-purpose processors. These are SP-GPUs, Specific-Purpose Computing on Graphics Processing Units, and very specific at that. But that’s a marketing battle I won’t win, so why bother? One reason to bother is that I find I have to explain to the press, investors, and even industry companies that a GPU will not run x86 code. And the same is true for the vaunted Cell (although it’s been suggested that with recompiling it could).

Nonetheless, the marketing spin is not the issue. What’s cool is that the 128 floating-point processors in the G80, and probably a like number in the forthcoming R600, can, and will, be used for things other than just polishing pixels. But you have to keep your perspective. From a hardware point of view they are very cost-effective per FLOPS, but they are not going to be found in all new supercomputers, and, as pointed out, they are not supercomputers unto themselves. They are co-processors, just like the Cray Computer SIMDs that are hung on the side of the AMD processors in the Red Storm supercomputer at Sandia.

But I like the bragging rights. “Yeah, got a couple of G80s in this puppy—supercomputers on a chip, y’know. Yep, yep, that’s what I use to play games, a gol’danged supercomputer.”