
One more feel of the elephant


Robert Dow

With all the rumors and information coming out of Taiwan and IDF Shanghai about Intel’s plans, I decided it was time to wipe the palm prints off the crystal ball and have another go at the 5 megaton elephant in the room. That elephant is, of course, Intel’s Larrabee processor.

Now, some think it’s going to be a GPU killer, and Pat Gelsinger didn’t help dispel the idea when he said it would be on an add-in board.

All that got me thinking; that, and the fact that AMD and Nvidia have a hybrid solution and Intel doesn’t (yet). And, as some of you too, too painfully know, WDDM (the Windows Display Driver Model) only allows one GPU driver, which prohibits heterogeneous hybrid operation. AMD and Nvidia can each use a unified driver and enable an IGP and a GPU; Intel can’t.

Figure 1: Possible Nehalem/Larrabee system organization. (Source: Jon Peddie Research)

With Fusion, AMD can still do hybrid, but Nvidia is cut out of AMD space for hybrid operation in that model.

With Nehalem, Intel can’t do hybrid with someone else’s external GPU if Intel puts their GPU in the CPU package. Will Larrabee help? We think yes.

In fact, we think we’ve finally divined the architectural organization of Larrabee as illustrated in the diagram.

This organizational approach would solve the WDDM single-driver issue and allow Intel to offer a hybrid solution to compete with AMD’s and Nvidia’s offerings. It also removes the necessity of making different types of Larrabees, and it could happen with Larrabee having just CSI links and using a PCIe-to-CSI bridge on an AIB version, a win-win for Intel.

This configuration is the best model (that I can think of right now) for Intel to offer a competitive challenge to AMD and Nvidia. Larrabee will be able to run C-coded shader programs and feed the results to the IGP (for display). With CSI and PCIe 2.0, bandwidth and latency shouldn’t be too big a problem. Polygonal models and transformations can be done in the CPU and IGP, letting the combo chip potentially work within DirectX and OpenGL environments and deliver shiny objects. Oh boy.
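
To make the “C-coded shader” idea a bit more concrete, here is a minimal sketch, entirely my own illustration and not anything Intel has shown, of the kind of plain-C per-pixel routine a swarm of x86 cores could run over framebuffer tiles before the finished pixels are handed to the IGP for display. The Pixel layout, the tiling, and the fog_shader name are all assumptions.

#include <stddef.h>

/* Hypothetical data layout; nothing here is disclosed Intel code. */
typedef struct { float r, g, b; } Pixel;

/* A plain-C "shader": blend each pixel in a tile toward a fog color.
 * On a many-core x86 part, each core could run this over its own tile,
 * with the finished buffer then handed to the IGP for scan-out. */
void fog_shader(Pixel *tile, size_t count, Pixel fog, float amount)
{
    for (size_t i = 0; i < count; i++) {
        tile[i].r += (fog.r - tile[i].r) * amount;
        tile[i].g += (fog.g - tile[i].g) * amount;
        tile[i].b += (fog.b - tile[i].b) * amount;
    }
}

The point of the sketch is simply that the “shader” is ordinary C, compiled for x86, with no graphics API in the way; the IGP’s only job in this model is to display the result.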

Larrabee: who needs it?

To quote Mr. Paul Otellini, CEO of Intel, “…among the applications for Larrabee, one of them is high-end graphics.” He also said, “Graphics will also be an area for the chip.” And aside from one flub, he and the company have avoided saying it is a discrete GPU. In other conversations, Intel folks have made the point quite clearly that they (AMD and Nvidia) have their approach [a GPU architecture] and we [Intel] have ours [a swarm of x86 processors], further suggesting that it is not a discrete graphics chip but a general-purpose processor array, a multi-core if you will (see the story on the Multi-core Conference in this issue).

Figure 2: Intel, the 5MT elephant in the room, prepares to blast AMD and Nvidia. (Source: Rocketworld)

One of the showcase applications for Larrabee will be ray-tracing, and we got a strong hint about that at the last IDF when Daniel Pohl showed off his port of the Quake IV engine to Intel’s real-time ray-tracer that was running on 16 processors. And, popular speculation is that Larrabee will have 16 processors—not a coincidence.

However, Intel is not building Larrabee just to run games. Ray-tracing is a big deal in Hollywood and TV commercials, as well as in design studios for everything from airplanes to cars to xylophones. And ray-tracing is a nicely scalable problem that just gets better as more processors are added (due to the combinatorial nature of rays being generated and influenced by the environment). And, although I’m fond of saying that in computer graphics too much is not enough, there are asymptotes of practicality if not actual limits, i.e., good enough. In ray-tracing, with modern efficient programs (like mental ray from Nvidia’s Mental Images), you can get excellent results in a reasonable time with eight processors and faster results with 16. That being the case, with eight-processor multi-core CPUs in sight and 16 on the horizon, why use a dedicated coprocessor, with its inherent API issues, when you can use one or two CPUs? This will be one of the challenges Intel faces as it tries to promote the notion of visual computing.
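
For a sense of why ray-tracing scales so gracefully, here is a toy C/OpenMP sketch; the Scene type and trace_ray() are placeholders of my own, not anyone’s actual renderer. The structure is the point: every primary ray is independent, so the work divides cleanly across eight, 16, or more cores.

#include <omp.h>   /* compile with -fopenmp; only the pragma below is used */

typedef struct Scene Scene;                      /* opaque placeholder */
typedef struct { float r, g, b; } Color;

/* trace_ray() stands in for whatever the renderer does per pixel. */
extern Color trace_ray(const Scene *scene, int x, int y);

void render(const Scene *scene, Color *image, int width, int height)
{
    /* Each row of primary rays is independent, so the loop parallelizes
     * trivially; more cores means proportionally less wall-clock time,
     * at least until memory bandwidth becomes the limit. */
    #pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            image[y * width + x] = trace_ray(scene, x, y);
}

That near-linear scaling is exactly why a general-purpose CPU with enough cores starts to look like a credible substitute for a dedicated coprocessor.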

FYI, the first usage of the term visual computing was in 1995, when SGI’s CEO Ed McCracken received the National Medal of Technology from President Bill Clinton “for his groundbreaking work in the areas of affordable 3D visual computing.” More recently, SGI used the term in 2004 when it introduced the Linux-based Prism line, and then Nvidia used it in 2006 when it introduced the Quadro Plex 1000. But Intel has formed a division with the term and adopted it as its differentiator, so you can expect the world to soon assume that Intel invented it. You can do stuff like that when you’re a 5 MT elephant.

What If?