Reviewing the HP Z400 workstation and Fermi-generation Nvidia Quadro 5000

Posted by Alex Herrera on August 4th 2010
Categories: Hardware Review

Our first look at 6-core Westmere (Gulftown) and professional-grade Fermi

Figure 1: Our Z400 cuts build costs but retains the line’s industrial look and feel (Source: Jon Peddie Research)

Two major advancements in workstation platform technology have appeared over recent months, and JPR has had the good fortune to review both, and at the same time no less. Intel’s 32 nm Westmere processor generation made its first splash in early Q1, appearing as both mobile Arrandale and desktop Clarkdale, two dual-core CPUs with in-package 45 nm graphics controllers. The two were the first processors to market to encapsulate both CPU and GPU in a single package (though not a single die).

A more recent member of the Westmere family is Gulftown, a part which passes on the integrated GPU in favor of not two but four more CPU cores.

We got a Gulftown chip a few months ago and we’ve been using it as our test platform for benchmarking enthusiast graphics AIBs.

With a total of six CPU cores, it’s Gulftown (and its dual-socket sibling Westmere-EP) that offers the real appeal of Westmere for workstations. The extra CPU cores pay off, while Westmere’s integrated GPU generally wouldn’t cut it for workstation applications anyway, supplanted instead by a discrete GPU.

But perhaps even more anticipated for buyers of professional-caliber hardware is Fermi, Nvidia’s latest generation of GPU technology. Shipping first under the GeForce banner (GTX 480 and 460), Fermi is finally out of the bag for professionals, with Nvidia officially releasing the first Quadro-family Fermi GPU products at Siggraph: the ultra-high-end Quadro 5000 and Quadro 6000, along with the high-end Quadro 4000.

The specs on our review model

The Z400 HP loaned us came with the fastest single-socket Gulftown Intel makes, the 3.33 GHz W3680, with 12 MB total cache. The Z400 offers up 6 DIMM slots, allowing a maximum of 24 GB of DDR3 memory (up to 1333 MHz), though we were quite content with our 12 GB. Add them up, and our test rig came fairly maxed out, especially when considering the Quadro 5000 installed. All told, our system would list for $3,897, and that’s without counting the $2,249 list-price Quadro 5000. So we’re really not talking about an entry-level machine here, but rather a mid-range workstation.
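For a quick sense of the total outlay, the two list prices quoted above can simply be added together. The snippet below is a minimal sketch of that arithmetic; the variable names are ours, and the figures are the list prices given in this review.

```python
# List prices quoted in this review, in US dollars.
system_list_price = 3897       # Z400 as configured, without the graphics card
quadro_5000_list_price = 2249  # Quadro 5000 list price

# Combined list price of the workstation plus the graphics card.
total = system_list_price + quadro_5000_list_price
print(f"Combined list price: ${total:,}")  # prints "Combined list price: $6,146"
```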

Table 1: Configuration specifications for our HP Z400 review machine (Source: Jon Peddie Research)

Processor: Intel Xeon W3680, 6-core @ 3.33 GHz
Chipset: Intel 5520 (“Tylersburg”)
Memory: 12 GB of 1333 MHz DDR3
Storage: 2 x 500 GB (7200 rpm SATA)
Graphics: Quadro 5000 with 2.5 GB GDDR5
Operating system: Windows 7
Price: starting at $999

Figure 2: The system topology for our specific Z400 configuration (Source: HP)

Nicely designed, but don’t compare to the Z800

Just as we noted in our review of the lower-end Z200 (see the HP chapter), HP couldn’t justify the same attention to detail on the Z400 as it did on the top-end Z800. The Z400’s much more aggressive price point simply won’t allow it. However, that doesn’t mean HP threw in the towel on interior and exterior design. For example, the Z400’s exterior takes a step up from the Z200 with stylish cold-rolled steel sides and top.

And like the Z200, the Z400 doesn’t include the premium chassis components of the Z800. But HP again shows it can find ways to add engineering value without having to add much to the bill of materials. The black cowl in the top-center of the chassis is one example. An inexpensive piece of plastic, the cowl serves two purposes: it neatly holds cables out of the way while also helping channel air flow effectively through the chassis. And like all its Z siblings, the Z400 offers effective green-coded tabs for tool-less access to card slots and drive bays.

Figure 3: Tidier internals lead to better airflow (Source: Jon Peddie Research)

Like any good workstation, the box is not at a loss for I/O, both front and back. The Z400’s front panel provides access to three external bays (including a Blu-ray writer option), audio, two USB ports and optional IEEE 1394 and 22-in-1 Media Card Reader (the latter two were not included in our Z400 model). The backside offers up more audio, another six USB ports, another optional IEEE 1394 port and Gigabit Ethernet, as well as legacy PS/2 ports.

Figure 4: A look at the Z400’s backside I/O (Source: Jon Peddie Research)

The back also provides access to six slots, including two full x16 PCI Express slots, allowing for installation of two single-slot graphics cards. Or in our case, it allows for one high-performance dual-slot card: the Quadro 5000.

Benchmarking the Z400 and Quadro 5000

To assess performance, we employed the same basic tools we have in the past: SPEC Viewperf to stress the graphic subsystem, in this case the Quadro 5000, and SPECapc tests to get a handle on whole-system performance. This time around, we chose SPECapc Maya and SPECapc Lightwave, representing two popular applications used in digital content creation.

Benchmarking the Quadro 5000 with (for the first time) Viewperf 11

First up was Viewperf 11. It’s the first time we’ve used version 11, the most recent revision of the long-time graphics benchmark. And it was a pleasure. Gone is the multitude of confusing options on running the benchmark and in their place is a simple dialog box that allows you to select resolution and number of iterations. Simple. Thank you, SPEC.

The Quadro 5000 is based on Fermi, Nvidia’s ambitious new architecture covered in detail in the pages of JPR’s TechWatch and Workstation Report. But the 5000 doesn’t include the maximum number of processing cores the architecture touts. Where Fermi (for now) maxes out at a theoretical 512 cores, the Quadro 6000 exploits the most, tapping 448 (similar to the GeForce GTX 470). Nvidia has likely chosen a reduced number to maximize yield and reliability for workstation applications.

Figure 5: An array of (up to) 512 CUDA cores forms the current foundation of Fermi (Source: Nvidia)

With its 352 processing cores and 2.5 GB memory, the Quadro 5000 isn’t intended to be Nvidia’s top performer for the Fermi generation, leaving that role for the 6000. And that raises the question: why did we benchmark the 5000 and not the big gun, the 6000? Well, because at initial launch, only the 4000 and 5000 are immediately available, with the 6000 expected to follow in September. So for the first two months at least, the 5000 is the biggest, baddest Quadro on the block.
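To put the core counts in perspective, the sketch below works out what fraction of Fermi’s theoretical 512 cores each Quadro model enables. The counts are the ones quoted in this review; the function and variable names are our own, chosen purely for illustration.

```python
# Theoretical maximum core count cited for the Fermi architecture.
FERMI_MAX_CORES = 512

# Enabled core counts as quoted in this review.
cores = {"Quadro 5000": 352, "Quadro 6000": 448}


def enabled_fraction(n, maximum=FERMI_MAX_CORES):
    """Return the share of the architecture's maximum cores that are enabled."""
    return n / maximum


for name, n in cores.items():
    print(f"{name}: {n}/{FERMI_MAX_CORES} cores = {enabled_fraction(n):.1%}")
```

By this arithmetic, the 5000 enables a little over two-thirds of the full array, and the 6000 seven-eighths of it.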

Figure 6: Nvidia’s Fermi-based Quadro 5000 (Source: Nvidia)

But make no mistake, at a $2,249 MSRP, the Quadro 5000 is definitely supposed to be a performer, taking the place of the previous-generation Quadro FX 4800. And how well does the new Quadro 5000 fill the old 4800’s shoes? Quite well, it appears, based on the results of our Viewperf 11 benchmarking. The new model outperforms the old (roughly, based on a comparison to SPEC-submitted Viewperf 11 results for the FX 4800) for the majority of the viewsets by anywhere from 80% to 200%. 

Interestingly, in the minority of viewsets where the 5000 didn’t achieve an 80% (or better) speedup, we actually saw performance stay flat, or even decline slightly. We’d imagine the decline was an anomaly, attributable to one of the following: a still-not-completely-tuned driver not taking advantage of all Fermi can do, viewsets that are throttled by I/O, or the possibility that the graphics hardware is so fast that the rest of the system can’t issue OpenGL calls fast enough to keep it busy, making the system itself the bottleneck. The results do tend to illustrate that Fermi has the potential for big speed gains, provided the rest of the system can keep up. And that makes an appropriate segue to SPECapc.
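For readers who want to run the same kind of comparison on their own numbers, here is a minimal sketch of how a percentage improvement between two Viewperf scores can be computed and bucketed. It assumes the percentages are gains over the old score (so 200% would mean three times the old frame rate). The scores, the viewset names and the 5% "flat" band are hypothetical placeholders, not results from our testing.

```python
def percent_improvement(new_score, old_score):
    """Return the gain of new_score over old_score, in percent."""
    if old_score <= 0:
        raise ValueError("old_score must be positive")
    return (new_score - old_score) / old_score * 100.0


def classify(gain_pct, flat_band=5.0):
    """Bucket a gain; flat_band (in percent) is an arbitrary illustrative threshold."""
    if gain_pct >= 80.0:
        return "large gain"
    if gain_pct > flat_band:
        return "modest gain"
    if gain_pct >= -flat_band:
        return "flat"
    return "decline"


# Hypothetical (old, new) frame rates, NOT measured results from this review.
examples = {
    "viewset A": (10.0, 30.0),
    "viewset B": (10.0, 19.0),
    "viewset C": (10.0, 10.2),
    "viewset D": (10.0, 9.0),
}

for name, (old, new) in examples.items():
    gain = percent_improvement(new, old)
    print(f"{name}: {gain:+.0f}% ({classify(gain)})")
```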

To evaluate overall system performance, we had the opportunity to exercise two SPECapc tests, for Maya and Lightwave, two popular applications with digital content creators. With SPECapc, neither the graphics nor the rest of the system is solely responsible for scores. Rather, it’s the sum of the parts that is being tested, to give an idea of how the system might perform in a real-world environment.

The results for the Z400+Quadro 5000 were solid for both benchmarks, again beating numbers from similar (but not exactly the same) systems outfitted with the previous-generation Quadro FX 4800. The margin by which our Quadro 5000 system exceeded FX 4800 systems for SPECapc tests varied, but, as one would expect, by margins far more modest than for Viewperf.

Figure 7: Viewperf 11 benchmark results for the Quadro 5000: 80 – 200% improvements in many – but not all – cases (Source: Jon Peddie Research)

Figure 8: SPECapc for Maya 2009 results: HP Z400 + Quadro 5000 (Source: Jon Peddie Research)

Figure 9: SPECapc for Lightwave results: HP Z400 + Quadro 5000 (Source: Jon Peddie Research)

What do we think?

HP does little wrong when it comes to workstations. The company has over the past several years marshaled its forces and made the market a strategic battleground. It spends a lot of time and money researching the markets, applications, technologies and usage models, and that commitment has paid off, as the company has over that time come from a distant second to now virtually sharing the market with Dell. The Z400 is yet another example of a well-thought-out machine that delivers solid performance at a very appealing price point.

By a similar token, Nvidia’s Fermi has delivered a solid performance boost for conventional 3D graphics. The company touts 8X the geometry performance of the previous generation, a speedup that manifests itself in Viewperf testing. But especially in the context of competitive offerings from rival ATI, new Quadro products built on Fermi are ultimately going to win business based on more than just graphics performance. With a holistic architecture delivering unique features like ECC and 4X double-precision performance, and for the first time enabling a C++ programming environment for high-performance computing, Quadro can tackle a range of computational tasks well beyond rendering. And it’s precisely that proposition that should prove most appealing to prospective professional buyers: buy a high-performance graphics card and get supercomputer-caliber speedups for the many other challenging compute bottlenecks getting in the way of the day’s productivity.
