Since the launch of the Core 2 Duo line of processors in mid-2006, new workstations have been more about evolution than revolution, with solid but uninspiring incremental performance gains. That is no longer the case. Sporting a completely redesigned case and Intel’s new Nehalem processor, the new Z800 knocks the socks off HP’s existing workstation line, especially for video editors and streaming producers. With Hyper-Threading Technology enabled, the HP Z800’s dual-processor, quad-core Nehalem configuration presents 16 logical cores (eight physical cores, each running two threads). Rhozet Carbon Coder got them all working, too.
Is “knocks the socks off” a bit too vague for you? Try this: In a real-world trial, I compressed a 90-minute HD-recorded ballet performance to DVD format. The previous performance champ in my office, a 3.3GHz dual-processor, quad-core system (64-bit Windows Vista, 16GB of RAM), finished in two hours and two minutes. The Z800, equipped with two 3.2GHz quad-core Nehalem processors (64-bit Vista, 18GB of RAM), finished in 59 minutes and 30 seconds. That is less than half the time, a reduction of about 51 percent.
Before you jump over to hp.com to place your order, you should know that your mileage will definitely vary, not only by application but even by project type within the same application. To understand why, we need to take a quick look at Nehalem.
Chances are that you don’t lie awake at night thinking about how to make faster CPUs, but if you did, you would probably figure out that there are two main approaches: You can make the CPU itself faster, which is a rising tide that lifts all boats in the harbor. Or you can improve the data flow to the processor, which primarily speeds applications that were previously bottlenecked because the CPU sat idle while waiting for data to arrive for processing.
For example, a 100K data set in a 3D design program such as Autodesk 3ds Max can necessitate hours of rendering time. However, the problem there isn’t getting the data to the CPU, so enhancing the bandwidth to and from the CPU will produce only a marginal benefit. On the other hand, in a video-editing application that throws gigabytes per second at the CPU, improving data throughput can deliver profound speed improvements.
Where did Intel focus with Nehalem? Primarily on the latter category, for which the new chip brings several innovations to the Intel processor line, most notably an integrated memory controller (IMC) and QuickPath Interconnect (QPI). If you follow CPUs closely, you’ll remember that AMD debuted an IMC several generations ago. Intel has improved on AMD’s two-channel design with a triple-channel memory controller that should deliver significantly higher bandwidth to and from the CPU than AMD’s design does. The other major bandwidth-related innovation, QPI, is a much faster replacement for the front-side bus. It transmits data directly from CPU to CPU and from each CPU to the chipset that connects the CPU with other system components.
With regard to making the CPU itself faster, Intel did bring one familiar technology back with Nehalem: Hyper-Threading Technology (HTT), which adds components of a second processor to each Nehalem core so that each core appears to the operating system as two logical processors. When you open the Performance tab of Windows Task Manager, you’ll see 16 CPUs running (eight physical cores, each shown twice), which is an impressive sight.
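To make the 16-CPU figure concrete, here is a minimal Python sketch of the arithmetic, along with a way to check what your own operating system reports. The function names are hypothetical, chosen for illustration; `os.cpu_count()` is the standard-library call that returns the number of logical processors the OS exposes.

```python
import os


def expected_logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Logical processors = sockets x cores per socket x hardware threads per core."""
    return sockets * cores_per_socket * threads_per_core


def logical_cpu_count() -> int:
    """Logical processors reported by the OS (falls back to 1 if undetermined)."""
    return os.cpu_count() or 1


if __name__ == "__main__":
    # Dual-processor, quad-core system as described in the review.
    print("HTT enabled :", expected_logical_cpus(2, 4, 2))
    print("HTT disabled:", expected_logical_cpus(2, 4, 1))
    print("This machine:", logical_cpu_count())
```

On the Z800 as reviewed, the first two lines correspond to the 16 and eight CPUs that Task Manager would show with HTT on and off.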
While more sounds better, HTT helps only when applications are efficiently multithreaded. When it doesn’t help, it usually hurts. For example, in the ballet benchmark, the Z800 finished in 59 minutes and 30 seconds with HTT enabled. With HTT disabled, encoding time rose to 71 minutes. That’s 19 percent faster with HTT enabled. On the other hand, the Z800 ran through my 48-file encoding benchmark with Rhozet Carbon Coder in 36 minutes and 22 seconds with HTT enabled and in 26 minutes and 53 seconds with HTT disabled. That’s 26 percent slower with HTT enabled.
Why the difference? Because it’s harder for the operating system to manage 16 logical cores than eight physical ones. And when processor utilization is poor, the extra overhead degrades performance. Though Rhozet is a fantastically multithreaded encoder, apparently the VC-1 codec isn’t (at least when it’s working within Carbon Coder), and simultaneously encoding 16 files to VC-1 produced an overall CPU utilization of around 18 percent. This jumped to 50 percent with HTT disabled, reducing encoding time by 33 percent.
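The HTT percentages quoted for the ballet and 48-file Carbon Coder benchmarks compare speed (work per unit time), not elapsed time, and the two give different numbers. The short Python sketch below reproduces both from the published timings; the helper names are hypothetical and the only inputs are the run times stated in this review.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes:seconds run time to seconds."""
    return minutes * 60 + seconds


def speed_change_pct(baseline_s: float, new_s: float) -> float:
    """Percent change in speed relative to baseline (positive means faster)."""
    return (baseline_s / new_s - 1) * 100


def time_change_pct(baseline_s: float, new_s: float) -> float:
    """Percent change in elapsed time relative to baseline (negative means shorter)."""
    return (new_s / baseline_s - 1) * 100


# Baseline is HTT disabled; "new" is HTT enabled.
ballet_off, ballet_on = to_seconds(71), to_seconds(59, 30)
coder_off, coder_on = to_seconds(26, 53), to_seconds(36, 22)

for name, off, on in [("Ballet", ballet_off, ballet_on),
                      ("Carbon Coder", coder_off, coder_on)]:
    print(f"{name}: speed {speed_change_pct(off, on):+.0f}%, "
          f"time {time_change_pct(off, on):+.0f}%")
```

Measured as speed, HTT comes out about 19 percent faster on the ballet encode and about 26 percent slower on the Carbon Coder batch, matching the figures above. Measured as elapsed time, the same runs are about 16 percent shorter and about 35 percent longer, respectively.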
Another enhancement to the CPU’s pure processing capabilities is Turbo Boost, which overclocks the processor when it’s operating below power, current, and temperature specifications and can increase performance by about 10 percent. All Nehalem-based systems will benefit from Intel’s Turbo Boost feature, and HP has added a further level of turbo boost to its own line of Nehalem-based workstations.