Quad Core Processing: Over-simplified, demystified, and explained

Dual core processing became a mass-market reality in 2005. AMD’s Opteron (x65, x70 and x75) and Intel’s Pentium D (8xx series) arrived that spring, and IBM’s PowerPC 970MP brought dual core to the Power Mac G5 that autumn. This time, about a year and a half later, Intel is poised to strike first, preparing to release the Kentsfield, the first quad core desktop processor. We’re going to explain what quad core is, how it’s being implemented by the major chip manufacturers, what benefits it’ll bring, and, most importantly, why we need it.

What is quad core?

A processor is made up of three parts. The first is the core, which is responsible for actually executing processing tasks. The core sits on the die, which is the silver tab you may have seen on the top of a processor. All of this is fitted inside the package, which is the green material capped by the heatspreader on top and laced with pins on the bottom. It is important to know that the number of cores can scale independently of the number of dies.

As an example, Athlon XPs had one die and one core. The first dual core chips had two dies with one core in each (see diagram). Today’s dual core chips have two processing cores on one die. The quad core chips that will first replace them either merge two dual core dies into a two-die, four-core package or place four cores on one die. It is both an evolution and a continuation of a trend.

Here we can see the progress: One larger die is shrunk and placed with a twin on the same package. Then a single-die/dual core die is produced, and the cycle repeats ad infinitum with more cores per die.

It is important, though, to know that quad core is a result. Particularly in computer engineering, the goal does not necessarily dictate the means. For example, NVIDIA and AMD/ATi both render DirectX 9 with their products, but they both approach it from different routes. Quad core is the same way; there are two primary methods by which to achieve the goal: the inelegant way and the native way.

How are Quad Core chips made?

The first implementation, the inelegant one, comes from placing two processor dies in the same space that one used to occupy. This procedure is made possible by improving the manufacturing process of the dies that are being placed together on the single processor. The advantage for the manufacturer is three-fold.

  1. They’re able to crow about being first to the market with the “wave of the future” processor.
  2. It furthers a chicken-and-the-egg philosophy (we’ll touch on this again later) by both creating a demand that did not previously exist and establishing a new direction for processor-intensive content.
  3. It allows manufacturers to produce multi-core chips inexpensively. Throwing away one dud die with two processing engines is better than throwing away one dud die with all four processing engines on the same slice of silicon.

The main disadvantage of this technique is that the heat output roughly doubles when two dies are stuck in the same package. For example, the Core 2 Duo chips put out about 65 watts of heat. The Kentsfield, which crams two Core 2 Duos together, puts out between 125 and 132 watts. That’s a lot! CPUs haven’t put out that much heat in a long time.

The other disadvantage, the one that matters to most of us, is the logic used to price the chips. If one Core 2 Duo costs about $400 USD, then one Kentsfield should cost about $800! The bleeding edge is an expensive place to buy in and a lucrative place to sell in.

The other way to make quad core chips, shown in the diagram above, is known as the native way. Native quad core designs incorporate four individual cores on one single processor die. The benefit for the manufacturer in this case has three parts.

  1. They are afforded the ability to state that they have the world’s first “true” quad core chip.
  2. These chips run much cooler than two die / four core solutions, since a single die will always radiate less heat than two in the same package. The manufacturer can then bundle less expensive coolers with their chips, and the savings are then passed on to us as consumers.
  3. Perhaps the most important benefit is that a native quad core processor can have all four cores talk to one another at the speed of the processor itself, rather than running communication from cores 0 and 1 out across the motherboard to talk to cores 2 and 3.

In this scenario, we see that the cores on the "native" QC chip are capable of talking directly to one another. The "inelegant" solution, while still quad core, must endure the lag of sending information across the system bus whenever one half of the chip needs to talk to the other.

How substantial the lag is with the inelegant solution is hard to say, because we’ll probably never see a native quad core design that is otherwise identical to its inelegant brother, save the number of dies. It just won’t happen; technology doesn’t progress like that. What matters in the end, however, is that it’s a quad core chip; there are four cores, they’re all real, and they all can bear a processor load.

Intel has opted to pursue the inelegant solution for now. As a result, it is almost eight months ahead of AMD in the race to go quad. Intel is also fixing to release a significantly cheaper version of the Kentsfield, the QC6600, when the novelty factor has worn off of the Extreme Edition. This means that Intel’s “dirty” and “inelegant” approach to quad core will be available to the masses at the price of today’s Core 2 Duos around March 2007. This will beat AMD by four to six months.

AMD’s approach, on the other hand, comes partly by necessity and partly by will. The mean, green underdog is going straight from native dual core to native quad core in time for the end of summer in 2007. AMD, at the time of writing, has just introduced their 65nm processors while Intel has been featuring them for a year or more. It makes sense that AMD would catch up by making their native quad core chip the crowning achievement of their move to 65nm. As it stands, AMD was just too far behind on 65nm to introduce an inelegant quad core chip without shooting their future plans for Barcelona in the foot.

How quad core works

Quad core requires the union of many elements to make it work well. The easiest stumbling block to clear at this stage is hardware support. Motherboards need the ability, often delivered through a BIOS update, to recognize and run these chips. Intel already has many boards that support all three of their last major chips and AMD is going to do the same.

The next hurdle to overcome is the operating system, which must be capable of recognizing more than one core inside a chip. Thankfully, Microsoft and virtually every Linux distribution have fully endorsed the power of multi-core chips, allowing today’s versions of Windows and Linux to support chips with two, four, eight, or even sixty-four cores.
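
To make that concrete, here is a minimal sketch of a program asking the operating system how many cores it can see. We’re using Python purely for illustration, and the helper name `visible_cores` is our own invention. Keep in mind that the OS reports logical processors, so the number may not match the physical core count on every machine.

```python
import os


def visible_cores():
    """Return the number of logical processors the OS reports (at least 1)."""
    # os.cpu_count() returns None when the count cannot be determined,
    # so fall back to 1 in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"The operating system reports {visible_cores()} logical core(s).")
```

On a quad core machine we would expect this to print four, though the exact figure depends on your hardware and operating system.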

The last step on the road to quad is the biggest linchpin to its success: software that runs on the chips.

Software is what makes or breaks multi-core chips. The reason lies in a concept called threading. Threading is how a program divides its workload across multiple cores or multiple physical processors. A thread is a stream of instructions related to a task the program is managing (like game physics, sound reproduction, or 3D rendering).
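
As a rough illustration of what a thread looks like to a programmer, the sketch below starts one thread per game subsystem. It is a toy under our own assumptions: the subsystem names come from the examples above, the function `run_subsystems` is hypothetical, and the "work" each thread does is just recording its own name.

```python
import threading


def run_subsystems(names):
    """Start one thread per subsystem and report which ones finished."""
    finished = []
    lock = threading.Lock()  # protects the shared list from simultaneous writes

    def worker(name):
        # Stand-in for real work such as physics or sound mixing.
        with lock:
            finished.append(name)

    threads = [threading.Thread(target=worker, args=(name,)) for name in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every thread is done
    return sorted(finished)


if __name__ == "__main__":
    print(run_subsystems(["physics", "sound", "rendering"]))
```

The operating system decides which core each thread actually lands on; the program only describes the separate streams of work.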

Relatively speaking, desktop programs that divide their labor, “multi-threaded applications”, are very new. In 20-odd years of mainstream desktops, only in the last three have we seen the shift towards multiple processors, multiple cores, and applications that recognize them. As the number of cores increases, it gets harder for developers to divide the labor.

The frequent problem with threading is that while applications may be performing many tasks at once, each task relies on another to make it work. If a video game is producing 3D graphics, there are many other things happening (like sound, AI, and physics), but all of those are ultimately beholden to what the 3D renderer is doing.

Consider walking into a room with water dripping from the ceiling; there are three things occurring here.

  1. The computer must render the room and the water.
  2. The computer must figure out how the drop of water changes mid-flight, and how the puddle changes when the drop impacts it.
  3. On top of all of that, the computer must produce the sound effects of the water hitting the puddle.

Notice that each task is dependent on the task before it? Without the scene, there would be no physics to do, and with no physics to do there can be no sound effect. So how does a developer split it up?
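
Here is one way to picture that dependency chain in code. This is a sketch under our own assumptions, not how any real game engine is written: the function names and the little dictionary describing the scene are made up for illustration. Even though the pool has three workers available, each stage has to wait for the result of the one before it, so only one worker is ever busy.

```python
from concurrent.futures import ThreadPoolExecutor


def render_scene():
    """Stage 1: draw the room and the falling drop."""
    return {"room": "drawn", "drop": "falling"}


def simulate_physics(scene):
    """Stage 2: needs the rendered scene before it can move the drop."""
    return {**scene, "drop": "hit puddle"}


def play_sound(state):
    """Stage 3: needs the physics result to know whether to splash."""
    return "splash" if state["drop"] == "hit puddle" else "silence"


def run_frame():
    with ThreadPoolExecutor(max_workers=3) as pool:
        # .result() blocks until the stage finishes, so the stages run
        # one after another despite the three available workers.
        scene = pool.submit(render_scene).result()
        state = pool.submit(simulate_physics, scene).result()
        return pool.submit(play_sound, state).result()


if __name__ == "__main__":
    print(run_frame())
```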

With broad threading, the developer generally divides the workload between the two cores, and tells them to stay synchronized with the voodoo that they do. This is an inelegant solution as it is really only designed to fulfill the demands of two processors, whether they are two cores on one chip or two physical processors.

Here we see the program as the red and blue lines entering the dual core processor. It's a broad-threaded multi-threaded application, so the processor workload is simply split in half. Each core takes one.

In a quad core world, this doesn’t work so well. There is more processor horsepower than broad threading can use, so all those unused clock cycles are wasted. Benchmarks around the Internet are showing the inherent weakness of broad threading as higher-clocked dual core processors surpass quad cores in performance despite a vastly greater potential under the hood of quad core machines.
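
A minimal sketch of broad threading, assuming a workload that is just a list of numbers to add up: the work is always cut into exactly two halves, no matter how many cores are present. The function names are hypothetical. In the standard CPython interpreter, threads doing pure arithmetic largely take turns rather than running truly side by side, so treat this as a picture of how the work is divided, not as a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def broad_threaded_sum(numbers):
    """Split the work into exactly two halves, one per thread."""
    middle = len(numbers) // 2
    halves = [numbers[:middle], numbers[middle:]]
    with ThreadPoolExecutor(max_workers=2) as pool:
        partial_sums = list(pool.map(sum, halves))
    return sum(partial_sums)


def idle_cores(total_cores, threads=2):
    """Cores left with nothing to do when only `threads` threads exist."""
    return max(total_cores - threads, 0)


if __name__ == "__main__":
    print(broad_threaded_sum(list(range(101))))
    print(idle_cores(4))
```

With only two threads to go around, a dual core chip is fully occupied, while half of a quad core chip sits idle.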

With fine-grained threading, the developer carefully analyzes everything the program can possibly be doing at any one time (a task some regard as the equivalent of predicting chaos) and then writes the program so that each possible task knows which part of the processor it’s going to, in order to truly maximize the horsepower of the processor. The weakness of broad-threaded apps is that cores in a processor can often go underutilized; there isn’t always a way to keep that second thread alive and kicking in a multi-threaded scenario. Fine-grained threading aims to ensure that there’s always something to do for each of the cores.

A fine-grain multi-threaded process generates hundreds of individual tasks, each one directed specifically to an underutilized core to guarantee the full horsepower of the CPU is being used.
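
One common way to approximate fine-grained threading is a pool of workers, one per core, that pulls from a long queue of small tasks. The sketch below assumes the work is a list of numbers to add up; `fine_grained_sum` and its `chunk_size` parameter are hypothetical names of our own. It is not a claim about how any particular game does it, and in the standard CPython interpreter pure arithmetic in threads will not run truly in parallel, so read it for the structure only.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fine_grained_sum(numbers, chunk_size=10):
    """Cut the work into many small chunks and let a pool hand them out."""
    chunks = [
        numbers[start:start + chunk_size]
        for start in range(0, len(numbers), chunk_size)
    ]
    workers = os.cpu_count() or 1  # one worker per logical core
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Whichever worker is free picks up the next chunk.
        return sum(pool.map(sum, chunks))


if __name__ == "__main__":
    print(fine_grained_sum(list(range(1001))))
```

Because the pool sizes itself to the machine, the same program spreads its chunks over two workers on a dual core chip and four on a quad core chip.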

Unfortunately, developers have only recently become very good at fine-grained threading for dual core, and now quad core is on its way. Imagine the complexity of the processor above doubled!

Each half of the processor would be receiving two sets of red and blue arrows with a lattice of processes, but worse still, the programmer would have to know how to code such a beast. This is both the biggest blessing and biggest curse of quad core: the potential for massive processor ability lies right in that square of silicon, but programming for it is an absolute work of art.

What benefits will Quad Core bring?

Right now, there are only a few applications that truly benefit from quad core computing. These applications are the sort that have always benefited from more processor horsepower, no matter how that was obtained: rendering 3D graphics for still scenes, compressing DVDs into portable movie files, compressing CDs into MP3s, and anything else that is just a massive chunk of mathematics. In these cases, the cores can work in tandem to crunch large sets of numbers because the workloads are predictable, not terribly dynamic, and do what a processor does best: compute.
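
Compression makes a handy example of this kind of workload, because each chunk of data can be squeezed without knowing anything about the others. The sketch below is our own illustration: `compress_tracks` is a hypothetical helper, the "tracks" are just blocks of bytes, and general-purpose zlib compression stands in for a real MP3 or video encoder. As far as we know, CPython’s zlib module releases the interpreter lock while it compresses, so these threads can genuinely occupy several cores, but treat that as an assumption to verify on your own setup.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_tracks(tracks, workers=4):
    """Compress each independent block of bytes on its own worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as the inputs.
        return list(pool.map(zlib.compress, tracks))


if __name__ == "__main__":
    tracks = [bytes([i]) * 10000 for i in range(8)]  # eight dummy "tracks"
    packed = compress_tracks(tracks)
    print([len(p) for p in packed])
```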

On the other hand, games (right now) will receive little to no benefit from a quad core processor because they only recently began to really harness the power of dual core offerings. The benefit of quad core is not what it does now, but what it will do.

When AMD and Intel released dual core processors, everyone wondered why we needed that much processing power. Games often failed to fulfill the maximum load of a single processor and game developers complained that it would be difficult for them to split what was happening in a game across multiple processors. We see today, though, that the industry has risen to meet the challenge. Developers and companies that once squabbled over the advancements that dual core would bring are now on the dual core bandwagon with powerhouse top-ten titles. We are all reaping the benefits of a technology that ushered in the change. Like the old chicken and the egg scenario we mentioned earlier, we needed the software to drive the preeminence of dual core, and we needed dual core chips to drive the software.

We have no doubt today that the quad core scenario is entirely similar. Perhaps one day we’ll reach the point where the benefit does not exceed the cost of adding additional cores, but looking at titles like Alan Wake and Valve Software’s experiments with fine-grained threading, we don’t think that the quad core chip’s cost exceeds its benefit. We think there is a bright future for quad core, and that the value of the chip should not be measured in what it does today, but what it will do for us in the middle of 2007.

There are four primary tasks that every game must perform, and the processor must assist with each of them.

  1. 3D rendering
  2. Physics
  3. Sound
  4. Computer AI

Right now, assisting the graphics card’s processor with 3D rendering and physics can take up more than one core each. Sound and AI can fill up the rest, pushing both of your cores to the limit. Imagine, though, if you could dedicate an independent core to physics, one to assisting the graphics card, a third core to physics/graphics spillover, and the last one to sound and AI. Developers would receive an unprecedented increase in versatility because, at last, a whole core could in effect be dedicated to each process that used to be ram-rodded into one. It’s like moving 40 gallons of water through four hoses at once, rather than waiting for all 40 to filter down the same hose. Clearly the four hoses are going to move it faster!
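
The hose analogy boils down to simple arithmetic, sketched here with a hypothetical `drain_time` helper and an assumed flow of one gallon per minute per hose. It is a best-case figure: it assumes the water (the work) divides perfectly evenly, which the dependencies between game tasks will not always allow.

```python
def drain_time(gallons, hoses, gallons_per_minute=1.0):
    """Minutes needed to move `gallons` through `hoses` identical hoses."""
    return gallons / (hoses * gallons_per_minute)


if __name__ == "__main__":
    print(drain_time(40, 1))  # one hose
    print(drain_time(40, 4))  # four hoses
```

Under those assumptions, one hose needs 40 minutes and four hoses need 10.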

Why do I need quad core?

Quad core is the future, and it’s a future that’s coming faster than the dual core one did. While few developers reacted quickly to the introduction of dual core CPUs (perhaps they were unready for such a drastic change), developers have begun acting on quad core even before there was a chip to test it on. Valve Software (of Half-Life 2 fame) and Remedy Entertainment (of Alan Wake fame) both began optimizing code for quad core before the very first Kentsfield had even finished production. This time, developers have had fair warning, and the good sense to read the writing on the wall.

If you are the owner of a socket AM2 motherboard or an LGA775 motherboard that’s Core 2 Duo compatible, you’re set for the introduction of quad core. With the QC6600 from Intel and the Barcelona chips from AMD, you’ll be able to drop a brand new chip in your motherboard and double your processing capacity.

We need quad core to drive the future and to give the future a platform to come to. While it may mean you have some idle cores on your processor for the time being, you’ll have more idle strength in your processor than anyone with an average desktop computer has ever enjoyed. Quad is the power of tomorrow.
