
My dual core can beat up your quad core, and other CPU mysteries

The phone industry is hurtling towards a destiny positively brimming with sugarplum quad cores. Starting with NVIDIA Tegra 3, all and sundry are looking to a quartet of cores as the final step in the evolutionary ladder for this era of hardware. As with a great many things in this crazy market, however, the end of an era heralds something better and cheaper than what you purchased 2.6 days ago. Follow along as we explore the near future and highlight the one mobile technology that any self-respecting smartphone or tablet nerd should be waiting for. But first…

The Back Story

You may have heard of x86, but if you’ve not heard the word, the story goes thusly: Some 35 years ago, the masterminds at Intel Corporation had the rather brilliant idea that all microprocessors should respect the same rules. These were fundamental principles outlining the basics. For example, every car needs wheels; every car needs seatbelts; and every car needs headlights. Automobiles are built to federal regulations, but these regulations do not stipulate how the wheels, seatbelts and headlights appear—only that they work, and that they respect a few elementary requirements. Beyond this point, the car designer is free to realize her artistic vision. So it goes with the CPUs in our story.

This all occurred in 1978 with the release of the Intel 8086 processor, which contained the very first iteration of these basic rules. By now I must tell you that these rules are called an “Instruction Set Architecture,” or ISA. An ISA defines how a processor is expected to behave. Or how a car qualifies as road-worthy, despite the fact that some have 120 horsepower hybrid engines, and others have 500 horsepower V8s.

In the ensuing years, Intel released many CPUs that you probably know well: the 486, the Pentium and the Core i7. What they all have in common is that little kernel conceived in 1978: the 8086, or x86, ISA.

Yes, it’s true: after nearly four decades of extensions and revisions to the rules of the x86 ISA, every PC processor Intel has ever sold ultimately respects the ground rules written 35 years ago. And as companies like AMD, Cyrix and VIA came on to the scene, they licensed the x86 ISA for themselves and began to execute their artistic vision of the ideal x86 processor. In fact, every PC you’ve ever touched probably has an x86-based CPU. Even Apple Macintosh computers now use x86, after more than 20 years of utilizing competing and incompatible ISAs: first Motorola’s 680x0 series, and then PowerPC (PPC) from 1994 onward.

Now, all of this is not to say that the x86 of today is the x86 of 1978. Quite the contrary; we all know that transportation regulations get updated to keep up with the times, and so it has gone with x86. After nearly four decades, x86 has been iterated, extended and refined to become a monolithic set of rules that has enabled truly brilliant CPU architects to build truly brilliant CPU designs. Having a predictable set of rules also gave operating system developers the ability to write software (say, Windows or Linux) that would work on any x86-based PC. In turn, application developers were handed a predictable playbook: standardized operating systems running on standardized hardware.

Surprisingly, however, Intel is not the subject of our story. Our subject is actually a little British company known as Advanced RISC Machines. Incorporated in 1990 as a joint venture between Acorn Computers, Apple Computer and VLSI Technology, Advanced RISC Machines (ARM) recognized the brilliance of Intel’s x86 strategy and pursued it in a different market.

Building a better toaster

While Intel was off chasing increasingly large and more powerful systems with the x86 ISA, ARM acquired an impressive string of companies to design processors that would instead power increasingly compact and efficient devices. The kind of processors that would (literally) power your toaster, your television, your mobile phone and your tablet. Those rules that ARM designed in their quest are unimaginatively known as the “ARM ISA”.

Aside from pursuing different markets, the x86 and ARM industries have a very different view on how a CPU should be produced. Whereas all of today’s x86 manufacturers—chiefly AMD and Intel—ostensibly design and manufacture their own chips, ARM merely maintains and updates the ARM ISA for sale to other companies. Companies can purchase a license to that ISA and bumble off to design a totally unique solution for a very specific application—like your smart toaster. There are a great many licensees, but the ranks of the notable are small: Qualcomm (Snapdragon CPU), Texas Instruments (OMAP), NVIDIA (Tegra), Samsung (Exynos) and Apple (Ax).

ARM also occasionally goes one step further to dream up a complete CPU based on their latest ISA. This chip incorporates all the best technologies ARM and its partners have to offer, and packages them into a design that can only be produced with the latest manufacturing processes its licensees have to offer. Simply, ARM designs the pièce de résistance, and others can obtain it for whatever their bright little minds can imagine.

The better (but probably not any cheaper because R&D is expensive) thing

By now you’re aware that ARM is constantly refining the ISA it invented, and these days it has settled at version seven (ARM v7). That particular version of the ARM ISA has been responsible for a whole host of CPUs in products you are likely to know about: the Apple iPhone, the Sony PlayStation Vita and the Amazon Kindle Fire, just to name a few.

In fact, virtually every tablet and smartphone released since late 2010 has used the ARM v7 ISA. Furthermore, virtually every one of those products has used some interpretation of ARM’s latest CPU design: the ARM Cortex-A9, which (unsurprisingly) follows the rules set forth in the ARM v7 ISA.

The thing to know about the Cortex-A9 design, however, is that it’s getting a little long in the tooth. With four or five processing cores and clockspeeds hovering around 1.5GHz, processors like the NVIDIA Tegra 3 essentially represent the ultimate evolution of the A9 design. Something has to give, as neither core counts nor processor frequencies may be substantially increased.

But it’s not as though ARM did not know this, as it has been busy putting that ARM v7 ISA to work in a brand new design known as the ARM Cortex-A15 MPCore. Know this well, because by the end of 2012, you won’t even want to consider a product with a CPU that isn’t based on it, and you’re about to learn why.

The voodoo that Cortex-A15 do

In semiconductor design, there is a tremendously important mathematical formula and it goes a little something like this: P = C * V² * f. (Dear engineers: gross simplifications ahead.)

In the parlance of normals, we can use this equation to express that a CPU’s power consumption is equal to its capacitance (C), times its voltage squared (V²), times its frequency (f). In this case, the voltage would be the amount of juice required to keep the CPU running, and the frequency would be the CPU’s clockspeed, e.g. 1.5GHz.
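To make the formula concrete, here is a minimal Python sketch of it. The capacitance and voltage figures are made-up placeholders picked only so the arithmetic is easy to follow; only the 1.5GHz clockspeed comes from the example above, and the function name `dynamic_power` is my own.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Simplified dynamic power in watts: P = C * V^2 * f."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical values, chosen only for illustration (not real chip data).
C_EXAMPLE = 1e-9    # 1 nF of switched capacitance (assumed)
V_EXAMPLE = 1.0     # 1.0 V core voltage (assumed)
F_EXAMPLE = 1.5e9   # 1.5 GHz, the clockspeed used as an example above

power_watts = dynamic_power(C_EXAMPLE, V_EXAMPLE, F_EXAMPLE)
print(f"{power_watts:.2f} W")
```

The absolute number means little with invented inputs. What matters is how the result moves when one of the three terms changes.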

Starting with the voltage, ARM Cortex-A15 designs result in a chip that is 20-28% smaller than Cortex-A9 designs. In other words, if you could hypothetically design and make the exact same CPU in A15 and A9 flavors, the A15 version would be nearly 30% smaller at best. Because the A15 is so much smaller, physics says it can get away with using a lower voltage. And because the voltage is squared in our above equation, I’m going to take you back to high school algebra and tell you that this leads to a quadratic reduction in power used. In short: much less power is used. Feeling good about your battery life, yet?
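The quadratic effect of voltage is easier to feel with numbers. The sketch below holds capacitance and frequency fixed and scales only the voltage. The 10%, 20% and 30% voltage drops are hypothetical, since no actual voltage figures are quoted here, and `relative_power` is an illustrative helper name.

```python
def relative_power(voltage_ratio, frequency_ratio=1.0):
    """Power relative to a baseline when V and f are scaled and C is held fixed."""
    return voltage_ratio ** 2 * frequency_ratio


# Hypothetical voltage reductions, for illustration only.
for drop in (0.10, 0.20, 0.30):
    remaining = relative_power(1.0 - drop)
    print(f"{drop:.0%} lower voltage -> {1 - remaining:.0%} less power")
```

Under these assumptions, trimming the voltage by 10%, 20% and 30% cuts power by 19%, 36% and 51% respectively, which is why even a modest voltage drop pays off so handsomely.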

Or if you’re the kind of manufacturer that’s perfectly happy with the battery life of today’s devices, you can manufacture tomorrow’s A15 CPU to pack 30% more go-go juice into a processor that uses approximately the same voltage as today’s A9 designs.

But it’s not just about the voltage, because the Cortex-A15 design is a seriously efficient piece of kit. According to Texas Instruments VP Remi El-Ouazzane, a pair of A15 cores running at 800MHz offers comparable performance to a pair of today’s A9 cores running at 1500MHz. You read that right: 46% lower clockspeed, same performance. Going back to our formula, shrinking the frequency term (f) by 46% is going to have another hugely positive impact on the amount of power our future smartphone CPU will draw.
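The arithmetic behind that claim can be checked in a few lines. The two clockspeeds are the ones quoted above; the last figure assumes capacitance and voltage stay the same between the two chips, which is a simplification made purely to isolate the effect of frequency.

```python
A9_CLOCK_MHZ = 1500   # A9 clockspeed quoted above
A15_CLOCK_MHZ = 800   # A15 clockspeed quoted for comparable performance

# How much lower the A15 clock is for the same performance.
clock_reduction = 1 - A15_CLOCK_MHZ / A9_CLOCK_MHZ

# How much more work each A15 clock cycle must be doing to keep up.
work_per_clock_gain = A9_CLOCK_MHZ / A15_CLOCK_MHZ - 1

# Power relative to the A9 if C and V were unchanged (an assumption).
relative_power = A15_CLOCK_MHZ / A9_CLOCK_MHZ

print(f"Clockspeed reduction: {clock_reduction:.1%}")
print(f"Extra work per clock: {work_per_clock_gain:.1%}")
print(f"Power relative to A9 (same C and V): {relative_power:.1%}")
```

The reduction works out to about 46.7%, which the article shortens to 46%. Put the other way around, each A15 clock cycle is doing roughly 87.5% more work than an A9 cycle.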

Another benefit of this 46% reduction in required clockspeed is more subtle, and perhaps a dirty little secret: most smartphone and tablet CPUs do not spend anywhere near the majority of their time at the clockspeed you read on the box. That shiny 1.5GHz CPU in your new smartphone probably spends most of its life right around 350MHz, because that’s all you need to flip around the desktop, read some webpages, or keep the engine running when your phone’s screen is off. And if you do happen to start a demanding task, modern mobile OSes can intelligently poke the clockspeed to a variety of values that fall between the minimum and the maximum. These operating systems are even smart enough to know just how low the clockspeed goes before you notice sluggish performance, and they work to keep you just above that point to keep up appearances (and to conserve battery life).
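A toy version of that clockspeed juggling might look like the sketch below. It is not how any real mobile OS governor is implemented. The frequency table, the 10% headroom and the function name `pick_frequency` are all assumptions made for illustration; only the 350MHz floor and 1.5GHz ceiling echo the figures above.

```python
# Hypothetical table of clockspeeds the CPU can run at, in MHz.
FREQ_STEPS_MHZ = [350, 700, 1000, 1200, 1500]


def pick_frequency(demand_mhz, steps=FREQ_STEPS_MHZ, headroom=1.1):
    """Return the lowest step that covers the demand plus a little headroom."""
    target = demand_mhz * headroom
    for step in sorted(steps):
        if step >= target:
            return step
    # Demand exceeds everything available, so run flat out.
    return max(steps)


print(pick_frequency(300))    # light work such as scrolling the desktop
print(pick_frequency(900))    # a heavier task
print(pick_frequency(2000))   # more than the chip can give
```

With this table, the three calls settle on 350MHz, 1000MHz and 1500MHz. The point is simply that the governor hunts for the slowest setting that still keeps up, because every step down saves power.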

Math is awesome

Because Cortex-A15 can match today’s A9 performance with 46% less clockspeed, this means these smart operating systems can intelligently keep the CPU correspondingly slower on a regular basis without compromising performance. Where today you might notice sluggish performance at 700MHz, tomorrow that threshold could be as low as 300MHz. Our equation is kicking ass!

Together, the lower mean clockspeed and flat voltage reduction provided by the Cortex-A15 design make for a chip that, second by second, operates at a substantially lower average voltage than today’s A9 designs. That’s a feedback loop even your mother could love.

But wait, there’s more! Whereas today’s A9 designs top out around 1.5GHz, A15-based chips are expected to debut at 2.0GHz and scale to more than 2.5GHz (with eight cores!) as manufacturers like Texas Instruments or Samsung refine the basics provided by ARM. In simpler terms, the performance ceiling of an A15 chip is markedly higher than an A9 design, and it achieves all of this with a smaller impact to your battery than anything on the market today.

The coolest secret ingredient ever made

Cortex-A15 is compatible with another ARM technology known as big.LITTLE. Big.LITTLE allows for CPU designs that can dynamically disable some or all of the high-performance A15 cores in favor of a power-sipping core based on the ARM Cortex-A7 CPU design.

A7 and A15 share many of the same design principles and features, but the former was expressly designed to handle the low-demand tasks that don’t require a windup of the afterburners. You won’t notice the switch, called a “state transition,” but the A7 will tuck its big brothers into bed and get down to the business of web browsing, HD video decoding and music playing with absolutely exceptional efficiency.

But the state transition contains more sorcery than starting new, low-demand tasks on the A7 and shutting down the other cores as they go idle. Big.LITTLE maintains an open line of communication, or “coherency,” between the processor’s CPU cores, and can move a low-demand task from an A15 core to the A7 without skipping a beat. ARM says this state transition can occur as fast as 20 microseconds—so fast that not even your mobile OS has time to do much more than feebly acquiesce and act as if nothing happened.
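For the curious, the switching idea can be modeled in a handful of lines. This is a deliberately crude sketch and not ARM’s actual switching logic: the load thresholds, the load trace and the names `ClusterState` and `update` are all invented for illustration. Only the 20 microsecond transition time comes from the figure ARM quotes above.

```python
from dataclasses import dataclass

SWITCH_COST_US = 20  # fastest state transition time ARM quotes


@dataclass
class ClusterState:
    active: str = "A7"   # start on the power-sipping core
    switches: int = 0    # number of state transitions so far


def update(state, load_percent, up_threshold=80, down_threshold=30):
    """Switch cores when load crosses a (hypothetical) threshold."""
    if state.active == "A7" and load_percent > up_threshold:
        state.active = "A15"
        state.switches += 1
    elif state.active == "A15" and load_percent < down_threshold:
        state.active = "A7"
        state.switches += 1
    return state.active


state = ClusterState()
load_trace = [10, 20, 95, 60, 25, 15]   # made-up CPU load samples, in percent
history = [update(state, load) for load in load_trace]

print(history)
print(f"{state.switches} transitions, about {state.switches * SWITCH_COST_US} microseconds spent switching")
```

Using two different thresholds keeps the chip from flapping back and forth when the load hovers near a single cutoff. In this trace the A15 wakes up once for the spike and goes back to bed once the load falls away, for two transitions in total.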

Time to market

In my introduction I implied that ARM Cortex-A15 lies just around the bend, and by now you’re undoubtedly wondering when you can bring a slice of that action home. Texas Instruments and Samsung will be first to ship a fat wad of A15 CPUs in the second half of 2012, starting with the OMAP5 and Exynos 5x50 Series, respectively. The biggest, baddest phones of the back-to-school and Christmas seasons should be rocking one of them as a result.

Both OMAP5 and Exynos 5x50 feature a pair of Cortex-A15 cores clocked at 2.0GHz, backed by blistering-fast mobile GPUs expressly designed for high-detail, DirectX 11-style graphics on screens with a resolution of 720p or higher. In the case of OMAP5, it will take a cue from ARM’s big.LITTLE technology by spinning down the A15s and handing power-light tasks to a pair of efficient Cortex-M4 cores. It remains to be seen if Samsung will follow suit, but it is by no means a deal-breaker if Samsung does not.

In conclusion, if you are salivating over the upcoming crop of quad core phones and are in desperate need of an upgrade, you will learn of this summer’s enticing options as the GSMA Mobile World Congress and CTIA Wireless events unfold over the next few months. However, if you’ve not yet staked a claim in that camp, products based on the remarkably superior Cortex-A15 design await you at the end of 2012.

Besides, as a smartphone enthusiast, I bet you’re no stranger to asking: “Eh, what’s a few more months?” It’s worth it this time.

Comments

  1. mertesn
    Excellent writeup, Space-Penguin. Can't wait to see what designs the manufacturers come up with.

    It almost appears as though Windows on ARM could become a viable desktop replacement in the near future. I'm sure Apple has their own plans to kick Intel to the curb ASAP, and I'd be willing to bet A15 is the pair of steel-toed boots they need to do the job.
  2. Tushon
    Agreed. Great write up and I'm really excited to see how this is exploited in the mobile space and in creating increasingly small footprint PCs for "normal" use.
  3. Mt_Goat
    That was one of the best reads I have processed through my grey matter in quite a while. It was exciting, informative and left me thinking new possibilities! Well done SP!
  4. quake101
    Nice writeup! I really enjoyed reading this one. :)
  5. AlexDeGruven
    This is an excellently in-depth write-up on a technology arena in which I'm keenly interested.

    You've set your own bar quite high, sir. Can't wait for more.
  6. huxley
    Good article with one historical quibble: the PowerPC family was first used in 1994 on Macs. Apple migrated to that from Motorola's 680x0 processor series, which had been used in Macs before that (1984-1994).
  7. Gargoyle
    Seeing that:

    • Performance in ARM chips is comparable to x86 in some use cases

    • Both Microsoft and Apple have OSes that run on ARM

    • Power consumption by ARM chips is much lower than x86

    It seems logical to move to ARM designs for at least mainstream laptops and possibly desktops. If Apple can get 5-7 hours of battery life on an Air with an Intel chip, imagine what they'd get with an ARM chip.

    I'd love to move past the age when we need 500W+ PSUs in performance desktops, too. We'd need a breakthrough in GPU power draw, though. Wonder how close we are?
  8. Thrax
    GPUs won't budge from their current power profiles, but what is happening that people don't notice is that the performance improvements offered by GPUs at every power envelope are increasing faster than the rate software requires. In other words, today a $100 GPU with 125W of draw can deliver the same performance that once required $400 of GPU and at least 225W.
  9. Vicar
    Brill article, much appreciated.
