Supplied by AMD
AMD’s Socket AM2 – The Next K8
In 2003, AMD announced the launch of their K8 ‘Hammer’ architecture that is present in their award winning Opteron and Athlon 64 processors. Hammer was a big step in new directions that really set the Athlon 64 apart from its predecessors and competitors. The 64 bit instruction set known as ‘x86-64’ was perhaps its most distinctive feature, but there were many other enhancements behind the scenes that have proven to make the Athlon 64 a revolutionary processor. The introduction of the on-die memory controller has provided ultra-low memory latency and the HyperTransport bus is arguably the most flexible, best performing data bus in production. There have been quite a few changes to the K8 architecture over the last three years, but the underlying technology has remained fairly similar. One aspect of the K8 that has remained constant over the last three years is the support for varying flavors of DDR memory. Today, it is becoming evident that the hardware industry as a whole will be abandoning traditional DDR memory in the not too distant future. AMD’s latest revision to the K8 architecture incorporates a new DDR2 memory controller, a new socket design and some other pleasant surprises. Short-Media will be turning another page in the K8’s history, and will be taking an in-depth look at AMD’s new Socket AM2 platform.
So what exactly do the ‘Revision F’ AM2 processors bring to the table in a nutshell?
- DDR2 support (up to 667MHz all chips) / (up to 800MHz for X2 and FX)
- Reduced power consumption including a new ‘Energy Efficient’ model lineup
- ‘AMD Virtualization’ support (hardware virtualization in-chip)
- Redesigned 4-bolt heatsink tray
When DDR2 SDRAM was released a few years ago, it was received with some very mixed reactions. It certainly looks good on paper, with operating frequencies over double that of the fastest JEDEC specified traditional DDR. This huge boost in memory frequency was made possible by many design improvements including a better electrical interface and termination as well as some clever I/O buffer enhancements. Rather than trying to ramp up the memory core frequency only, the I/O frequency was doubled instead. This higher I/O frequency combined with a prefetch of twice the number of bits allows for much greater bandwidth without the need for the entire memory core to operate at a higher frequency. As an added benefit, DDR2 also consumes much less electricity than its DDR predecessor, operating at a mere 1.8V compared to the 2.5V of traditional DDR. Unfortunately, there is a downside to all of the great DDR2 enhancements, and that comes in the form of higher latency. Generally speaking, DDR2 latencies, counted in clock cycles, are double those of traditional DDR.
Overall memory performance is generally determined by two variables: bandwidth and latency. If the memory is operating at twice the speed, but doing half the amount of work per cycle (due to increased latency)—is there really a performance benefit? We’ll be taking a close look at AM2 DDR2 performance in later sections.
There is an excellent technical overview of DDR2 technologies available at ‘LostCircuits’ if you’d like to learn more. You can find it at the following URL: http://www.lostcircuits.com/memory/ddrii/
Reduced Power Consumption
In the drive for cooler, quieter and cheaper computing, low power consumption is of utmost importance. AMD has significantly reduced the power consumption of their entire AM2 lineup compared to their predecessors. AMD’s maximum power consumption for dual core processors has dropped from around 110 watts to 89 watts TDP. Single core processors have seen a similar decrease from about 89 down to 67 watts. I can confirm that the core voltage for the X2 5000+ is only 1.3V down from 1.35V in previous X2 models (more on this to come). Overall system power consumption should also be lowered quite considerably due to the use of DDR2 memory. Estimates put DDR2 at about 20-30% more power efficient than traditional DDR.
AMD has also announced the release of an entire lineup of desktop based ‘Energy Efficient’ processors, which consume even less power—some models as low as 35W. These processors perform identically to their standard counterparts but are designed to operate at lower voltages. This is the first time a consumer will actually get to make this type of choice in the desktop market. AMD’s 90nm lineup was already incredibly energy efficient and this move really reinforces AMD’s commitment to the environment and to reducing operating costs for customers. Obviously there will be a price premium to pay for these efficient processors, but again, the consumer can now make the choice.
Being a performance and overclocking enthusiast, I can’t help but wonder if these efficient processors will be the next ‘2500+ Mobile Bartons’ as far as overclocking headroom is concerned.
|Model||TDP (Energy Efficient Version)||TDP (Standard Version)|
|Athlon 64 X2 4800+||65 Watts||89 Watts|
|Athlon 64 X2 4600+||65 Watts||89 Watts|
|Athlon 64 X2 4400+||65 Watts||89 Watts|
|Athlon 64 X2 4200+||65 Watts||89 Watts|
|Athlon 64 X2 4000+||65 Watts||89 Watts|
|Athlon 64 X2 3800+||65 and 35 Watts (two models)||65 Watts|
|Athlon 64 3500+||35 Watts||65 Watts|
|Sempron 3400+||35 Watts||65 Watts|
|Sempron 3200+||35 Watts||65 Watts|
|Sempron 3000+||35 Watts||65 Watts|
Special Note: The third digit of a desktop processor’s OPN details the maximum power consumption of that model. “A” denotes “normal” power, “O” denotes a 65 watt Energy Efficient processor and “D” denotes a 35 watt Energy Efficient processor.
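The OPN rule in the note above amounts to a simple lookup on the third character. Here is a minimal sketch of that idea. The function name and the sample strings are hypothetical placeholders, not real part numbers; only the three letter codes come from the note itself.

```python
# Power class letters as described in the note above.
# Everything else in this sketch is illustrative.
POWER_CLASS = {
    "A": "standard power",
    "O": "65 W Energy Efficient",
    "D": "35 W Energy Efficient",
}


def power_class_from_opn(opn: str) -> str:
    """Return the power class encoded in the third character of an OPN."""
    if len(opn) < 3:
        raise ValueError("OPN is too short to contain a power class character")
    return POWER_CLASS.get(opn[2].upper(), "unknown")


# Placeholder strings only; they stand in for real part numbers.
for sample in ("ADAxxxx", "ADOxxxx", "ADDxxxx"):
    print(sample, "->", power_class_from_opn(sample))
```

Any letter outside the three listed in the note is reported as "unknown" rather than guessed.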
I’d also like to give kudos to AMD for their very honest maximum power consumption rating system. The TDP or ‘Thermal Design Power’ is very clearly defined and the power dissipation that the average consumer will see should be well below these ratings. AMD rates their TDP value with the processor under the heaviest possible load conditions, in the very worst thermal conditions (at the maximum rated casing temperature—’tcase max’).
AMD Virtualization Technology
AMD’s Virtualization technology has been in discussion since about 2004 and was originally code named ‘Pacifica’. It is now available in their entire AM2 line of processors. With the recent boom in the virtualization market, products like VMWare, Virtual Server and Xen are becoming more and more popular. The most complicated component of a virtualization system is the VMM (Virtual Machine Manager) which provides hardware emulation and allows multiple operating systems to access hardware components without interfering with each other. As a result, VMMs are generally very complex and demanding on system resources. From a hardware perspective, the CPU made no distinction between the VMM and any other piece of software running. AMD’s virtualization technology incorporates a special privileged mode that the VMM can run in, allowing it to run at a lower hardware level than the operating system and all applications. This separation eliminates a great deal of the emulation complexities involved in running virtual machines, and can greatly improve performance as a result. All of the major players in the virtualization market have indicated that they will be supporting the ‘AMD Virtualization’ feature set.
Although this may not seem like a very exciting feature to the average home user, developers and some businesses will most certainly be trying to take advantage of this special feature set.
New Socket Design
Socket 939 lasted a surprisingly long time (for a socket anyhow) and even dual-core support was implemented without any physical changes to the platform. With the added complexities of running DDR2, it seems that Socket 939 could simply not accommodate. Since DDR2 is not backwards compatible with DDR, it seems fitting to start fresh with a new socket platform.
The physical differences between Socket AM2 and 939 are not very obvious. Only one pin has been added for a total of 940 pins. AM2 should not be confused with Socket 940 which is home to many Opteron processors—the chips are not interchangeable. The socket is ‘keyed’ to prevent the insertion of non-AM2 processors. AM2 uses a standard ZIF lever system, and looks very similar to that of 939.
The biggest physical difference between AM2 and 939/754 is the heatsink retention frame. Rather than being a ‘box’ that the heatsink sits within, it is fairly flat at the top and bottom, allowing the use of extra large heatsinks with wide bases.
One bitter-sweet physical change is the implementation of four base screws (two per side) as opposed to only one per side on 754/939. On a positive note, having four screws allows for greater heatsink stability on the board and greatly improved mounting uniformity for aftermarket screw-through heatsinks. On a negative note, all existing 754/939 screw-through heatsinks are not compatible with the new socket design. If you take a close look at the protruding tab at the center of the frame, you’ll notice that it is in the exact position as in previous designs. This tab allows all 754/939 heatsinks utilizing the locking lever system to operate with the new socket.
As you can see above, the popular dual-core ‘heat-pipe’ heatsink common to 939 and AM2 processors fits like a charm. I really love the locking lever system. I can’t tell you how many times I nearly punctured my mainboard with a flat-head screwdriver trying to mount heatsinks on the old Socket-A boards.
Before we get into any model specific details, here are some common specifications to all AM2 processors:
|Manufactured:||Fab 30 in Dresden, Germany|
|Process Technology:||90-nanometer DSL SOI (silicon-on-insulator) technology|
|Packaging:||Socket AM2 (940-pin organic micro PGA)|
|HyperTransport technology:||Supports single HT link – up to 8.0 GB/sec per link bandwidth|
|Memory:||DDR2 memory controller|
|Effective data bandwidth:||Up to 12.8 GB/sec dual channel memory bandwidth|
|Total CPU bandwidth: (HyperTransport + Memory bandwidth)||Up to 20.8 GB/sec|
|Memory Speed – FX and X2:||DDR 2 memory up to and including PC2 6400 (DDR2-800) unbuffered|
|Memory Speed – A64 and Sempron:||DDR 2 memory up to and including PC2 5300 (DDR2-667) unbuffered|
|Common Features Added:||AMD Virtualization Technology|
|Availability (Timing):||Expected from leading OEMs and system builders worldwide in May 2006|
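For readers who want to see where the bandwidth figures in the table come from, the arithmetic can be sketched as follows. The 64-bit channel width follows from the 128-bit dual channel controller listed in the specifications; the 16-bit HyperTransport link width and the counting of both directions are assumptions on my part, chosen because they reproduce the quoted 8.0 GB/sec figure.

```python
def ddr2_bandwidth_gb_s(transfers_mt_s: float, channels: int = 2, bus_bits: int = 64) -> float:
    """Peak memory bandwidth in GB/sec for a given DDR2 transfer rate."""
    bytes_per_transfer = bus_bits // 8
    return transfers_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


def ht_bandwidth_gb_s(transfers_mt_s: float = 2000, link_bits: int = 16) -> float:
    """Aggregate HyperTransport bandwidth (assumed 16-bit link, both directions)."""
    bytes_per_transfer = link_bits // 8
    return transfers_mt_s * 1e6 * bytes_per_transfer * 2 / 1e9


memory = ddr2_bandwidth_gb_s(800)   # dual channel DDR2-800
ht = ht_bandwidth_gb_s()            # 2x 1000MHz link
print(f"Memory: {memory:.1f} GB/sec")
print(f"HyperTransport: {ht:.1f} GB/sec")
print(f"Total: {memory + ht:.1f} GB/sec")
```

These are theoretical peaks only; the benchmarks later in the article show what the platform delivers in practice.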
There is really nothing too surprising with respect to process technology and HyperTransport. Although AMD has made some great progress with their 65nm manufacturing process, AM2 parts will remain in 90nm flavors (at least for now). One thing I did find rather peculiar is the statement that only the X2 and FX series processors will support DDR2-800 (PC6400). There really should not be any technical difference or limitation between single core and dual core processors to warrant this statement. I do not have a single core processor available to verify this claim, but it could very well be a simple matter of ‘official’ support—i.e. it should still work. There is a fairly significant performance difference between DDR2-667 and DDR2-800, so it will be interesting to see how this pans out. This fact alone will make the dual core product lineup much more appealing to consumers.
Aside from AM2 versions of existing 939/754 models, there will be two brand new models with higher PR ratings than current 939 offerings: the FX-62 and X2 5000+. There is also a brand new ‘Athlon 64 X2 4000+’ model available that features a 2.0GHz core frequency and 1MB of L2 cache per core. I believe this model will be popular with overclockers and performance enthusiasts, as this will likely be the next ‘Opteron 170’.
To my surprise, AMD has moved the entire Sempron processor lineup straight to AM2. There was some speculation that 939 may become the home of the Sempron for a while. This would be in line with what happened to Socket 754 not long ago when the Athlon 64 moved to socket 939. With the industry pressure to phase out traditional DDR, it clearly made the most sense to move the entire lineup to ensure that the Semprons could operate with DDR2. Not only does this make processor and mainboard selection much easier, but it ensures a common memory type across all desktop product families.
Here are the specifications for the two newest (and highest performing) additions to the AMD family:
|Feature||Athlon 64 FX-62||Athlon 64 X2 5000+|
|L2 Cache Per Core||1024KB||512KB|
|L1 Cache Sizes Per Core||64K instruction + 64K data||64K instruction + 64K data|
|CPU to Memory Controller Frequency||Same as CPU core frequency||Same as CPU core frequency|
|Memory Controller||Shared integrated 128-bit||Shared integrated 128-bit|
|DDR2 Memory Supported||Up to and including PC2 6400 (800MHz) DDR-2 memory||Up to and including PC2 6400 (800MHz) DDR-2 memory|
|HyperTransport Spec||2GHz (2x 1000MHz / DDR)||2GHz (2x 1000MHz / DDR)|
|Packaging||Socket AM2 organic micro-PGA||Socket AM2 organic micro-PGA|
|Process Technology||90nm (.09-micron) Silicon on Insulator (SOI)||90nm (.09-micron) Silicon on Insulator (SOI)|
|Approximate Transistor count||227.4 million||153.8 million|
|Approximate Die Size||230mm 2||183mm 2|
|Max Thermal Power (TDP)||125 W||89 W|
|Max Case Temp||55-63 degrees Celsius||55-70 degrees Celsius|
With two 2.8GHz cores and 2x1MB L2 cache, the FX-62 is clearly the fastest processor in AMD’s lineup. The FX-62 is the only chip still set to operate at 1.35V and a maximum TDP of 125W. AMD decided not to release an X2 model with similar cache/frequency specifications to the 939 FX-60 (X2 5200+). Instead, they opted to release a 2x512KB cache 2.6GHz part dubbed the 5000+.
Below is the complete listing of AMD’s AM2 products and their pricing. AMD indicates that all AM2 models should be available as of the May 23rd launch date.
|Model||Frequency (per core)||L2 Cache (per core)||MSRP (USD$)|
|Athlon 64 FX-62||2800MHz||1024KB||$1,031|
|Athlon 64 X2 5000+||2600MHz||512KB||$696|
|Athlon 64 X2 4800+ (65W Energy Efficient)||2400MHz||1024KB||$671|
|Athlon 64 X2 4800+||2400MHz||1024KB||$645|
|Athlon 64 X2 4600+ (65W Energy Efficient)||2400MHz||512KB||$601|
|Athlon 64 X2 4600+||2400MHz||512KB||$558|
|Athlon 64 X2 4400+ (65W Energy Efficient)||2200MHz||1024KB||$514|
|Athlon 64 X2 4400+||2200MHz||1024KB||$470|
|Athlon 64 X2 4200+ (65W Energy Efficient)||2200MHz||512KB||$417|
|Athlon 64 X2 4200+||2200MHz||512KB||$365|
|Athlon 64 X2 4000+ (65W Energy Efficient)||2000MHz||1024KB||$353|
|Athlon 64 X2 4000+||2000MHz||1024KB||$328|
|Athlon 64 X2 3800+ (35W Energy Efficient)||2000MHz||512KB||$364|
|Athlon 64 X2 3800+ (65W Energy Efficient)||2000MHz||512KB||$323|
|Athlon 64 X2 3800+||2000MHz||512KB||$303|
|Athlon 64 3800+||2400MHz||512KB||$290|
|Athlon 64 3500+ (35W Energy Efficient)||2200MHz||512KB||$231|
|Athlon 64 3500+||2200MHz||512KB||$189|
|Athlon 64 3200+||2000MHz||512KB||$138|
|Sempron 3400+ (35W Energy Efficient)||2000MHz||128KB||$145|
|Sempron 3200+ (35W Energy Efficient)||1800MHz||256KB||$119|
|Sempron 3000+ (35W Energy Efficient)||1800MHz||128KB||$101|
As you can see above, AMD now offers a whopping 28 processor models in their desktop lineup alone. This is not even including server/workstation (Opteron) and their mobile offerings.
AMD Athlon 64 X2 5000+
AMD was kind enough to send us their second in command, the X2 5000+ for review. Operating at an impressive 2.6GHz per core, this processor should be no slouch.
After about five failed attempts, I gave up trying to count the pins present on the 5000+. I have decided to take AMD’s word for it and report that it has 940 of them. I did notice that the small ‘keyed’ areas have shifted to prevent the use of a 939/940 processor in an AM2 socket. In case anyone is wondering, the one extra pin is located at the bottom left corner of the CPU. Aside from that, there are few physical differences between 939 and AM2. The OPN and batch number formats appear to remain similar. The 2x512K cache models appear to have the two letter ‘CU’ core type suffix.
Along with the X2 5000+, we received what appears to be a standard issue AVC HSF (Part # Z7U7414002). From what I can tell, this model is exactly the same as the one included with higher end X2 socket 939 processors.
The OEM heatsinks have come a very long way from the days of the K7. These heat-pipe models are top-notch. Overclocking enthusiasts have recently discovered that this particular model really works wonders with a higher flow fan attached. It features densely packed aluminum fins, an 80mm fan, a copper base, and heat-pipes for better thermal dissipation.
AMD has also provided us with some high end Corsair DDR2. This particular dual channel kit is Corsair XMS2-8500. Rated for 1066MHz (PC8500) operation at 5-5-5-15 timings, this is some high speed stuff. Although this is not DDR2-800 memory, it is totally backwards compatible with DDR2-800/DDR2-667.
The Board: Asus M2N32-SLI Deluxe
A CPU and a pair of DDR2 sticks are not much good without something to put them in. With that being said, our AM2 review package also included the latest mainboard from ASUS: The M2N32-SLI Deluxe.
At first glance, this mainboard looks like a real beast. ASUS has clearly geared this board towards the performance enthusiast. Although there is a lot to say about this new NVIDIA nForce 590-SLI based mainboard, I’m going to try to stay on the topic of AM2 specifically and keep our analysis of the M2N32-SLI short. Short-Media will be producing a review on the latest nForce chipsets in the not too distant future, and more detail will be uncovered. Here are some specifications:
|CPU||Socket AM2 for Athlon 64, Athlon 64 X2, Athlon 64 FX and Sempron|
|System Bus||2000/1600 MT/s|
|Storage||NVIDIA nForce 590 MCP; Silicon Image Sil3132 SATA controller|
|LAN||NVIDIA nForce 590 MCP supported dual gigabit MAC with Marvell PHY.|
Based on the impressive feature set, I’m sure we’ll be seeing a lot of M2N32-SLI boards finding their way into PCs shortly after launch.
As can be seen above, the M2N32-SLI employs an 8-phase power regulation system. This mainboard was clearly designed with performance enthusiasts in mind. Core voltage stability was excellent during heavy load. The copper cooling fins do get very toasty during extended periods of use but seem to cool well enough. ASUS has an optional leaf-blower type fan that can be snapped on to the fins for enhanced cooling. This fan would be recommended if you have a water cooling setup, or a passive CPU cooler.
It would have been nice to see a little more clearance between the cooling system and the first PCI Express x16 slot. The 4-pin CPU power connector is also in a fairly inconvenient location on the board. I’m sure this location was likely chosen to keep it as close as possible to the complex 8-phase regulation circuitry, and is definitely forgivable for this reason.
I was pleased to see that the M2N32-SLI only required the 24-pin and 4-pin connectors on board. I must be getting too used to my DFI NF4 with four different power connectors. Small low-profile copper blocks are found on the north and south bridges. These blocks completely rely on the heat pipes to dissipate the heat away.
The nForce 590 MCP is certainly no slouch when it comes to SATA. One IDE channel was dropped in favor of two additional SATA2 channels. It is only a matter of time until we begin to see IDE phased out completely. Aside from the usual RAID 0/1 support, the 590 MCP also supports RAID 5.
I was surprised to see a single SATA2 connector at the top right corner of the board. The Silicon Image chip pictured above controls both the single internal port and an external port found on the IO plate.
To my surprise, a legacy serial connector was chosen over a parallel port on the rear IO plate. It appears that external SATA is beginning to become more common place as well. Both coaxial and optical S/PDIF outputs are present on the IO plate.
The DIMM slots are arranged similarly to the DFI NF4 series boards. The large passive cooling system and extra regulation circuitry probably made this positioning a necessity. Stay tuned for some more NVIDIA 500 series chipset information in a future review.
M2N32-SLI: The BIOS
One can learn a lot about a new platform simply by tinkering around in the BIOS. Naturally, this was my first stop upon building the system.
As you can see above, the M2N32-SLI employs the familiar Phoenix AwardBIOS. Most of the general menus look very familiar, and remind me of my old A7N8X-E BIOS.
Upon first boot, the CPU was correctly detected as the 5000+. Under the CPU configuration menu lie the Cool’n’Quiet options and the DRAM configuration.
This revision of the M2N32-SLI BIOS included all of the essential memory timing options including Tcl, Trcd, Trp, Tras, Trc, Trwt and command timing. The DDR2 base frequency can also be set here to DDR2-400/533/667/800. I noticed there were some ‘RSVD’ values that may be for future DDR2-1066 support. Out of curiosity, I tried that option and was dumped to an unknown speed that was slower than DDR2-800. According to AMD, all X2 and FX based processors support up to DDR2-800 and the Athlon 64 and Sempron processors support up to DDR2-667. As mentioned earlier, I am uncertain whether there is anything in place to enforce that specification.
The JumperFree configuration allows the modification of the basic frequency clocks, CPU multiplier and DRAM/CPU voltage. There is an ‘AI Tuning’ menu that allows some automatic overclocking functions (not my cup of tea personally but good for beginners nonetheless). There was some speculation that the new AM2 models would use a ‘333MHz’ reference clock, with odd CPU frequencies like 2.33GHz etc. This idea was dumped in favor of the traditional 200MHz reference clock. Separating CPU clocks by only 200MHz allows AMD to have much more choice in their product lineup, and allows them to continue with their existing performance naming scheme.
This was one of the first menus I visited. Unfortunately, the AM2 processors are multiplier locked in the upwards direction, similar to all non-FX series K8 processors. Not a surprise, but it was worth a shot.
The reference clock frequency can be increased all the way to 400MHz from the 200MHz default on the M2N32-SLI.
I was disappointed to see a measly 1.36V selectable in the BIOS for CPU voltage. According to the M2N32-SLI owner’s manual, up to 1.5V should be in that particular menu. Based on some of the other ‘RSVD’ values I saw, I would assume that this is a non-final BIOS revision. I’m sure the final release will add to this rather short list of Vcore selections.
DDR2 voltage control provides a good selection of values. 2.5V should be plenty for just about any type of DDR2 (The default DDR2 voltage is 1.8V).
There are a wide variety of other voltage adjustments for the mainboard components (north/south bridge, DDR2 termination voltage etc). I was not sure what P_+1.3V was. The manual did not provide any clear information. I thought it may boost the CPU Vcore (similar to the DFI ‘Special VID Control’ option) but I did not want to take any chances on this shiny new hardware.
All of the HTT multiplication and ‘Spread Spectrum’ options can be found under the ‘Chipset’ section.
AM2 Testing Configuration
- Lian-Li PC65B Case
- PC Power and Cooling 510 Express
- Asus M2N32-SLI Deluxe Mainboard (BIOS Revision 1017)
- 2x 512MB Corsair CM2X512-8500 DDR2
- AMD Athlon 64 X2 5000+ (Socket AM2, Revision F)
- AMD Retail Heat Pipe Heatsink and Fan
- ATI Radeon X850XT 256MB
- Seagate 300GB 7200.7 SATA Hard Drive
- Pioneer DVR108 DVD-RW IDE Drive
- Logitech MX510 Optical USB Mouse
- Logitech G15 USB Keyboard
- Windows XP Professional with SP2
- All Critical Windows updates as of 5/15/2006
- ATI Catalyst 6.4 with .NET framework and Catalyst Control Center
- NVIDIA Forceware MCP55 Chipset Drivers (including: Ethernet Driver MCP55 (v55.21), Network Management Tools MCP55 (v55.21), SMBus Driver (v4.52) “WHQL”, Installer (v5.05), WinXP IDE SataRAID Driver (v6.54), WinXP IDE SataIDE Driver (v6.54), WinXP RAIDTOOL Application (v6.54), SMU Driver (v1.12) — Firewall and Network Access Manager not installed)
- ADS Audio Drivers (Included with mainboard)
939 Testing Configuration
- Lian-Li PC65B Case
- PC Power and Cooling 510 Express
- DFI LanParty NF4 Ultra-D (BIOS Revision 11/14)
- 2x 512MB OCZ Platinum Rev.2 PC3200 DDR
- AMD Athlon 64 X2 Processors (4200+/4400+)
- Thermalright XP90 Aluminum Heatsink and Retail Fan
- ATI Radeon X850XT 256MB
- Seagate 300GB 7200.7 SATA Hard Drive
- Pioneer DVR108 DVD-RW IDE Drive
- Logitech MX510 Optical USB Mouse
- Logitech G15 USB Keyboard
- Windows XP Professional with SP2
- All Critical Windows updates as of 5/15/2006
- ATI Catalyst 6.4 with .NET framework and Catalyst Control Center
- NVIDIA Forceware 6.7 Chipset Drivers for NF4 (including ‘IDE Drivers’, excluding ‘Network Access Manager’ and ‘NVIDIA Firewall’)
Windows XP performance tweaks common to both testing platforms
- ‘System Restore’ disabled
- ‘Indexing Service’ disabled
- ‘Security Center’ disabled
- ‘Windows Firewall’ disabled
- ‘UPnP SSDP Discovery’ disabled
- ‘Remote Registry’ disabled
- ‘Help and Support’ disabled
- ‘Fast User Switching’ disabled
- ‘Classic Theme’ used
- No desktop wallpaper used
- No 3rd party peripheral software (USB Mouse/Keyboard) used
- Disk Defragmentation done prior to beginning testing of each platform (after all software was installed).
The First Boot
After setting boot priority I began the very familiar and dull task of installing Windows XP. I did not run into any surprises during installation. Driver installation was straightforward, and very similar to what I was used to with my older NF4/NF3 boards. In a purely subjective sense, the system had a very fast ‘feel’ to it and this was probably the fastest Windows XP installation that I have ever experienced.
Clearly the first task at hand is to install and run CPU-Z.
Looks like the folks at cpuid.com also received some AM2 testing chips, because CPU-Z 1.33.1 had no issues identifying the 5000+. I had set the memory to DDR2-800 in the BIOS, with 4-4-4-12 timings. These were the timings recommended by AMD for DDR2-800 use. To my surprise, CPU-Z was only reporting 373MHz (DDR2-742) for the memory frequency, not the expected 400MHz.
IMC Quirks and DDR2-742?
The integrated memory controller (IMC) on the K8 is definitely one of its greatest features. Since the controller is literally a part of the CPU die, it operates at the exact same frequency as the processor core. This very high speed operation coupled with direct access to the main memory contributes to the low memory latencies experienced on the K8 platforms. Obviously standard DDR and DDR2 memory can not operate at a 1:1 ratio to the memory controller. If that were possible, it would require 5,600MHz DDR for AMD’s top models. In order to keep the memory operating at a more reasonable frequency, the memory operates at a set fraction of the CPU’s frequency. The memory controller uses a ‘whole number’ divisor to achieve this. Take a 939 processor for example—the FX60. Operating at 2600MHz, its default memory frequency is 200MHz (PC3200 or DDR400). To obtain a 200MHz memory frequency, the ’13′ divisor or ‘CPU/13′ is used. This works very cleanly for all PC3200 based A64s on both the 754 and 939 platforms. Below are a few examples.
Default memory division for socket 754/939 processors:
|CPU Frequency||Memory Divisor (For DDR400)||Resulting Memory Frequency|
|2800MHz||CPU/14||200MHz|
|2600MHz||CPU/13||200MHz|
|2400MHz||CPU/12||200MHz|
|2200MHz||CPU/11||200MHz|
|2000MHz||CPU/10||200MHz|
|1800MHz||CPU/9||200MHz|
Since the standard memory frequency for AM2 processors is now 400MHz or 333MHz (DDR2-800 and DDR2-667), the memory controller is still forced to use a whole number divisor of the CPU frequency to obtain these frequencies.
Unfortunately, there are not always whole number divisors that can give you the expected 400MHz or 333MHz. Take the 5000+ for example. If you plug some numbers into your calculator, you’ll notice that CPU/6.5 would be required to obtain a clean 400MHz memory frequency. Since this is not possible, the memory controller will automatically round the divisor up to a value of 7. So 2600/7 gives us a memory clock speed of about 371MHz. To make matters even more confusing, there are several AM2 models that can run their memory at a clean 400MHz, like the 4800+. The ‘CPU/6’ divisor makes it possible for the 4800+ to keep its DDR2 running at the proper 400MHz frequency. Depending on the model you choose, your memory frequency can vary. Below is a table that outlines the actual memory frequencies you can expect:
Default memory division for socket AM2 processors:
|CPU Frequency||Memory Divisor (For DDR2-800)||Resulting Memory Frequency|
|2800MHz||CPU/7||400MHz|
|2600MHz||CPU/7||371MHz|
|2400MHz||CPU/6||400MHz|
|2200MHz||CPU/6||367MHz|
|2000MHz||CPU/5||400MHz|

|CPU Frequency||Memory Divisor (For DDR2-667)||Resulting Memory Frequency|
|2800MHz||CPU/9||311MHz|
|2600MHz||CPU/8||325MHz|
|2400MHz||CPU/8||300MHz|
|2200MHz||CPU/7||314MHz|
|2000MHz||CPU/6||333MHz|
|1800MHz||CPU/6||300MHz|
As you can see above, the default memory frequencies can vary from processor to processor. Only the 2000MHz models appear to be able to run at the correct 400MHz and 333MHz memory clocks. Although it is unfortunate to have lost consistency across all models, the largest delta seems to be only 33MHz. In the grand scheme of things, that will not amount to much in terms of ‘real world’ performance. It will be interesting to see if/how AMD tries to address this in future processor revisions.
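The divisor behavior described in this section can be sketched in a few lines. This is only a model of the rule as I have described it (round the divisor up to the next whole number), not AMD's actual controller logic; the function name is my own. Exact fractions are used so that a clean ratio such as 2000MHz to 333.33MHz is not nudged upward by floating point rounding.

```python
from fractions import Fraction
import math

DDR2_800 = Fraction(400)        # target memory clock in MHz
DDR2_667 = Fraction(1000, 3)    # 333.33MHz, kept exact


def memory_clock(cpu_mhz: int, target_mhz: Fraction) -> tuple[int, float]:
    """Return (whole number divisor, resulting memory clock in MHz)."""
    # The divisor is rounded UP, so the memory never runs above its target.
    divisor = math.ceil(Fraction(cpu_mhz) / target_mhz)
    return divisor, cpu_mhz / divisor


for label, target in (("DDR2-800", DDR2_800), ("DDR2-667", DDR2_667)):
    for cpu in (2800, 2600, 2400, 2200, 2000, 1800):
        divisor, mem = memory_clock(cpu, target)
        print(f"{label}: {cpu}MHz -> CPU/{divisor} -> {mem:.0f}MHz")
```

Passing a 200MHz target reproduces the clean socket 754/939 divisors as well, since every stock CPU frequency there is a whole multiple of 200MHz.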
939 to AM2 Head to Head Testing
There have been so many tests done comparing Intel NetBurst based chips to AMD’s K8 that it makes for a very uninteresting read. The question that everyone wants answered is “How does AM2 perform compared to 939?” There were several examples of early AM2 processors leaked across the web, and many people were distressed to hear that those pre-production samples performed worse than their 939 counterparts. AMD has clearly had a lot of time to tweak the architecture and I’m a big believer in ‘wait and see’. It’s time for the final judgment: this 5000+ I have in my possession is a final production model that the average consumer will be able to purchase as of May 23rd, 2006.
The most difficult thing I faced in writing this article was trying to set up a fair comparison between the two platforms. It proved to be a difficult task for many reasons. Clearly, two totally different platforms with different mainboards make for an unfair comparison to begin with, but my goal was to ensure that all other variables were controlled as tightly as possible. To better understand my rationale for testing the way I did, I’m going to outline the challenges I faced and my solutions.
Challenge #1: The DDR2-800 setting at 2600MHz defaults to ~371MHz, and therefore would not be a fair comparison to 200MHz DDR memory. (See the previous section). On top of that fact, there are no 939 X2 processors operating at 2.6GHz and 2x512KB cache.
Solution #1: I decided to perform the comparison at 2.4GHz using the 12X multiplier. This allowed me to get the memory to a ‘true’ DDR2-800MHz speed, and provides an operating frequency used by some 939 processors as well.
Challenge #2: Once at a ‘true’ DDR2-800 speed, the Corsair memory proved to be unstable at 4-4-4-12 timings (50+ memtest86 failures in test 5). Despite my best efforts, I could not get this memory stable at those timings and 400MHz. I contacted AMD regarding this, and it indeed should have worked at these timings (as verified in their lab). Due to time constraints, I could not replace the modules.
Solution #2: I had no choice but to loosen the Trcd and Trp to values of ’5′ for 100% stability. Since high performance DDR2-800 should have 4-4-4 timings (which would be a fair comparison to 2-2-2 DDR400), I increased Trcd and Trp on the 939 platform as well. I will be comparing DDR2-800 @ 4-5-5-12 to DDR400 @ 2-3-3-5.
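To put the two sets of timings on a common scale, cycle counts can be converted to nanoseconds by dividing by the memory clock. The snippet below is a rough sketch of that conversion and was not part of the test procedure; the function name is my own.

```python
def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a timing expressed in clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


# Timings in Tcl-Trcd-Trp-Tras order, paired with the memory clock in MHz.
configs = {
    "DDR2-800 @ 4-5-5-12": ((4, 5, 5, 12), 400.0),
    "DDR400 @ 2-3-3-5": ((2, 3, 3, 5), 200.0),
}

for name, (timings, clock) in configs.items():
    in_ns = ", ".join(f"{cycles_to_ns(t, clock):.1f}ns" for t in timings)
    print(f"{name}: {in_ns}")
```

The CAS latency works out to the same absolute time on both platforms, which is why 4-4-4 at DDR2-800 is treated as the counterpart of 2-2-2 at DDR400.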
Challenge #3: I do not have a 939 4600+ 2.4GHz processor for comparison testing, but I do have a 4200+.
Solution #3: By simply running the 4200+ at 10x240MHz, I obtained a clean 2400MHz. I also used the 5/6 memory divider which provided a memory clock of 200MHz exactly. I did some thorough stability testing and the processor was fine at this clock speed. The HTT bus was operating approximately 4% below default, but extensive testing has shown that this has absolutely no impact on performance.
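The 939 settings in Solution #3 can be checked with a few lines of Python. This is only a sketch: the 4x HyperTransport multiplier and the 1000MHz default HTT speed are assumptions chosen because they match the ‘approximately 4% below default’ figure, and the function name is hypothetical.

```python
from fractions import Fraction


def socket939_clocks(multiplier, reference_mhz, memory_ratio, ht_multiplier):
    """Derive CPU, memory and HTT clocks from the reference clock."""
    return {
        "cpu_mhz": multiplier * reference_mhz,
        "memory_mhz": float(reference_mhz * memory_ratio),
        "htt_mhz": reference_mhz * ht_multiplier,
    }


DEFAULT_HTT_MHZ = 1000  # assumed default HTT speed

# 10 x 240MHz, 5/6 memory divider, assumed 4x HT multiplier
clocks = socket939_clocks(10, 240, Fraction(5, 6), 4)
htt_shortfall_pct = (DEFAULT_HTT_MHZ - clocks["htt_mhz"]) / DEFAULT_HTT_MHZ * 100

print(clocks["cpu_mhz"])     # 2400
print(clocks["memory_mhz"])  # 200.0
print(clocks["htt_mhz"])     # 960
print(htt_shortfall_pct)     # 4.0
```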
Challenge #4: The ASUS M2N32-SLI increases the reference clock speed by 0.9MHz, a sort of ‘minor overclock’ out of the box. Although 0.9MHz may not seem significant, when multiplying that by the 12x CPU multiplier, it amounts to 10.8MHz–enough to skew benchmark results.
Solution #4: To counteract this 10.8MHz increase, I used 10×241 on the 939 system. This gave me 2410MHz, only 0.8MHz away from the AM2 system. This was the absolute closest I could get.
After addressing all of the above challenges, I feel confident that I have set up a fair comparison. Since it is not apples to apples, and could not possibly become apples to apples, I will treat any benchmark difference of less than 1% as a tie (marked ‘SAME’ in the results).
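The clock matching in Challenge #4 and Solution #4 works out as follows. This sketch assumes a nominal 200MHz reference clock on the AM2 board and that the 939 board only allows whole-MHz reference clock steps; both are assumptions made for illustration.

```python
NOMINAL_REFERENCE_MHZ = 200.0  # assumed nominal reference clock
BOARD_OFFSET_MHZ = 0.9         # out-of-the-box increase on the M2N32-SLI


def cpu_clock_mhz(multiplier, reference_mhz):
    """CPU clock is simply multiplier times reference clock."""
    return multiplier * reference_mhz


# AM2 system: 12x multiplier on a slightly raised reference clock
am2_mhz = cpu_clock_mhz(12, NOMINAL_REFERENCE_MHZ + BOARD_OFFSET_MHZ)

# 939 system: find the whole-MHz reference clock (10x multiplier) that
# lands closest to the AM2 clock
best_reference = min(range(236, 246),
                     key=lambda ref: abs(cpu_clock_mhz(10, ref) - am2_mhz))
s939_mhz = cpu_clock_mhz(10, best_reference)
gap_mhz = am2_mhz - s939_mhz

print(round(am2_mhz, 1))  # 2410.8
print(best_reference)     # 241
print(round(gap_mhz, 1))  # 0.8
```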
Here are CPU-Z screenshots of the two configurations:
Note: 2x512MB DIMMs were utilized for both platforms and 1T command timing was set for consistency (ASUS M2N32-SLI defaults to 2T).
Without further ado, here are the results:
939 to AM2 Head to Head Testing: PCMark 2004
PCMark 2004 is one of my personal favorites to gauge a variety of CPU intensive tasks.
For all intents and purposes, the AM2 performed very closely to the 939 system. The 939 system took the lead by a very small margin in a few tests, and the AM2 took the lead in DivX compression. Five of the nine tests were well within the margin of error.
939 to AM2 Head to Head Testing: Sisoft Sandra 2007
Sisoft Sandra 2007 provides a great suite of system benchmarks, including some processor/memory intensive tests.
As far as raw CPU power is concerned (Arithmetic/Multi-Media) both the AM2 and 939 performed almost identically.
As expected, the AM2 system had over 40% more memory bandwidth. We’ll be taking a closer look at latency in future sections.
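For context on that bandwidth figure, the theoretical peak of each memory type can be estimated with a short Python sketch. It assumes both platforms run dual-channel memory with a 64-bit (8-byte) bus per channel and that Sandra reports bandwidth in MB/s; the measured values are the Sandra integer results from this head to head test, and the function name is hypothetical.

```python
def peak_bandwidth_mb_s(memory_clock_mhz, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in MB/s.

    DDR and DDR2 both transfer data twice per I/O clock cycle.
    """
    transfers_per_clock = 2
    return memory_clock_mhz * transfers_per_clock * bus_bytes * channels


platforms = {
    # name: (memory clock in MHz, measured Sandra integer bandwidth)
    "939 DDR400": (200, 5406),
    "AM2 DDR2-800": (400, 7692),
}

for name, (clock_mhz, measured) in platforms.items():
    peak = peak_bandwidth_mb_s(clock_mhz)
    print(f"{name}: peak {peak} MB/s, measured {measured} MB/s, "
          f"efficiency {measured / peak:.0%}")
```

Under these assumptions the theoretical peak doubles from 6400 to 12800 MB/s, while the measured gain is a little over 40%, so the AM2 system realizes a smaller share of its theoretical peak than the 939 system does.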
939 to AM2 Head to Head Testing: SuperPI 1.4 Mod
SuperPI is a classic that has become a staple benchmark for combined CPU/Memory performance. SuperPI scores can be impacted heavily by memory timings and frequency.
It appears that SuperPI was not hurt by the loose timings at all. The performance delta is almost negligible, but the AM2 is just slightly on top.
939 to AM2 Head to Head Testing: 3D Benchmarks
I decided to perform a mix of new and old 3D benchmarks for some diversity. Older 3D benchmarks are very CPU dependent for high scores, and the newer ones are more GPU dependent.
Aquamark 3 saw very little variation, with the AM2 just slightly on top. With an X850XT, Aquamark 3 is CPU dependent for higher scores.
3DMark 2001 saw very little variation. I was surprised to see this result, as low latency memory timings can have a very large impact on this benchmark. Clearly, the doubling of the frequency offset the timing advantage.
3DMark 2003, 2005 and 2006 are much more GPU limited with an X850XT; however, the results are still too close to distinguish a clear leader.
939 to AM2 Head to Head Testing: 3D Gaming
I usually like to provide at least one Direct3D and one OpenGL game for comparison. Farcry 1.3 and Doom3 are both fairly modern games that have some CPU dependency for high frame rates (with an X850XT).
Regardless of AA and AF settings, the two platforms are neck and neck with less than a 1% delta.
Doom3 is a popular OpenGL based game with a built in time demo. The command ‘timedemo demo1 usecache’ was used for benchmarking. Once again, the delta is negligible with the AM2 just slightly ahead.
939 to AM2 Head to Head Testing: Folding @ Home
This just wouldn’t be Short-Media without some F@H performance benchmarks. As most people are aware, timeless ‘Tinker’ work units take advantage of AMD’s architecture and generally provide the highest PPD output. For this benchmark, I simply let each configuration crunch away for about 20 minutes or so and then calculated how much time it took between frames.
Once again, the two came out very close. I’d consider that two second difference to be within the margin of error. It is very positive to know that 939 and AM2 will both crunch the heck out of timeless ‘Tinkers’. If you don’t know what Folding @ Home is, visit www.joinfolding.com and consider putting your spare CPU cycles to good use!
939 to AM2 Head to Head Testing: Drawing Conclusions
Well, there you have it. Socket 939 and AM2 perform almost exactly the same when control variables are kept in check (see the ‘challenges’ I mentioned earlier). Obviously if you paired an AM2 chip with some cheap DDR2-667 with 5-5-5-15 timings, the 939 with tight timings will walk all over it. I can say the exact same thing if I were to compare an AM2 at DDR2-800 with 4-5-5-12 timings and a 939 with 3-4-4-8 timings. Put simply: The platforms are equal and the memory used is the key to performance.
Overall memory latency is not only impacted by memory timings, but by memory frequency as well. As can be seen below, the AM2 actually has a slight advantage in overall memory latency over the 939 at 2-3-3-5 timings. The high 400MHz clock speed helped to offset the looser timings. AMD’s ultra fast integrated memory controller has also helped to minimize the performance degradation associated with loose timings.
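The way frequency offsets looser timings can be illustrated by converting each timing from clock cycles to nanoseconds. This Python sketch assumes the usual CL-tRCD-tRP-tRAS ordering for the timing strings and a memory clock of 200MHz for DDR400 and 400MHz for DDR2-800. It covers only the module timings, not the full end-to-end latency that a benchmark measures.

```python
def cycles_to_ns(cycles, memory_clock_mhz):
    """Convert a timing expressed in memory clock cycles to nanoseconds."""
    return cycles * 1000.0 / memory_clock_mhz


TIMING_NAMES = ("tCL", "tRCD", "tRP", "tRAS")

configs = {
    "939 DDR400 2-3-3-5": (200, (2, 3, 3, 5)),
    "AM2 DDR2-800 4-5-5-12": (400, (4, 5, 5, 12)),
}

for name, (clock_mhz, timings) in configs.items():
    in_ns = {t: cycles_to_ns(c, clock_mhz)
             for t, c in zip(TIMING_NAMES, timings)}
    print(name, in_ns)
```

In absolute time, a CAS latency of 4 cycles at 400MHz equals 2 cycles at 200MHz (10ns each), and a tRCD or tRP of 5 at 400MHz (12.5ns) is actually shorter than 3 at 200MHz (15ns). Only tRAS comes out longer on the DDR2 side in this comparison.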
When comparing 2-2-2-5 timings on the 939 platform, there is a slight advantage over the AM2 at 4-5-5-12. It is unfortunate that I could not compare DDR2-800 at 4-4-4-12 timings to the 2-2-2-5 timings, but I suspect that the delta would be similar to what you see above between 4-5-5-12 and 2-3-3-5.
Although the benchmark results were almost identical, this leaves AMD in a favorable position. DDR2-800 does not hit a wall at 4-4-4 timings. We are already beginning to see DDR2-800 at ultra low 3-3-3 timings and DDR2-1066 at tighter timings as well. Once these DDR2 modules go mainstream, I don’t doubt that the performance delta will be significantly larger in favor of AM2.
Since this is the direction the industry is heading, I’m happy to report that AMD has made the best of DDR2.
Here is a quick comparison chart outlining all of the tests and their outcomes:
| Benchmark | AM2 4-5-5-12 | 939 2-3-3-5 | Delta | Verdict |
| --- | --- | --- | --- | --- |
| PCMark 04 – File Compression (MB/s) | 6.761 | 6.738 | 0.34% | SAME |
| PCMark 04 – File Encryption (MB/s) | 74.509 | 74.598 | -0.12% | SAME |
| PCMark 04 – File Decompression (MB/s) | 60.195 | 60.281 | -0.14% | SAME |
| PCMark 04 – Image Processing (MB/s) | 29.425 | 29.530 | -0.36% | SAME |
| PCMark 04 – Audio Conversion (KB/s) | 3196.964 | 3199.182 | -0.07% | SAME |
| PCMark 04 – Web Page Rendering (Pg/s) | 6.120 | 6.307 | -2.96% | SLOWER |
| PCMark 04 – WMV Video Compression (fps) | 86.564 | 88.433 | -2.11% | SLOWER |
| PCMark 04 – DivX Video Compression (fps) | 90.616 | 89.588 | 1.15% | FASTER |
| PCMark 04 – Physics Calculation and 3D (fps) | 227.964 | 235.627 | -3.25% | SLOWER |
| Sandra Proc Arithmetic (Dhry) | 17494.000 | 17518.000 | -0.14% | SAME |
| Sandra Proc Arithmetic (Whet) | 14801.000 | 14822.000 | -0.14% | SAME |
| Sandra Proc Multimedia (Int) | 45364.000 | 45421.000 | -0.13% | SAME |
| Sandra Proc Multimedia (Float) | 49421.000 | 49484.000 | -0.13% | SAME |
| Sandra Mem Bandwidth (Int) | 7692.000 | 5406.000 | 42.29% | FASTER |
| Sandra Mem Bandwidth (Float) | 7652.000 | 5364.000 | 42.65% | FASTER |
| Sandra Random Access Latency (ns) | 91.000 | 97.000 | -6.19% | FASTER |
| SuperPI 1M (s) | 36.438 | 36.859 | -1.14% | FASTER |
| SuperPI 2M (s) | 82.703 | 83.562 | -1.03% | FASTER |
| SuperPI 4M (s) | 179.718 | 181.828 | -1.16% | FASTER |
| Aquamark 3 Total | 82712.000 | 82262.000 | 0.55% | SAME |
| Aquamark 3 CPU | 10812.000 | 10790.000 | 0.20% | SAME |
| 3DMark 2001SE | 26643.000 | 26624.000 | 0.07% | SAME |
| 3DMark 2003 | 13015.000 | 12984.000 | 0.24% | SAME |
| 3DMark 2005 | 6347.000 | 6356.000 | -0.14% | SAME |
| 3DMark 2006 | 2221.000 | 2221.000 | 0.00% | SAME |
| 3DMark 2006 CPU | 1845.000 | 1848.000 | -0.16% | SAME |
| Farcry (800×600) Low | 162.600 | 161.550 | 0.65% | SAME |
| Farcry (1024×768) Low | 159.680 | 159.110 | 0.36% | SAME |
| Farcry (1280×1024) Low | 126.650 | 126.470 | 0.14% | SAME |
| Farcry (1024×768) High | 132.760 | 133.880 | -0.84% | SAME |
| Farcry (1280×1024) High | 90.750 | 91.330 | -0.64% | SAME |
| Farcry (1600×1200) High | 67.300 | 67.630 | -0.49% | SAME |
| Doom3 (800×600) | 160.000 | 158.300 | 1.07% | FASTER |
| Doom3 (1024×768) | 142.500 | 141.300 | 0.85% | SAME |
| Doom3 (1280×1024) | 110.300 | 109.400 | 0.82% | SAME |
| Doom3 (1600×1200) | 50.000 | 49.900 | 0.20% | SAME |
| Folding at Home Tinkers (s) | 258.000 | 256.000 | 0.78% | SAME |
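The Delta and Verdict columns can be reproduced with a short Python sketch. It assumes the delta is the AM2 result relative to the 939 result, that anything under 1% counts as a tie, and that time-based rows (seconds, nanoseconds) are treated as ‘lower is better’. The function names are illustrative.

```python
def percent_delta(am2, s939):
    """AM2 result relative to the 939 result, as a percentage."""
    return (am2 - s939) / s939 * 100.0


def verdict(am2, s939, lower_is_better=False, threshold_pct=1.0):
    """Classify the AM2 result as SAME, FASTER or SLOWER than the 939."""
    delta = percent_delta(am2, s939)
    if abs(delta) < threshold_pct:
        return "SAME"
    am2_is_better = (delta < 0) if lower_is_better else (delta > 0)
    return "FASTER" if am2_is_better else "SLOWER"


# A few rows from the comparison chart
print(verdict(6.120, 6.307))                          # SLOWER (web page rendering)
print(verdict(7692.0, 5406.0))                        # FASTER (memory bandwidth)
print(verdict(36.438, 36.859, lower_is_better=True))  # FASTER (SuperPI 1M)
print(verdict(258.0, 256.0, lower_is_better=True))    # SAME (Folding at Home)
```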
Overclocking the 5000+
Overclocking the AM2 based 5000+ was really nothing new. Since the reference clock frequency, the HTT bus and the CPU multipliers remain as-is, it was just like overclocking a 939 chip. About the only thing that I needed to get used to was the new set of memory dividers. Since DDR2-400/667/800 could be selected in the BIOS, it was a simple matter of keeping the DDR2 frequency in check. I set it to DDR2-667 for my testing and easily got up to 2800MHz. Conveniently, the DDR2 frequency crept up to a predictable 400MHz, and I adjusted the timings appropriately.
As you can see above, I increased the Vcore, but it was not required. I was able to maintain Prime95 stability on both cores at default Vcore. Unfortunately, I hit a pretty stiff wall after that. I could not obtain 2900MHz with the limited 1.36V selectable in the BIOS. I believe that this chip could hit close to 3GHz on quality air cooling and higher voltage.
I contacted Asetek to see if they had an AM2 evaporator kit for my Vapochill LS but unfortunately they do not have one available yet. I would have liked to check this processor for the notorious ‘cold bug’ and to see how much more headroom the processor had on phase-change.
It certainly appears that AMD has perfected their 90nm process. It is amazing to see such high clock speeds at such low voltages. I am very curious to see how much more overclocking headroom AMD’s new energy efficient processors possess as well.
Comparison to Other AMD Processors
The head to head comparison in the previous section was done at 2.4GHz. Since this is indeed a 5000+ rated for 2.6GHz, it seems only fair to pit it against AMD’s entire 939 lineup to see where it fits into the grand scheme of things. In this comparison, I will not try to circumvent the underclocking nature of the memory at DDR2-800 speeds. This will be an ‘out of the box’ test at the default 5000+ configuration (see the ‘IMC Quirks and DDR2-742?’ section for more details). Since the memory is going to be running at only 371MHz, I will keep the timings at the tighter 4-4-4-12 configuration. You’ll notice that there is also an ‘AM2 @ 2.8GHz’ item in each test. This is the overclocked configuration I outlined in the previous section.
This is the configuration used for the ‘AM2 5000+’ in each test:
To emulate all of the 939 processors, I used a 4200+ and a 4400+ with varying multipliers. To emulate the 4600+ and 4800+, I used the 939 configuration outlined in the ‘Head to Head Testing’ section (10×240, not 10×241). For all of the 939 configurations, 2-2-2-5-1T timings were used and the memory was kept at exactly 200MHz.
Comparison to Other AMD Processors: Sisoft Sandra 2007
When looking at the above ‘Arithmetic/Multi-Media’ tests, it is pretty clear that the 5000+ fits very linearly into the product lineup. Raw CPU clock speed seems to make much more of a profound difference than memory timings in these tests.
No real surprise here. The AM2 based processors lead the pack in memory bandwidth. As you can see above, the memory performance increases as the CPU clock speed increases (due to the memory controller operating at the same speed as the CPU core).
At first glance, the ‘Random Access Latency’ graph above may look like a mess but there is a pattern. You’ll notice that the processors with 1MB of L2 cache per core (the Opterons, the 4400+ and the 4800+) generally have lower overall latency. Since this benchmark tests cache as well as memory, there is a greater chance that data can be found in the extra high-speed L2 cache on these models. The latency is also lower with higher speed processors due to the nature of the IMC. The 5000+ has higher overall memory latency than any of the processors tested. The fact that the memory is operating at 371MHz likely contributed to this. Thankfully the 5000+ has over 40% more memory bandwidth to help offset this higher latency. The overclocked 5000+ with its memory at a ‘true’ DDR2-800 had a fairly low latency compared to the default 5000+ configuration (even though the timings are higher). This graph is an interesting way to illustrate the differences between the two platforms, but there is a lot more to the overall picture than just memory latency.
Comparison to Other AMD Processors: SuperPI 1.4 Mod
Again, it was nice to see some fairly linear results. SuperPI is generally very dependent on high performance memory for the best scores. The overclocked 5000+ (with its memory at a ‘true’ 400MHz) really dominates the chart. I’d be curious to see where the FX-62 fits above. I’d guess that it could come close to breaking 30s in the 1M test.
Comparison to Other AMD Processors: 3D Benchmarks
It looks like 3DMark 2001SE didn’t like the higher memory latency of the 5000+ in its default configuration. This particular benchmark is very sensitive to memory performance. Since 3DMark 2001 is so CPU limited, a decrease from 500FPS to 480FPS does not amount to much in the real world. Aquamark3 remained fairly linear; however, the delta between the 4800+ and the 5000+ is pretty small.
The more modern benchmarks seem to be much more linear, and the 5000+ fits right into the lineup. 3DMark 2005 and 2006 are a much more accurate representation of modern gaming than 3DMark 2001 and Aquamark.
Comparison to Other AMD Processors: 3D Gaming
Again, a similar picture is painted. Low resolution gaming does take a small hit due to the increased latency of the 5000+ in its default configuration. This is most evident at 800×600 gaming. In a system with a very powerful GPU (SLI or Crossfire), there will likely be a similar decrease across the board.
Comparison to Other AMD Processors: Folding @ Home
Raw processing power is what matters most when crunching tinker WUs. The 5000+ fits very linearly into the lineup.
Comparison to Other AMD Processors: Drawing Conclusions
For the most part, the 5000+ fits nicely into AMD’s lineup and the performance is fairly linear across all models. Unfortunately, the increased latency at the 5000+ default configuration does take a minor toll in CPU limited 3D gaming. Since the memory runs at only 371MHz, tighter timings will be required to offset this. As you saw in the head to head comparison, the chip should have clearly outperformed the 4800+ across the board if the memory were running 30MHz higher. As we begin to see more low latency (3-3-3) DDR2-800, this will no longer be an issue at all.
I think there will inevitably be a lot of disappointed folks out there who really wanted to see huge performance gains from AM2. It is important to note that AM2 was never intended to be a major overhaul of the architecture but rather some minor tweaks to address a changing industry. AM2 is, after all, still the same K8 processor running a different type of memory. I definitely would not recommend selling your modern 939 system and jumping on the AM2 bandwagon looking for performance. If you have an older system and want to upgrade–this platform has a lot of future potential.
In my personal opinion, AM2 is a total success. AMD managed to take high latency DDR2 memory and integrate it into its existing architecture (that was never designed for it) without sacrificing any performance. Not only that, but AMD has clearly perfected their 90nm manufacturing process and the entire AM2 lineup consumes less electricity than its predecessors. If that wasn’t good enough, you as the consumer will now have the opportunity to purchase ‘Energy Efficient’ AM2 models which take power consumption to a whole new level. The physical changes to AM2 are welcome as well. The four-bolt retention frame is something that I wish had been incorporated into the 754/939 platforms from the beginning. Add some nifty virtualization tweaks and you’ve got yourself a good platform update.
DDR2 memory will inevitably mature to new performance levels. Timings of 3-3-3 are already emerging for DDR2-800 modules and will allow the integrated memory controller to perform closer to its full potential. There may also be future support for DDR2-1066. It will be very interesting to see what the future will bring to the DDR2 market. At any rate, faster DDR2 will make for a faster Athlon 64 processor.
“What about the Intel Conroe?” The K8 Hammer is a very smartly designed architecture that still has a lot of future potential. There are many design changes that could be made to significantly improve performance. Although this lineup of processors may not hold the performance crown for long, it may be this socket that does. With 65nm just around the corner, there could be some interesting developments from AMD in the not too distant future. There is an excellent article on the subject at Anandtech that I’d highly recommend: http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2748&p=1
All in all, I am pleased with what I have seen and I really look forward to seeing what the future will bring.
I’d like to sincerely thank AMD for sending us this AM2 review kit and for continuing to show support for this and other ‘tech oriented’ communities across the web. Another big thanks goes out to Paroxym for lending me his 4400+ for the performance comparisons. The poor guy has been stuck with my single core Opteron, and he’ll be the first to tell you that there IS a difference between single and dual cores.