
AMD HyperTransport Technology Explained

Supplied by AMD


AMD HyperTransport technology may usher in a new era of PC performance and
capability. HyperTransport technology isn’t a radical new gadget but a new
I/O bus that speeds up point-to-point communication between devices. This
may be old news to some, but the HyperTransport motherboard is coming…sooner
than you think.

HyperTransport technology was designed to be used as a high-speed
interconnect between chips inside a computer. It is not designed for
computers to communicate with each other. The HyperTransport I/O link
promises cheaper, faster and simpler connections.

HyperTransport technology can be applied to a myriad of products. To keep
the focus relatively narrow and comparatively brief, this article addresses
only the motherboard. Performance for the PC user is most limited at present
by the motherboard itself. The motherboard as we know it is nearing the end
of its useful life. What will the motherboard of the future be like? What
does a HyperTransport technology enabled motherboard promise?

highway

What
is HyperTransport technology?

By AMD’s definition, the HyperTransport technology I/O link is a narrow,
high-speed, low-power I/O bus that has been designed to meet the
requirements of the embedded markets, the desktop, workstation, and server
markets, and the networking and communication markets.

To de-PR babble
this statement, HyperTransport technology simply means a faster connection
that is able to transfer more data between two chips. This does not mean that
the chip itself is faster. It means that the capability exists via the HyperTransport
pathway for one chip to “talk” to another chip or device at a
faster speed and with greater data throughput.

Think of a HyperTransport I/O link as a highway between two cities, with the
cars being data. If there are a lot of cars on a two-lane highway, then there
are going to be traffic jams and possibly a few fender benders and scrapes.
The HyperTransport bus makes the highway wider and faster, allowing for
better traffic flow. This does not mean the cars are any faster; that is up
to the car builder. But the road is able to accommodate more cars, including
ones with bigger engines and the ability to carry more.

This highway, or bus, is an internal connection. On the motherboard level,
the HyperTransport bus connects all parts of the motherboard, such as the PCI
slots, AGP slots and USB ports, to the CPU and memory, and also provides the
connection between the CPU and memory itself. (It is a bit more complicated
than this: the Hammer processors actually connect to memory directly but use
HyperTransport to communicate with the memory of other processors in a
multiprocessor configuration, since HyperTransport is also a coherent bus.)
At present, the internal PCI bus used in current systems is a rather small
road, limited to a small amount of traffic, supporting a maximum data rate of
133 Megabytes per second. This may seem like a lot of data, but when compared
to the memory and CPU shoving data around at 2.1 Gigabytes per second and
video cards nearing 20 Gigabytes per second, the internal bus is very
limiting.
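
The 133 Megabytes per second ceiling follows directly from the width and
clock of the PCI bus. As a rough sketch, assuming the conventional 32-bit,
33 MHz flavor of PCI (the function name is made up for illustration), the
arithmetic looks like this in Python:

```python
def bus_megabytes_per_second(width_bits, clock_mhz):
    """Peak rate of a bus that moves `width_bits` once per clock cycle."""
    return width_bits / 8 * clock_mhz


# Assumption: conventional PCI is 32 bits wide and clocked at about 33.33 MHz.
pci = bus_megabytes_per_second(32, 33.33)

# Figures quoted in the article, converted to Megabytes per second.
cpu_memory = 2.1 * 1000
video_card = 20 * 1000

print(f"PCI peak: {pci:.0f} MB/s")
print(f"CPU/memory traffic is roughly {cpu_memory / pci:.0f}x PCI")
print(f"Video card traffic is roughly {video_card / pci:.0f}x PCI")
```

The ratios are only as good as the quoted peak figures, but they show the
scale of the mismatch the article is describing.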

The whole point of HyperTransport technology for the PC is to provide the
ability for data to be moved around faster and in larger quantities on the
motherboard within a computer system.

What
isn’t HyperTransport Technology?

Quite simply, HyperTransport technology does not and will not make EIDE hard
drives faster, nor will it speed up USB, nor will it make memory or
processors faster, although it can drastically reduce the latency between
these devices. However, HyperTransport technology could be incorporated into
integrated circuits where applicable, such as the forthcoming Hammer
processors from AMD, in order to assist in boosting performance.

So what…will it be faster?

The simple answer is yes, but how much faster depends on how
HyperTransport technology is implemented. Keep in mind the old saying, “you
are only as good as your weakest link.” HyperTransport is technology
that can be incorporated into any particular component or device in a PC.
It’s like a tune-up for the car engine. If all vehicles had the same
tune-up, then they would all run faster or have more horsepower or at least
have greater fuel efficiency; but each in their own way. HyperTransport technology
raises the performance bar in two ways.

1) Existing performance increases.

If a person were to attach existing components, such as the video card,
processor, RAM and hard drive, to a HyperTransport technology-based
motherboard, the components themselves would not be faster but would have
the ability to talk to each other at a faster rate and with reduced latency,
to name just two of the benefits that the bus provides. There would be a
performance increase.

2) Future performance increases

The next step is to build all the components integrated with
HyperTransport technology allowing for the individual components themselves
to be faster, and then provide a pathway between them that is faster and has
the ability to handle more data. HyperTransport is a technology that provides
a new link between devices such as between an integrated video chip, the I/O
hub (South bridge), 64-bit connections such as PCI-X and PCI-X 2.0, memory
and the CPU. HyperTransport technology can be applied to nearly every pathway
that communicates data between two points. Therefore if HyperTransport technology
is applied to everything inside a desktop PC, then the performance bar is
raised even more.

agp8x

interconnectbandwidth

pci-x

usb20

How much faster?

This is not a wholly easy explanation. How much faster depends on many
factors and on which particular components are being discussed. There are a
lot of big numbers being tossed around, and the biggest claim AMD is making
is an aggregate throughput of 12.8 Gigabytes per second on a 32-bit
HyperTransport link. Aggregate means the total amount of data throughput in
both directions combined; it does not mean that a particular pathway between
two points operates at 12.8 Gigabytes per second in one direction. The whole
picture is rather complicated and it is the proverbial house of cards, both
in how every card relies on the others and in how each particular card is
built.
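
To see where 12.8 Gigabytes per second comes from, here is a minimal sketch
of the arithmetic. It assumes that a HyperTransport link transfers data on
both edges of the clock, so an 800 MHz link makes 1,600 million transfers per
second in each direction; the function name is invented for illustration.

```python
def ht_gigabytes_per_second(clock_mhz, width_bits):
    """Return (per-direction, aggregate) peak bandwidth in Gigabytes per second."""
    transfers_per_second = clock_mhz * 1_000_000 * 2   # two transfers per clock
    per_direction = transfers_per_second * width_bits / 8 / 1e9
    aggregate = per_direction * 2                      # both directions at once
    return per_direction, aggregate


per_direction, aggregate = ht_gigabytes_per_second(800, 32)
print(f"each direction: {per_direction} GB/s, aggregate: {aggregate} GB/s")
```

In this model each direction carries 6.4 Gigabytes per second, and the 12.8
figure only appears when both directions are added together.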

But it’s not that easy.

In order to reach these staggering bandwidth figures, the HyperTransport bus
must be communicating at its peak speed of 800 MHz and there needs to be a
32-bit link. (HyperTransport supports 40-bit addressability right now and
will support 64-bit addressability early next year. The addressability has
nothing to do with the width of the link; HyperTransport supports link
widths of 2, 4, 8, 16, and 32 bits.) AMD’s current 32-bit processors operate
at a 266 MHz front side bus speed, with a 333 MHz front side bus speed
processor just around the corner. With the Hammer processor, the front side
bus disappears, as the memory controller is integrated directly into the
processor core, eliminating the need for a North bridge chip and a front
side bus between the two.

HyperTransport technology supports clock rates ranging from 200 MHz to
800 MHz, and data paths may be 2, 4, 8, 16, or 32 bits wide. Data paths do
not have to be symmetric in data width or clock rate, meaning upstream and
downstream data widths and transfer clocks can each be set to what is
considered optimal for the application.
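
As an illustration of an asymmetric link, the sketch below gives the
downstream and upstream directions different widths and clocks. The
particular configuration is hypothetical, the class is invented for this
example, and the calculation assumes two transfers per clock cycle.

```python
from dataclasses import dataclass

VALID_WIDTHS = (2, 4, 8, 16, 32)          # link widths in bits, per the text
MIN_CLOCK_MHZ, MAX_CLOCK_MHZ = 200, 800   # clock range, per the text


@dataclass
class Direction:
    width_bits: int
    clock_mhz: int

    def __post_init__(self):
        if self.width_bits not in VALID_WIDTHS:
            raise ValueError(f"unsupported link width: {self.width_bits}")
        if not MIN_CLOCK_MHZ <= self.clock_mhz <= MAX_CLOCK_MHZ:
            raise ValueError(f"unsupported clock: {self.clock_mhz} MHz")

    def megabytes_per_second(self):
        # Assumption: data moves on both clock edges (two transfers per cycle).
        return self.clock_mhz * 2 * self.width_bits / 8


# Hypothetical device that receives far more data than it sends back.
downstream = Direction(width_bits=16, clock_mhz=800)
upstream = Direction(width_bits=4, clock_mhz=400)

print("downstream:", downstream.megabytes_per_second(), "MB/s")
print("upstream:  ", upstream.megabytes_per_second(), "MB/s")
```

A narrower, slower return path needs fewer traces, which is the kind of
trade-off the asymmetry is meant to allow.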

But it’s still not that easy.

Not only is there the CPU connection, but a whole host of other connections
with their inherent challenges. The HyperTransport highway is there, but many
players must come to the table to properly utilize it. Each hardware
manufacturer has to fully incorporate HyperTransport technology…from the
chips supplied to other manufacturers to the connections between these
points made by the makers of end products such as video cards, RAM and sound
cards. The good news is that software designers will not have to re-engineer
the device drivers, the BIOS or the operating system to properly use the new
faster link, as HyperTransport technology is fully software compatible with
PCI, preserving PCI’s command protocol in three important ways: PCI
enumeration in real-time operating systems, specific PCI silicon and device
drivers, and PCI-specific development and debug tools.

It is true that the current motherboard chipset and bus is reaching the end
of its life and a new platform is on the horizon but it will take time.

Everybody on the bus.

Yes Virginia…there is HyperTransport technology. ALi, NVIDIA, and AMD have
tabled chipsets that make use of HyperTransport technology and NVIDIA has
produced partial HyperTransport technology motherboards.

In early 2002, ALi Corporation introduced the M1687 North bridge and the
M1563 South bridge, supporting the AMD Opteron™ and AMD Athlon™ processors
based on Hammer technology. The M1687 only controls communication with AGP,
since the memory controller is integrated into the processor core. It is
equipped with AGP-8X support and a HyperTransport™ bus to link to the M1563
South bridge. The M1687 North bridge can reach up to 6.4 Gigabytes per
second between it and the AMD Hammer CPU and up to 1.6 Gigabytes per second
in each direction between the North and South bridge chips.

NVIDIA counters with the nForce and nForce2 family of media and
communications processors, which come with an equal complement of
descriptive words like “stunning”, “uncompromising”, “impressive” and
“feature-rich”.

nforce2-chiplayout

AMD
has taken a subtly different approach with the AMD-8000 series chipsets, mainly
the AMD-8151 HyperTransport AGP 3.0 Graphics Tunnel, the AMD-8131 HyperTransport
PCI-X Tunnel and the AMD-8111 HyperTransport I/O Hub (South bridge).

mobo_hyper_northbridge

mobo_hyper_southbridge

NVIDIA has presented a solution to use current technology in conjunction
with HyperTransport technology. The connection between the NVIDIA nForce IGP
(Integrated Graphics Processor), SPP (System Platform Processor), and the
NVIDIA MCP/MCP-T (Media & Communications Processor) is built on HyperTransport
technology. The maximum throughput between the two processors is 800 Megabytes
per second (400 Megabytes per second each way); a far cry better than the
133, 266 or even 533 Megabytes per second connections that currently exist.

nforce2-IGP

nforce2-MCP

NVIDIA entered the HyperTransport technology based market with an IGP built
with an integrated GeForce2 MX graphics core running at 4x AGP. The public’s
performance expectations were high and nForce debuted with mixed results.
The nForce chipset didn’t fail as the board was feature-rich with video,
audio, LAN, and USB and the price was right. The on-board video was the Achilles
heel, dragging down the performance benchmarks compared to the GeForce3 equipped
motherboards. If a person were to bypass the onboard video in favor of an
add-on video card, then nForce boards kept pace with the competition.

The nForce2 chipset will come equipped with GeForce4 MX on-board video,
which will satisfy many frame-hungry gamers, though this is the mainstream
series of GeForce4, which has less “oomph” than the Ti series of retail
cards.

ALi takes a side step and, perhaps, a step up by incorporating HyperTransport
technology from the CPU to the North bridge and finally to the South bridge.
(This is a standard implementation actually; ALi uses the term North bridge
while AMD uses AGP3.0 Graphics Tunnel. They offer the same functionality. A
North bridge typically contains a memory controller, whereas the ALi part
does not, since the memory controller is integrated on the Hammer processor
core.) Speeds can reach up to 6.4 Gigabytes per second between the CPU and
the North bridge and remain fixed at 1.6 Gigabytes per second in each
direction (3.2 Gigabytes per second aggregate throughput) on the connection
between the North and South bridges.
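
A figure such as 1.6 Gigabytes per second in each direction does not pin down
a single link configuration. The sketch below searches the clock rates and
link widths listed in this article for combinations that produce it. It
assumes two transfers per clock cycle, and it does not claim to identify
which combination ALi actually uses.

```python
from itertools import product

CLOCKS_MHZ = (200, 400, 500, 600, 800)   # clock rates listed in this article
WIDTHS_BITS = (2, 4, 8, 16, 32)          # link widths listed in this article


def megabytes_per_second(clock_mhz, width_bits):
    # Assumption: two transfers per clock cycle, counted for one direction.
    return clock_mhz * 2 * width_bits / 8


def configurations_for(target_mb_s):
    """Every (clock, width) pair whose one-way peak equals the target."""
    return [
        (clock, width)
        for clock, width in product(CLOCKS_MHZ, WIDTHS_BITS)
        if megabytes_per_second(clock, width) == target_mb_s
    ]


print(configurations_for(1600))
```

Several pairs qualify, which is why a narrow link at a high clock and a wide
link at a low clock can be advertised with the same bandwidth number.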

AMD has taken the biggest step forward with the 8000 series. When referring
to these chipsets, the traditional terms of “North bridge” and “South
bridge” are quickly corrected and replaced by “Tunnel” and “Hub”.

The AMD-8111 HyperTransport I/O Hub replaces the traditional South bridge
chipset and connects devices such as the PCI slots, EIDE channels, Ethernet,
audio, and USB, among others. In the case of a motherboard, the I/O Hub then
connects to the AMD-8151 HyperTransport AGP3.0 Graphics Tunnel for
connecting video devices and finally through to the processor itself. The
term “tunnel” means exactly what it says, though perhaps “T” would be a
better description. A data signal can pass right through the HyperTransport
Tunnel on its way to another point, or the tunnel itself can connect a
secondary data signal to the main data highway. Since HyperTransport has
such a high data bandwidth, it can support multiple tunnel chips, each of
which can support full bandwidth to numerous connections such as PCI-X. So,
for example, if a server implementation required maximum bandwidth on all
four PCI-X slots, it would be simple just to add additional HyperTransport
PCI-X tunnel chips.
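
The pass-through behavior can be pictured with a toy model: treat the chips
as a simple chain hanging off the processor, where traffic from any chip
crosses every tunnel between it and the CPU. This is only an illustrative
sketch; the chain layouts and the function name are assumptions, not AMD
reference designs.

```python
def route_to_cpu(chain, source):
    """Return the chips that traffic from `source` crosses on its way to the CPU.

    `chain` lists the chips in order, starting at the CPU.
    """
    position = chain.index(source)
    return chain[position::-1]


# Desktop-style chain described in the text: CPU, AGP tunnel, I/O hub.
desktop = ["CPU", "AMD-8151 AGP tunnel", "AMD-8111 I/O hub"]
print(route_to_cpu(desktop, "AMD-8111 I/O hub"))

# Hypothetical server chain with two PCI-X tunnels added for more slots.
server = [
    "CPU",
    "AMD-8131 PCI-X tunnel A",
    "AMD-8131 PCI-X tunnel B",
    "AMD-8111 I/O hub",
]
print(route_to_cpu(server, "AMD-8131 PCI-X tunnel B"))
```

Adding a tunnel lengthens the chain without changing how the chips further
along it are reached, which is the sense in which tunnels can simply be added.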

Everybody off the bus

The problem with any of these HyperTransport solutions is that current
devices that connect to the motherboard via 33, 66, 100 or 133 Megabytes per
second connections remain at those throughput limitations until the data
enters the HyperTransport highway. It’s like having a baseball player walk
to first then run like hell to home. It’s hoped that the average will show
an overall performance increase. (HyperTransport does greatly reduce the
latencies between these devices, though, and supports other benefits such as
concurrency, which is data communication in both directions simultaneously,
something that PCI does not support.)

But why not apply HyperTransport technology to peripherals like removable
drives, USB or FireWire? Sorry. HyperTransport technology only provides
chip-to-chip high performance connections within the system core, such as on
the motherboard. If a device such as a hard drive is connected, it has its
own interface that operates at a specific throughput. Hard drive interface
throughput may be increasing, as evidenced by Serial ATA (150 Megabytes per
second), but that is the limitation of the drive technology. If the data
travels along the hard drive cable for a couple of feet, the extra 3 inches
on the motherboard from the header to the hub (or South bridge) aren’t going
to matter. There’s no performance gain in having 3 inches of big pipe to the
EIDE headers from the hub. However, if large amounts of traffic were coming
in from multiple connections at the same time, such as from EIDE, 1394,
broadband network, and devices on the PCI slots, the HyperTransport bus
could more than handle the traffic.
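
Both halves of that argument can be put into numbers with a small sketch. A
single transfer is capped by its slowest hop, while a shared link only
becomes a problem when the combined traffic approaches its capacity. The
Serial ATA, PCI and 1.6 Gigabytes per second figures come from this article;
the 1394 and Ethernet rates are assumed peak values (400 and 100 megabits per
second), and the function names are invented for illustration.

```python
def path_throughput(hops):
    """A transfer can go no faster than the slowest hop along its path."""
    return min(rate for _, rate in hops)


def link_utilization(sources, link_capacity):
    """Fraction of a shared link consumed when every source runs flat out."""
    return sum(sources.values()) / link_capacity


HT_LINK_MB_S = 1600   # one direction of the hub link, using the ALi figure

# One drive talking to the CPU: the cable, not the bus, sets the pace.
drive_path = [("Serial ATA cable", 150), ("HyperTransport hub link", HT_LINK_MB_S)]
print("drive transfer limit:", path_throughput(drive_path), "MB/s")

# Everything arriving at the hub at once (Megabytes per second).
sources = {"Serial ATA": 150, "PCI slots": 133, "1394": 50, "Ethernet": 12.5}
print(f"hub link utilization: {link_utilization(sources, HT_LINK_MB_S):.0%}")
```

Even with every source saturated, this hypothetical mix uses well under half
of the link, which is the headroom the paragraph above is pointing at.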

Big Performance Promises.

The public relations spin on huge bandwidth and speed increases
doesn’t mean much until the user experiences it. A completely HyperTransport
enabled PC doesn’t exist as of yet and may not for some time. But let’s
break down this bandwidth promise connection by connection and stage-by-stage
to see what it means and when it will come about.

Currently a readily available motherboard looks
like this.

present_mobo_speeds

Motherboard performance could increase taking
ALi’s HyperTransport based chipset into account.

ali_hyper

Why do all HyperTransport diagrams show the RAM coming directly off the CPU
instead of the North bridge? Because the memory controller on the Hammer
processors is built directly into the processor core. It is no longer
contained in the North bridge. This greatly reduces memory latency and
allows the memory controller to run at the full speed of the processor
instead of the speed of the front side bus, revolutionizing the way
x86-based processors access main memory. The multiprocessor AMD Opteron
processors based on Hammer technology incorporate a dual-channel DDR DRAM
controller with a 128-bit interface capable of supporting up to eight DDR
DIMMs (four per channel). When used with PC2700 memory, rated at 333 MHz,
the memory bandwidth available to the processor becomes 5.3 Gigabytes per
second. And, since the memory controller now operates at the same gigahertz
speeds as the processor, latency is further reduced as processor frequency
scales.
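
The 5.3 Gigabytes per second figure can be checked with a few lines of
Python. This is a back-of-the-envelope sketch that treats the 333 MHz rating
as 333 million transfers per second across the full 128-bit interface; the
function name is made up for the example.

```python
def memory_gigabytes_per_second(interface_bits, mega_transfers_per_second):
    """Peak bandwidth of a memory interface in Gigabytes per second."""
    return interface_bits / 8 * mega_transfers_per_second / 1000


# Dual-channel, 128-bit interface with PC2700 memory, as described in the text.
dual_channel = memory_gigabytes_per_second(128, 333)
# A single 64-bit channel for comparison.
single_channel = memory_gigabytes_per_second(64, 333)

print(f"dual channel:   {dual_channel:.1f} GB/s")
print(f"single channel: {single_channel:.1f} GB/s")
```

The single-channel result of about 2.7 Gigabytes per second is where the
PC2700 name comes from; doubling the interface width doubles the peak.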

Understanding what HyperTransport technology can do is quite easy after
you’ve read the 3 or 4 hundred pages that I have. It’s all math, and
HyperTransport is only the pathway. The clock speed and the bit width at
which the data is sent determine the potential throughput. Even though a
processor may have a 64-bit bus, it still has to communicate, with current
processor availability, at a maximum of 32 bits. The AMD Hammer will change
that as it is designed for a 64-bit system. (32-bit versions will be
available as well.)

HYPERTRANSPORT BANDWIDTH SCALABILITY CHART

              Link Width
Clock Rate    2 bits      4 bits      8 bits      16 bits     32 bits
200 MHz       0.8 Gbps    1.6 Gbps    3.2 Gbps    6.4 Gbps    12.8 Gbps
400 MHz       1.6 Gbps    3.2 Gbps    6.4 Gbps    12.8 Gbps   25.6 Gbps
500 MHz       2.0 Gbps    4.0 Gbps    8.0 Gbps    16.0 Gbps   32.0 Gbps
600 MHz       2.4 Gbps    4.8 Gbps    9.6 Gbps    19.2 Gbps   38.4 Gbps
800 MHz       3.2 Gbps    6.4 Gbps    12.8 Gbps   25.6 Gbps   51.2 Gbps
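
Every cell in the chart appears to follow one rule: clock rate times two
transfers per cycle times link width, counted in gigabits per second for a
single direction. The sketch below checks that reading against the published
values. The two-transfers-per-cycle and single-direction interpretation is an
assumption drawn from the numbers themselves, not something the chart states.

```python
WIDTHS_BITS = (2, 4, 8, 16, 32)

# Values copied from the chart, one row per clock rate (MHz), in Gbps.
CHART = {
    200: (0.8, 1.6, 3.2, 6.4, 12.8),
    400: (1.6, 3.2, 6.4, 12.8, 25.6),
    500: (2.0, 4.0, 8.0, 16.0, 32.0),
    600: (2.4, 4.8, 9.6, 19.2, 38.4),
    800: (3.2, 6.4, 12.8, 25.6, 51.2),
}


def gigabits_per_second(clock_mhz, width_bits):
    # Assumption: two transfers per clock cycle, counted for one direction.
    return clock_mhz * 2 * width_bits / 1000


mismatches = [
    (clock, width)
    for clock, row in CHART.items()
    for width, published in zip(WIDTHS_BITS, row)
    if abs(gigabits_per_second(clock, width) - published) > 1e-9
]
print("cells that disagree with the formula:", mismatches)
```

Dividing the 51.2 Gbps corner cell by 8 gives 6.4 Gigabytes per second in one
direction, which doubles to the 12.8 Gigabytes per second aggregate quoted
earlier.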

So
how would a full fledged HyperTransport motherboard possibly look? It will
be interesting as you’ll note that the system memory connects directly to
the processor.


That’s
how HyperTransport technology will change the design landscape of the motherboard.

Say Again?
How will HyperTransport change the motherboard?

mobo_hyper_southbridge

HyperTransport technology brings many benefits to motherboard PCB design. It
reduces the number of traces (wires) needed to connect points. It lowers the
voltage required for points to properly operate. It simplifies the overall
complexity of PCB design.

Reducing the number of traces required for connecting two points, be it
chip-to-chip or PCI slot to CPU, reduces complexity of design. A motherboard
PCB is made up of a series of layers of connections. There just isn’t
sufficient room on a motherboard to connect every point on a single layer.
The solution is to stack connection layers on top of each other and route
connections both within a layer and vertically through the layers. If the
number of traces is reduced then three things can happen. The first is that
the motherboard itself is simpler to design and produce, theoretically
resulting in a lower cost. The second is that the reduction in traces
results in more space, allowing for more devices to be attached on an
existing motherboard size. The third would be an overall reduction in
motherboard size while still keeping the same number of available device
connections.

In the diagram
below, the AGP link on the right (purple and red) is an example of the current
amount of traces required and on the left is the same AGP link built on HyperTransport
technology.

hyper_vs_agp_traces

The result of
less cost or more devices or smaller size depends on the manufacturer –
consumer equation. The manufacturer wants to sell more products and this depends
on what the consumer wants or rather is told what they want.

Will the HyperTransport
motherboard of the future be a feature rich PCB that greatly surpasses the
performance standards of today yet fits into a bread loaf sized box? Or will
it be redesigned to allow for a greater number of PCI slots, peripheral connections
and RAM slots?

To answer this question it is best to think about what is easiest for the
consumer to transition to. 64-bit processors, such as the AMD Hammer
processors, allow the 4-Gigabyte memory barrier to be lifted to a ridiculous
level of terabytes. HyperTransport technology would allow PCB manufacturers
to include more RAM DIMM slots, allowing for 6 or 8. It may seem a bit silly
to include more DIMM slots when a consumer can go out and buy new RAM of
greater size. But most consumers find it easier to justify buying such
components in stages. Consumers may not like being forced into discarding
the old for the new. Consumers want to get more for less. To that end a PC
user who has two or four 256 MB modules may be more easily persuaded into
buying 2 or 4 more 256 MB modules to add to the existing amount. It depends
on what they can afford or want to pay at the given time. This cost saving
from not wasting existing components makes the transition to new technology
easier.

This is of course
in addition to the lure of a staggering performance increase.

Lowering the voltage may not matter to the end user as much as it does to
the technology designer and manufacturer. Increasing data throughput requires
more voltage. As a signal passes through a connection or wire it generates
an electrical field which can influence the wire next to it. This is known
as crosstalk, and crosstalk is similar to hearing someone else’s
conversation, one that you didn’t call, on your phone. For a computer this
is bad.

crosstalk

The solution
is to reduce the voltage and make connections more efficient in design. HyperTransport
technology utilizes LVDS (Low Voltage Differential Signaling).

LVDS does not mean that a 100-watt power supply will now do instead of a 300
or 400 watt power supply. It means that the power connections to points on
the PCB are simplified. One connection to a point is sufficient where two
were needed before. Also the complexity and placement of power connections
on the PCB is reduced. It’s best not to think of this like plugging a PC
into a wall socket for power. Every chip, diode, MOSFET and little
“fiddly-bit thing” on a motherboard requires power, and each must be hooked
into supporting things like power regulators and such. Present motherboard
design requires certain components to be placed relative to each other in
order to function. HyperTransport technology takes away a lot of the design
limitations.

The third design change could be a radical change in motherboard PCB size.
Smaller is better, and this could mean a PC that fits in less space than
today’s small-scale PCs such as Shuttle’s barebones systems. Remember that
space reduction is a result of HyperTransport technology. This same space
reduction can be applied to video card PCBs and sound card PCBs. Also,
feature-rich motherboards are edging their way into today’s market, as
evidenced by ABIT’s AT7 series. These motherboards pack on features that
satisfy the demanding PC consumer and only require the addition of a video
card, hard drive, processor, disc device, floppy and RAM.

syste_architecture

Consumer reaction

Will the consumer readily accept these types of systems? On one side are the
DIY PC enthusiasts who prefer to choose their own components and not run the
risk of having to replace an entire motherboard if one device, such as the
sound card, fails. On the other side is the pre-built buyer who wants little
hassle and an acceptable price. The “Dell” buyer will be attracted to all
the add-on features and enjoy the simpler choices of amount of RAM, speed of
processor and whether they get a DVD, CD or CDRW.

From a marketing point this will be very lucrative to system
builders such as Dell and Gateway. They will be able to attract new and return
buyers and keep profit margins acceptable. It will definitely mean a cheaper
motherboard to manufacture that will allow system builders to increase prices
marginally but, more importantly, increase their profit margin. Moving to
the next level it will allow for more on-board features to be put on a motherboard
again at the less expensive/bigger profit margin level.

It will be interesting to see what the design implementation will be for
each “new model year” as HyperTransport technology moves into the
marketplace.

It is certain that change will come, but it will be gradual. In five years’
time the 19″ PC tower may be a memory, and this opens up a whole new area of
home entertainment, communication and workstation products.

The twist to this lies in the hallways of manufacturers and
vendors. Pulling in the opposite direction will be those that make their living
from producing add-on products such as sound cards, video cards, wireless
PCI devices and more. These people want to sell you specialty products that
are “newer, better and faster” than what could come on-board the
motherboard PCB and they don’t want you buying one either. They want
you to upgrade, upgrade, upgrade. Will those manufacturers ally themselves
with motherboard makers, as ATI and NVIDIA have done with on-board video?

Is this the end of choice?

No. The world doesn’t work that way. The perfect product would be quickly
buried someplace never to see the light of day, for fear it may bring the
economy to a grinding halt. Consumers like choice and “one size does not fit
all”. The industry will evolve, and as technology improves the quality of
devices like on-board audio, it is certain there will always be a
bigger-better-faster add-on product as an alternate choice.

This is only the beginning of a new era in motherboard technology and performance.
HyperTransport technology brings the promise of the “end of the bottlenecks”
and it certainly will be fun to run those choices on a dual 64-bit processor
motherboard with 12.8 Gigabytes per second of peak aggregate available bandwidth…wouldn’t
it?

mobo_hyper_southbridge_top

The author wishes to thank those at AMD who contributed their expertise and
patience. For more reading visit AMD and also the HyperTransport website.

Some images representational.
