GPU Folding so far......

Krazeyivan Newcastle, UK
edited October 2006 in Folding@Home
Hi All

Just thought I would keep you up to date - early days - I am keeping the X1900XT at 2D clocks to start with. Cat 6.5 drivers and the latest DirectX 9.

Am running 1 GPU client and 1 CPU client - both cores are flat out - it seems the first core is spending all its time sending data to the card.

I am still not sure if EM3 works with this yet, but I can tell you that with Project 2725 (Run 0, Clone 248, Gen 0) I have 5% complete in just under 30 minutes.

No idea on points yet, just keeping you up to date.

Oh, I nearly forgot - GPU temps via ATITool report 55C and 16.4A (this normally sits at 5.5A at 2D speeds).

Comments

  • QCH Ancient Guru Chicago Area - USA Icrontian
    edited October 2006
    AWESOME news... keep us informed on this!!! :thumbsup:
  • Sledgehammer70 California Icrontian
    edited October 2006
    Seems to use a fair bit of power to do the calcs :( I bet power-efficiency-wise a CPU wins by a landslide.
  • FoldingAddict Montgomery, AL
    edited October 2006
    It's actually not awesome news. I've been reading over at OCF that on a Core 2 Duo based machine, it takes a whole core to feed the ATI card that is folding a unit. So you lose one folding core in your dual core to feed the card, while the other core continues. The thing is, GPU folding so far is only worth about 450 points per day, and on a fast C2D machine, that actually produces a loss in points.

    No doubt the project is better off with the GPU folding instead of the other core, but the current point levels provide no motivation to go out and purchase an X1900XTX. But it's still beta; who knows what's going to happen.
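
    To put rough numbers on that trade-off, here is a quick Python sketch - all figures are the ballpark estimates floating around this thread, not benchmarks:

    # Rough net-PPD check for GPU folding on a dual core, using the
    # thread's own ballpark figures (estimates, not measurements):
    GPU_PPD = 450            # early GPU client estimate quoted above
    CORE_PPD_TYPICAL = 150   # garden-variety Gromacs WU, per core
    CORE_PPD_FAST = 600      # best-case WU on a fast C2D core

    for core_ppd in (CORE_PPD_TYPICAL, CORE_PPD_FAST):
        # the GPU client ties up one core, so that core's PPD is forfeited
        net = GPU_PPD - core_ppd
        print(f"core worth {core_ppd} PPD -> net change {net:+d} PPD")

    So whether GPU folding gains or loses points depends entirely on what the displaced core would have earned.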

    ~FA
  • Garg Purveyor of Lincoln Nightmares Icrontian
    edited October 2006
    I expect processing efficiency and point attribution to greatly increase over time, and I appreciate these early beta testers giving it a shot!
  • Krazeyivan Newcastle, UK
    edited October 2006
    Early testers (not me, but from what I have read) run 2 CPU clients as services at 95%/low priority and the GPU client NOT as a service at 100%/low, and it runs OK.
    It seems at this stage that the GPU needs about 30% of the CPU's time, but with the CPU having to wait (I'm not entirely sure what causes that), CPU usage shows up at 100%.

    Also check out this link - note the GPU... and the number!

    http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats
  • Krazeyivan Newcastle, UK
    edited October 2006
    Just a quick update - the log shows I get to 85%, then it stops reporting. After a bit of detective work, it's actually fine - it seems it refuses to tell you about the other 15%, but it completes the rest of the unit. Stanford are aware of the issue too.
  • Sledgehammer70 California Icrontian
    edited October 2006
    OS Type - TFLOPS - Active CPUs
    Mac OS X - 4 - 7655
    Linux - 20 - 16936
    GPU - 15 - 208

    Wow, so 250 GPUs would push more TFLOPS than 16,936 CPUs under Linux??? Good god....
  • the_technocrat IC-MotY1 Indy Icrontian
    edited October 2006
    Sledgehammer70 wrote:
    OS Type - TFLOPS - Active CPUs
    Mac OS X - 4 - 7655
    Linux - 20 - 16936
    GPU - 15 - 208

    Wow, so 250 GPUs would push more TFLOPS than 16,936 CPUs under Linux??? Good god....

    How is that even possible? Has to be a mistake.
  • Sledgehammer70 California Icrontian
    edited October 2006
    It is like a 60 times performance increase :)
  • the_technocrat IC-MotY1 Indy Icrontian
    edited October 2006
    Sledgehammer70 wrote:
    It is like a 60 times performance increase :)

    either:

    - the GPU client has some serious optimizations
    - GPUs have some huge architectural advantage for folding
    - the GPU client is reporting incorrectly
    - somebody's GPU is going to be a flame ball soon

    I can't see how that much computing power can be contained in a GPU without serious heat issues...:wow2:
  • shwaip bluffin' with my muffin Icrontian
    edited October 2006
    CPUs are inefficient at floating point operations; GPUs are built to do them massively in parallel.

    Since that chart shows TFLOPS (floating point operations per second), it makes sense that GPUs have a much higher flops/processor ratio.
  • the_technocrat IC-MotY1 Indy Icrontian
    edited October 2006
    shwaip wrote:
    CPUs are inefficient at floating point operations; GPUs are built to do them massively in parallel.

    Since that chart shows TFLOPS (floating point operations per second), it makes sense that GPUs have a much higher flops/processor ratio.

    In this case, 'massively' comes out to 71,428,571,428 flops per GPU. Them's a lot of flops. :buck:
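
    For anyone sanity-checking the arithmetic, here's a quick Python sketch using the three rows quoted above (a rough check on one snapshot of the stats page, nothing more):

    # Flops per active client from the quoted osstats rows,
    # read as (TFLOPS, active clients) per OS type.
    stats = {
        "Mac OS X": (4, 7655),
        "Linux":    (20, 16936),
        "GPU":      (15, 208),
    }

    for os_name, (tflops, clients) in stats.items():
        flops_each = tflops * 1e12 / clients   # flops per client
        per_tf = clients / tflops              # clients per TFLOPS
        print(f"{os_name}: {flops_each:.3g} flops each, {per_tf:.0f} per TF")

    The GPU row works out to roughly 7.2e10 flops per card, which is the same ballpark as the figure above.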
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2006
    FoldingAddict wrote:
    The thing is, GPU folding so far is only worth about 450 points per day, and on a fast C2D machine, that actually produces a loss in points.
    Not necessarily so. It all depends upon the work unit. The majority of work units are only worth about 150 points per day, per core. Sure, some will process at 600+ PPD, but not the garden-variety Gromacs.
  • the_technocrat IC-MotY1 Indy Icrontian
    edited October 2006
    I wonder what the newer physics cards could do...
  • shwaip bluffin' with my muffin Icrontian
    edited October 2006
    And each GPU runs at around 625 MHz, which means that the GPU does ~110 floating point operations per cycle. That is probably higher than the true number, but the X1900XTX has 48 pixel shader processors and 8 vertex processors.

    I don't know how efficient these are at floating point, or how they're being used, but the flops numbers probably aren't too inaccurate.
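
    Following that reasoning, a back-of-the-envelope check in Python (assuming the quoted 15 TFLOPS across 208 GPUs and the 625 MHz clock):

    # Ops per cycle implied by the osstats snapshot, assuming
    # 15 TFLOPS across 208 active GPUs clocked at 625 MHz.
    flops_per_gpu = 15e12 / 208          # ~7.2e10 flops per GPU
    ops_per_cycle = flops_per_gpu / 625e6
    shaders = 48 + 8                     # pixel + vertex processors

    print(f"~{ops_per_cycle:.0f} flops per clock")                 # ~115
    print(f"~{ops_per_cycle / shaders:.1f} per shader, per clock")  # ~2.1

    About two flops per shader per clock is plausible for a multiply-add unit, so the ~110 estimate looks about right.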
  • Enverex Worcester, UK Icrontian
    edited October 2006
    So it works out (in active clients per TF):

    Mac: 1913 per TF
    Windows: 1056.3 per TF
    Linux: 847 per TF
    GPU: 14 per TF

    So what exactly does that mean? The GPU client is the most efficient followed by Linux, Windows, etc? Or does it mean the GPUs are most powerful, then the processors of people running Linux etc?

    But there is one thing you're missing, people, especially in statements like "It is like a 60 times performance increase". The CPU gauge is counting processors from as far back as anyone has reported work units, so you're not comparing to the latest Core Duo or Athlon 64 FX X2; this is also comparing against Pentium 90s and K6-2s all averaged out, so the insane performance increase may not be as phenomenal as you think. In short, you're comparing the average of all processors against the most powerful GPU ATi currently has.

    Also I'm curious what this means: "*TFLOPS is actual flops from the software cores, not the peak values from CPU specs."
  • shwaip bluffin' with my muffin Icrontian
    edited October 2006
    Enverex wrote:
    Also I'm curious what this means: "*TFLOPS is actual flops from the software cores, not the peak values from CPU specs."

    It means that they calculate this from the empirical data, rather than AMD's claim that their processor can do x flops.
  • Sledgehammer70 California Icrontian
    edited October 2006
    Actually Enverex, I would be surprised if over 50,000 active CPUs in the Windows-based lineup are less than P3s - maybe in the full lineup of 1,500,000 CPUs - but my "60 times" was based on current active CPUs.
  • Enverex Worcester, UK Icrontian
    edited October 2006
    Sledgehammer70 wrote:
    Actually Enverex, I would be surprised if over 50,000 active CPUs in the Windows-based lineup are less than P3s - maybe in the full lineup of 1,500,000 CPUs - but my "60 times" was based on current active CPUs.

    True but I'm sure there are still quite a few underperformers of different sorts in there dragging the average down a lot.

    Isn't there any way to benchmark, er... 'terrafloppage' on a processor?
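
    There's no standard 'terrafloppage' meter, but one crude way to estimate sustained flops is to time a big matrix multiply, which costs about 2n^3 operations. A sketch in Python with NumPy (really it benchmarks the underlying BLAS library, so treat the result as a rough upper bound for tuned code):

    # Crude sustained-flops estimate: time an n x n matrix multiply.
    import time
    import numpy as np

    n = 2048
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)

    t0 = time.perf_counter()
    a @ b
    elapsed = time.perf_counter() - t0

    # an n x n matmul performs ~2*n^3 floating point operations
    print(f"~{2 * n**3 / elapsed / 1e9:.1f} GFLOPS sustained")

    As the Stanford footnote quoted above says, though, their TFLOPS figure is measured from the actual science cores, not from a synthetic benchmark like this.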
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2006
    If anyone here is not familiar with GPU Folding@Home, take a look at AnandTech's informative article on the topic.
  • edcentric near Milwaukee, Wisconsin Icrontian
    edited October 2006
    This makes me start thinking about Torrenza and the rebirth of co-processors.
    The reason that GPUs can put up such numbers is strictly architecture. A CPU (C2D) may have 300M transistors, but how much of that is tied up in 4MB of cache and other overhead functions? Remember, the P4 has over 200M transistors and it couldn't do enough math to save its name.
    In a GPU you have almost 400M transistors, the bulk of which are simply for crunching numbers.
  • Ultra-Nexus Buenos Aires, ARG
    edited October 2006
    It seems the CPU cycles the GPU client takes are irrelevant, meaning that it doesn't matter if you have a C2D or a Sempron... all the processing is still done on the GPU... this has to be confirmed, though.
  • Krazeyivan Newcastle, UK
    edited October 2006
    Well, since my 85% problem - it finished about 2 hours later without further messages - it has auto-updated core 10 to version 0.06; not sure yet what the difference is.
    Oh, and the PWM temp has gone up 1C (41C) and the chipset has gone up 3C (43C) with running the GPU all the time.
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2006
    Just when I thought I was happy with budget and mid-range video cards.... This is just not the year for Folding-specific upgrades. Need to purchase another vehicle, fly to the East Coast, pay for my daughter's wedding next spring...save cash for possible repairs on my 150K mile Blazer, 190K mile Astro...

    No I just can't justify purchasing an X1950....
  • Krazeyivan Newcastle, UK
    edited October 2006
    From my log file... oh and Leonardo, I agree you cannot justify getting the X1950... now the DirectX 10 card, that's another matter!!

    [18:54:08] *------------------------------*
    [18:54:08] Folding@Home GPU Core - Beta
    [18:54:08] Version 0.06 (Tue Oct 3 07:59:02 PDT 2006)
    [18:54:08]
    [18:54:08] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86
    [18:54:08] Build host: CYGWIN_NT-5.1 vishal-gpu 1.5.19(0.150/4/2) 2006-01-20 13:28 i686 Cygwin
    [18:54:08] Preparing to commence simulation
    [18:54:08] - Assembly optimizations manually forced on.
    [18:54:08] - Not checking prior termination.
    [18:54:08] - Expanded 83063 -> 443705 (decompressed 534.1 percent)
    [18:54:08]
    [18:54:08] Project: 2723 (Run 0, Clone 305, Gen 0)
    [18:54:08]
    [18:54:08] Assembly optimizations on if available.
    [18:54:08] Entering M.D.
    [18:54:19] Completed 0
    [18:54:19] Starting GUI Server
    [19:01:33] Completed 1
    [19:08:47] Completed 2
    [19:16:01] Completed 3
    [19:23:14] Completed 4
    [19:30:28] Completed 5
    [19:37:42] Completed 6
    [19:44:56] Completed 7
    [19:52:10] Completed 8
    [19:59:24] Completed 9
    [20:06:37] Completed 10
    [20:13:52] Completed 11
    [20:21:07] Completed 12
    [20:28:25] Completed 13
    [20:35:42] Completed 14
    [20:43:03] Completed 15
    [20:50:23] Completed 16
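
    Those timestamps work out to about 7 minutes 15 seconds per frame. Assuming the core reports 100 frames per WU (an assumption on my part), a quick parse projects the whole unit at roughly 12 hours:

    # Frame time and projected WU duration from the log above,
    # assuming 100 "Completed" frames per work unit.
    from datetime import datetime

    fmt = "%H:%M:%S"
    first = datetime.strptime("18:54:19", fmt)   # Completed 0
    last = datetime.strptime("20:50:23", fmt)    # Completed 16
    per_frame = (last - first) / 16

    print(f"~{per_frame} per frame")             # ~0:07:15
    print(f"projected WU: {per_frame * 100}")    # ~12:05:25

    Points per day would follow directly from that once we know what the WU is worth, which Stanford hasn't published yet.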
  • the_technocrat IC-MotY1 Indy Icrontian
    edited October 2006
    wow, makes me think that we might redefine a 'budget box' as a barebones unit to support a fat GPU!
  • the_technocrat IC-MotY1 Indy Icrontian
    edited October 2006
    what kind of ppd are you pulling down with the GPU only?
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2006
    Krazeyivan wrote:
    From my log file... oh and Leonardo, I agree you cannot justify getting the X1950... now the DirectX 10 card, that's another matter!!
    The only reason I would consider upgrading any of my video cards would be for MS Vista and/or Folding at home. But man, that's an expensive proposition if you don't give a whit about gaming. My computers all have excellent clarity, color reproduction, and 2D performance.

    Krazeyivan, yes, please keep us updated. This is a major event for Folding@Home and for the future of GPUs.
  • lemonlime Canada Member
    edited October 2006
    Don't forget that there are much more affordable (and still very powerful) X1900 series cards. The X1900GT is supported by the F@H GPU GUI and is very reasonably priced. The X1900XT 256MB is also very affordable in comparison to the rather new X1950 series cards.
  • edited October 2006
    The thing about GPU folding is that it is terrifically fast with the stuff that can be done on the GPU itself, but from what I understand it still needs a CPU core to process the part of the WU that can't be done by the GPU. So that somewhat occupies the core that is needed for feeding the GPU's WU, and it also slows the actual WU processing quite a bit. So Stanford isn't actually seeing 20X more science being done per WU.

    And if you don't own an X1900-class vid card right now, don't go out and spend four big ones to get one right away. As has already been said, presently the points return isn't worth the investment if you are primarily folding for the points and not the science. But this is a rough beta client, and they do need the GPUs folding to iron out the bugs. Plus, I'm sure there will be some kind of adjustment in point values in the future, as well as Stanford eventually letting lesser ATI vid cards process work, such as the X1650 and X1800 series.

    And Leo, from all I've read on the next-gen high-end vid cards, the vid card will just be part of the cost. Both Nvidia and ATI next-gen vid cards look to be drawing some atrocious power - at least double to triple the CPU power draw. That will mean ridiculous heat levels to deal with, plus, if they don't come with an external PSU to drive them, they'll make you upgrade to a $400-500 PSU just to feed them.