Optimum folding configuration...?

RyanFodder Detroit, MI
edited June 2009 in Folding@Home
I have acquired my new system, and I now have some options for folding :)

I currently have an E6850 processor running an SMP client for ~1500 ppd.

I also have 2x GeForce 9800 GTX+ installed, currently in SLI mode, running one GPU2 client for ~6600 ppd.

I also have my old GeForce 8800 GTS sitting around, and I'm thinking about adding that as a third card. It did ~3500 ppd on my previous motherboard.

I do have one extra monitor sitting around, but no "dummy plugs" to set up with.

What sort of benefit would I get from running the two video cards separately? I'm imagining that it would be 2x 6000 ppd for both cards, but I'd like to make sure it's worth going through the trouble...
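
As a rough sanity check on that guess, here is a minimal Python sketch that simply adds up the ppd figures quoted in this post. The 6000 ppd per card is the guess above, not a measurement, and the constant and function names are made up for illustration.

```python
# Points-per-day figures quoted in the post.
SMP_PPD = 1500                # E6850 running the SMP client
SLI_SINGLE_CLIENT_PPD = 6600  # both 9800 GTX+ in SLI, one GPU2 client
GUESSED_PPD_PER_CARD = 6000   # assumption: the poster's guess per card
OLD_8800GTS_PPD = 3500        # measured on the previous motherboard


def total_ppd(*sources):
    """Sum the ppd contributions of independent clients."""
    return sum(sources)


# Current setup: SMP client plus one GPU2 client on the SLI pair.
current = total_ppd(SMP_PPD, SLI_SINGLE_CLIENT_PPD)

# Hoped-for setup: SLI disabled, one GPU2 client per card.
split = total_ppd(SMP_PPD, GUESSED_PPD_PER_CARD, GUESSED_PPD_PER_CARD)

# Same, with the old 8800 GTS added as a third card.
with_third = total_ppd(split, OLD_8800GTS_PPD)

print("current:", current)
print("two separate GPU clients:", split)
print("plus 8800 GTS:", with_third)
```

If the guess holds, splitting the cards would add roughly 5400 ppd over the current setup; the real figure depends on the work units assigned and on whether each card runs stably.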

Thanks!

Comments

  • FoldingAddict Montgomery, AL
    edited June 2009
    It is definitely worth setting up another instance of GPU folding to run on the second card. Just be aware that you have to disable SLI mode and extend your desktop to a second monitor. As for the 8800, I don't know about mixing cards and folding. I think it's fine, but you'll have to extend the desktop to that card as well.

    ~FA
  • RyanFodder Detroit, MI
    edited June 2009
    So, I've been fighting to get both GPUs folding... and pretty much failing.

    I get this log output back when trying to run both cores:


    [12:46:35] Project: 5765 (Run 5, Clone 111, Gen 61)
    [12:46:35]
    [12:46:35] Assembly optimizations on if available.
    [12:46:35] Entering M.D.
    [12:46:42] Working on Protein
    [12:46:43] Client config found, loading data.
    [12:46:43] Starting GUI Server
    [12:46:43] mdrun_gpu returned
    [12:46:43] NANs detected on GPU
    [12:46:43]
    [12:46:43] Folding@home Core Shutdown: UNSTABLE_MACHINE
    [12:46:46] CoreStatus = 7A (122)
    [12:46:46] Sending work to server
    [12:46:46] Project: 5765 (Run 5, Clone 111, Gen 61)
    [12:46:46] - Error: Could not get length of results file work/wuresults_09.dat
    [12:46:46] - Error: Could not read unit 09 file. Removing from queue.
    [12:46:46] - Preparing to get new work unit...
    [12:46:46] + Attempting to get work packet
    [12:46:46] - Connecting to assignment server
    [12:46:46] - Successful: assigned to (171.67.108.11).
    [12:46:46] + News From Folding@Home: Welcome to Folding@Home
    [12:46:46] Loaded queue successfully.
    [12:46:47] + Closed connections
    [12:46:52]
    [12:46:52] + Processing work unit
    [12:46:52] Core required: FahCore_11.exe
    [12:46:52] Core found.
    [12:46:52] Working on queue slot 00 [June 12 12:46:52 UTC]
    [12:46:52] + Working ...
    [12:46:52]
    [12:46:52] *------------------------------*
    [12:46:52] Folding@Home GPU Core - Beta
    [12:46:52] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
    [12:46:52]
    [12:46:52] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
    [12:46:52] Build host: amoeba
    [12:46:52] Board Type: Nvidia
    [12:46:52] Core :
    [12:46:52] Preparing to commence simulation
    [12:46:52] - Looking at optimizations...
    [12:46:52] - Created dyn
    [12:46:52] - Files status OK
    [12:46:52] - Expanded 46651 -> 252912 (decompressed 542.1 percent)
    [12:46:52] Called DecompressByteArray: compressed_data_size=46651 data_size=252912, decompressed_data_size=252912 diff=0
    [12:46:52] - Digital signature verified
    [12:46:52]
    [12:46:52] Project: 5765 (Run 5, Clone 111, Gen 61)
    [12:46:52]
    [12:46:52] Assembly optimizations on if available.
    [12:46:52] Entering M.D.
    [12:46:59] Working on Protein
    [12:47:00] Client config found, loading data.
    [12:47:00] Starting GUI Server
    [12:47:00] mdrun_gpu returned
    [12:47:00] NANs detected on GPU
    [12:47:00]
    [12:47:00] Folding@home Core Shutdown: UNSTABLE_MACHINE
    [12:47:03] CoreStatus = 7A (122)
    [12:47:03] Sending work to server
    [12:47:03] Project: 5765 (Run 5, Clone 111, Gen 61)
    [12:47:03] - Error: Could not get length of results file work/wuresults_00.dat
    [12:47:03] - Error: Could not read unit 00 file. Removing from queue.
    [12:47:03] EUE limit exceeded. Pausing 24 hours.


    I have the latest driver from NVIDIA (185.85), which is the only driver I have installed. The video cards are NOT in SLI mode, and I have my desktop extended to a second monitor. I installed the second client per the instructions listed on the FAH site.

    Any ideas?
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska
    edited June 2009
    There are versions of the 8800 GTS with different shader counts, one with the G80 GPU and the other with the G92. Running multiple Nvidia GPUs in Folding with different shader/stream processor counts can cause problems, but those problems usually show up as low production on one of the GPUs rather than "NANs detected." Turn off Folding on the GTX GPUs for a while and see what the GTS does.

    If it folds well with the other GPUs turned off, it might indicate a hardware problem, and in this case (just speculating, since I don't know your system) perhaps inadequate power to the PCI-e bus. What PSU are you running?
  • RyanFodder Detroit, MI
    edited June 2009
    I am only running the two 9800 GTX+ cards. I have a 750 watt PSU. Both cores are now giving me the EUE error. :/

    I am still dreaming of this 10,000 ppd system... so far, I only seem to be able to get about 6k, on a good day :(
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska
    edited June 2009
    RyanFodder wrote: "I am only running the two 9800 GTX+ cards. I have a 750 watt PSU. Both cores are now giving me the EUE error. :/"

    It might be an overheating problem. Are these EUEs on different projects? Also, EUE usually means the client has processed the project at least partially. Is that correct in this case?
  • RyanFodder Detroit, MI
    edited June 2009
    The EUE error means (according to what I read) that it has seen the error 5 times, and is pausing for 24 hours.

    The error I get on the other core is:
    [12:20:39] Project: 5772 (Run 12, Clone 144, Gen 256)
    [12:20:39]
    [12:20:39] Assembly optimizations on if available.
    [12:20:39] Entering M.D.
    [12:20:46] Working on Protein
    [12:20:46] Client config found, loading data.
    [12:20:46] mdrun_gpu returned
    [12:20:46] SHAKE violations on GPU
    [12:20:46]
    [12:20:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
    [12:20:50] CoreStatus = 7A (122)
    [12:20:50] Sending work to server
    [12:20:50] Project: 5772 (Run 12, Clone 144, Gen 256)
    [12:20:50] - Read packet limit of 540015616... Set to 524286976.
    [12:20:50] - Error: Could not get length of results file work/wuresults_02.dat
    [12:20:50] - Error: Could not read unit 02 file. Removing from queue.
    [12:20:50] EUE limit exceeded. Pausing 24 hours.
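
A minimal Python sketch for tallying the failure signatures that appear in the logs quoted in this thread. The signature strings are copied from those logs; the function name `summarize` and the `SAMPLE` excerpt are illustrative assumptions, and this is not part of the Folding@Home client.

```python
import re
from collections import Counter

# Failure signatures copied from the logs quoted in this thread.
SIGNATURES = (
    "NANs detected on GPU",
    "SHAKE violations on GPU",
    "Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "EUE limit exceeded",
)

# A log line looks like "[HH:MM:SS] message"; the message may be empty.
LINE_RE = re.compile(r"^\s*\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")


def summarize(log_text):
    """Count how often each known failure signature occurs in a log."""
    counts = Counter()
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip lines without a timestamp prefix
        message = match.group(2)
        for signature in SIGNATURES:
            if signature in message:
                counts[signature] += 1
    return counts


# Excerpt of the log above, used here only as sample input.
SAMPLE = """\
[12:20:46] mdrun_gpu returned
[12:20:46] SHAKE violations on GPU
[12:20:46]
[12:20:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[12:20:50] CoreStatus = 7A (122)
[12:20:50] EUE limit exceeded. Pausing 24 hours.
"""

summary = summarize(SAMPLE)
for signature, count in summary.items():
    print(f"{count} x {signature}")
```

Running the same function over each client's full log would show whether one card is producing all of the UNSTABLE_MACHINE shutdowns or whether both are.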
  • _k P-Town, Texas
    edited June 2009
    Kick the fans up to 100% and let it run, since I think you said both cards are at stock clocks.
  • RyanFodder Detroit, MI
    edited June 2009
    The temps I'm seeing are 55 on one core and 62 on the other (when both are running, which is rare :/). But I will do that. I also have a side fan that blows directly on both GPUs. Should I shut that off and turn up the GPU fans instead (in case it is a literal shaking problem)?
  • RyanFodder Detroit, MI
    edited June 2009
    I get the following log output (see attachment) when I run both cores with the fans set at 100% (stock clock settings).

    The other core is working, and is getting about 5500 ppd on its current run.

    Temps are 55 C and 38 C (working and not working, respectively).

    This folding business is getting a bit frustrating :D

    Edit: I'm running Vista x64 as well.
  • RyanFodder Detroit, MI
    edited June 2009
    Sorry for all the posts, but I figured I'd share what I found. Turns out you have to set each of the cores to run in XP SP2 compatibility mode, and run them as admin, to solve the issues I was seeing.

    I am currently getting ~4000/~2500 ppd out of my cards, running at 56 C/38 C respectively. Why would I see such a gap in performance and temperature if both cores are running WUs of approximately the same point value/size? Why the huge difference in temperature? Both are seeing roughly the same airflow...
  • _k P-Town, Texas
    edited June 2009
    You can edit your posts, FYI; just mark what you edit. I always see a big temp delta between the cards in my SLI rig. As for the ppd difference, it could be a lot of things: processes you have running, apps that are launched. Sometimes rigs and clients just need to be restarted. Do like you would on a lonely Saturday night: play with it.
  • RyanFodder Detroit, MI
    edited June 2009
    After a long trial and many forum searches, I found that I needed to underclock (boo!) my video cards a bit. They are currently set at 700/1741 and 700/1400 (stock is 738/18xx) for the core and shader clocks. Both cards now do 4-5k each.

    I am getting 10-12k per day now :) Thanks for all the help everyone!
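
A quick arithmetic check of those numbers, as a hedged Python sketch. It uses only figures quoted in this thread and assumes the SMP client still produces the ~1500 ppd mentioned in the first post, which this post does not confirm. Only the core clock is compared, because the stock shader clock is given as "18xx".

```python
# Clock figures quoted in the post (MHz). Stock shader clock is "18xx", so it is left out.
STOCK_CORE_MHZ = 738
UNDERCLOCKED_CORE_MHZ = 700

core_reduction_pct = (STOCK_CORE_MHZ - UNDERCLOCKED_CORE_MHZ) / STOCK_CORE_MHZ * 100
print(f"core clock reduced by {core_reduction_pct:.1f}%")

# ppd figures: 4-5k per GPU (this post), ~1500 for SMP (first post, assumed unchanged).
SMP_PPD = 1500
GPU_PPD_RANGE = (4000, 5000)

low_total = SMP_PPD + 2 * GPU_PPD_RANGE[0]
high_total = SMP_PPD + 2 * GPU_PPD_RANGE[1]
print(f"estimated total: {low_total}-{high_total} ppd")
```

Under that assumption the estimate comes to 9500-11500 ppd, which is in line with the reported 10-12k, for a core underclock of about 5%.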
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska
    edited June 2009
    RyanFodder wrote: "After a long trial and many forum searches, I found that I needed to underclock (boo!) my video cards a bit."

    It happens, and it's nothing you've done wrong, especially considering those modest core temperatures. I'm running several video cards in GPU2 Folding, all with GPUs of the same specification. On some I can overclock the shaders by close to 20%, but others I must underclock. :eek3:

    Thanks for your persistence in getting things running and reporting your solution.