258.96 -- no disabling sli?

TushonTushon I'm scared, Coach Alexandria, VA
edited August 2010 in Folding@Home
So I bought a 9800 GX2 off a certain folding pusher, you know, to boost my epeen. I got the card installed and everything was hunky-dory. I installed the latest drivers (for the lack of dummy plug), and as of now I have been able to complete 3 WUs through a combination of weird factors. I know one of the issues is that the ability to disable SLI is not present in the menu (going to test tomorrow if this is due to booting off the onboard ATI video chipset, but I saw several reports on nvidia's site saying other people had the same issue - like they simply removed the menu option).

I will update after I test booting from the card instead of the onboard chipset, but any other suggestions would be lovely.

Specs:
Athlon II x2 240 (2.8GHz)
Gigabyte GA-M1785GM-US2H
4GBs ram
9800GX2 (at stock 1500MHz for now)
Windows 7 x64 (legit enterprise key)

Comments

  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    Yes, you can disable SLI. In 258.96, in the Nvidia Control Panel, go to 3D Settings. Select "Disable multi-GPU mode."

    Tushon, have you set one client each (2 total) for each GPU in the GX2?
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    I had one set up for each (both a Program Files folder and a roaming data folder). Under 3D settings, I don't see an option called "Disable multi-GPU mode" as a "Feature" or a selection under any of the settings. I really am ready to be proven stupid on this (or find out that it is simply due to my booting off the ATI onboard video ... tomorrow).
  • Sledgehammer70Sledgehammer70 California
    edited August 2010
    Under "Manage 3D settings", make sure "CUDA - GPUs" is set to ALL, and under "Set SLI and PhysX configuration" select "Disable SLI".

    If you don't see these options, then uninstall the old driver and do a fresh install.

  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    Sledge, I think some of the Nvidia Control Panel options and views may be different for the GTX 480 than for non-Fermi cards. I have clean installations of 258.96 on all my Folding rigs (each a dual GTX 295 machine). None of their NV Control Panels display the "SLI Configuration" option shown in your screenshot. These are the options I have:
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    Under "3D Settings," there is no "Set SLI and PhysX Configuration" option.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    What Leo said. I do have CUDA set to all, but they "removed" the Set SLI screen for some reason.

    UPDATE: After disabling the onboard ATI graphics and booting from the card, the "setup SLI" screen returned. I really hope this is some strange quirk on Nvidia's part rather than purposeful manipulation. Working on getting the clients running now.
  • Sledgehammer70Sledgehammer70 California
    edited August 2010
    In your case it is Multi GPU mode.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    So the clients both ran fine at first. I made a desktop shortcut for each one (I copied both the Program Files folder and the roaming data folder and pointed each shortcut at the appropriate copy) and gave each one a -gpu 0 or -gpu 1 argument. It ran through the first WU fine and reported it, then failed on the second with error code 63 (99). If I then close the client and restart it, the client will not launch, citing either an unsupported GPU or out-of-date drivers (lolwut?). Advice?
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    1. In the GPU client configuration, did you set each client for a different "Machine ID"?

    2. Also, did you set the flag "-gpu 3 -forcegpu nvidia_g80" in the startup icon properties for each client?

    3. Where did you put the client folders?
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    1) I did set each one for a different client ID (not 100% sure about the last time but the previous several I know I did)

    2) I did not use that flag in the shortcut. I thought it was supposed to be -gpu 0, and -gpu 1 only. So should it be the flags you stated for both or should the "first" one be "-gpu 0 -forcegpu nvidia_g80" and "-gpu 1 -forcegpu nvidia_g80" for the second?

    3) The client folders were left in Program Files and the working folders were in roaming data
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    The first GPU client should be "-gpu 0 -forcegpu nvidia_g80".
    The second GPU client should be "-gpu 1 -forcegpu nvidia_g80".

    and so on...

    Also make sure to set different machine IDs in the GPU client configuration.
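The per-GPU flags above can be sketched in a few lines of Python. This is only an illustration: the helper names and the choice to number machine IDs from 1 are assumptions, not anything the client provides. Only the "-gpu N -forcegpu nvidia_g80" flags come from this thread.

```python
def client_flags(gpu_index, gpu_type="nvidia_g80"):
    """Return the shortcut flags for one GPU client, as quoted in this thread."""
    return f"-gpu {gpu_index} -forcegpu {gpu_type}"


def plan_clients(gpu_count):
    """Pair each GPU index with its flags and a distinct machine ID.

    Numbering machine IDs from 1 is an assumption made for illustration;
    the thread only requires that every client gets a different one.
    """
    return [
        {"gpu": i, "machine_id": i + 1, "flags": client_flags(i)}
        for i in range(gpu_count)
    ]


if __name__ == "__main__":
    for plan in plan_clients(2):  # a 9800 GX2 exposes two GPUs
        print(plan)
```

Each printed entry corresponds to one shortcut: its flags go on the shortcut's target line, and its machine ID goes in that client's configuration.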
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    I made sure to set different machine IDs than any I had used prior and made all of the appropriate changes to the target lines. Clients are running now, let's see if they make it past the first round of WUs.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    Clients made it through 2 rounds on one of the GPUs and 1 on the other, then errored out and went to sleep. Not functional after a reboot either.
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    Tushon wrote:
    Clients made it through 2 rounds on one of the GPUs and 1 on the other, then errored out and went to sleep. Not functional after a reboot either.

    One client errored out, or both?

    Are you monitoring GPU temperatures?
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    Both clients are running now. Temps are solid at 75-76. Fan at 86%. They are ~30% through the WU, so I'll check back on them in a couple hours.

    Update: both completed the first WU and picked up a second. Temps are still stable at 75-76, and those will go down once I cut a fan to fit in the side wall (the CPU heatsink blocks the top corner, a truly unnecessary corner). Monitoring for any further incidents. Now to see if I can replace a socket in one of the walls, because it (a kitchen circuit) keeps tripping the mini breaker and turning off my main computer. I may just buy an extension cord and run it from the living room.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    Final update:

    Disabling the ATI onboard was the answer to regaining the SLI config page. I think that's not very nice of Nvidia to program, but they have done other mean things in the past regarding interoperability with ATI cards.

    After a reimage to Windows Server 08 last night, the card is back up and running with temps stable at 73 (I have a fan pulling air away from the card out the side vent). It ran all night without problems to my knowledge. Now to figure out domain stuff and permissions, etc ... and make my 5770s stop crashing anytime I launch a game.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    Ugggghhhhhhhh. The card started failing last night then worked through one WU after a restart and failed again. Same core status as before. Max temp was 77, no OC.
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    Your core temps are just fine, not even close to overheating.

    It is a possibility that other components on the PCBs (cards on which the GPUs are affixed) are getting too hot. It is possible that some of the factory TIM (thermal interface material) has cracked and become displaced. Folding is very, very demanding on video cards.

    Do you have experience applying thermal paste to chips/processors?

    Also, you left the most important component out when you listed your specs - the power supply unit. What are you using?

    It would also help if you would copy and paste a passage from the FAHlog at the point where a work unit failed.

    Example (except this one does not log a failed unit):
    [04:28:23] Completed 96%
    [04:29:00] Completed 97%
    [04:29:37] Completed 98%
    [04:30:13] Completed 99%
    [04:30:51] Completed 100%
    [04:30:51] Successful run
    [04:30:51] DynamicWrapper: Finished Work Unit: sleep=10000
    [04:31:01] Reserved 75808 bytes for xtc file; Cosm status=0
    [04:3
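One way to pull the requested passage out of a long FAHlog is sketched below. It assumes a failed unit is marked by a line containing "Core Shutdown" (the wording the client prints in the failures reported in this thread). The function name and the amount of context kept are hypothetical choices.

```python
def failure_excerpt(log_lines, marker="Core Shutdown", before=5, after=2):
    """Return the lines around the first line containing `marker`.

    `marker` is an assumption based on the failure text quoted in this
    thread; adjust it if your log words the failure differently.
    """
    for i, line in enumerate(log_lines):
        if marker in line:
            start = max(0, i - before)
            return log_lines[start:i + after + 1]
    return []  # no failure marker found


if __name__ == "__main__":
    sample = [
        "[04:30:13] Completed 99%",
        "Folding@home Core Shutdown: UNSTABLE_MACHINE",
        "CoreStatus = 7A (122)",
    ]
    print("\n".join(failure_excerpt(sample)))
```

With a real log you would read the file into a list of lines first and paste the returned excerpt into your post.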
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    PSU = OCZ550FTY
    550W, modular. brand new

    I have put thermal paste on CPUs several times with no problems, but I have never taken apart a vid card, but it can't be "that hard". I may attempt that after the results of the following:

    On _k_'s advice, I tried going back to GPU2 and it will not start a single WU. Fails with

    Folding@home Core Shutdown: UNSTABLE_MACHINE
    CoreStatus = 7A (122)
    every time I start it. I'm using flags -gpu 0 -forcegpu nvidia_g80 and machine ID 2, with the appropriate settings for the other client.

    The GPU3 logs were already gone and I hadn't been able to get a new one working yet to get the sample you requested.
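The CoreStatus lines quoted in this thread, "7A (122)" and "63 (99)", show the same code twice: once in hexadecimal and once in decimal (0x7A = 122, 0x63 = 99). A minimal sketch for pulling that code out of a log line follows. The regular expression and function name are assumptions for illustration, and the sketch does not claim to know what each code means.

```python
import re

# Matches lines such as "CoreStatus = 7A (122)".
CORE_STATUS_RE = re.compile(r"CoreStatus\s*=\s*([0-9A-Fa-f]+)\s*\((\d+)\)")


def parse_core_status(line):
    """Return (hex_code, decimal_code, consistent) or None if no match.

    `consistent` is True when the hex value equals the decimal in parentheses.
    """
    match = CORE_STATUS_RE.search(line)
    if match is None:
        return None
    hex_code = match.group(1).upper()
    decimal_code = int(match.group(2))
    return hex_code, decimal_code, int(hex_code, 16) == decimal_code


if __name__ == "__main__":
    print(parse_core_status("CoreStatus = 7A (122)"))
    print(parse_core_status("CoreStatus = 63 (99)"))
```

This makes it easier to search for the same failure when one report quotes the hex form and another quotes the decimal form.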
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    Are both clients showing UNSTABLE MACHINE?

    If both clients are erroring out, it sounds like a software problem - drivers and/or client installations. If it were only 1 client out of 2, it could probably be attributed to a hardware problem.

    I would consider removing the drivers completely and removing the GPU clients completely and restarting from scratch.

    Tushon wrote:
    I have put thermal paste on CPUs several times with no problems, but I have never taken apart a vid card, but it can't be "that hard". I may attempt that after the results of the following:

    The 9800GX2 is a complex video card that is difficult to disassemble and service. But not to worry, at this point, I'm not ready to call this a hardware problem.
  • _k_k P-Town, Texas
    edited August 2010
    It shouldn't be a hardware failure, this went from my main rig to his server.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    I also feel like I should mention that I reimaged this box to test Windows Server 2008 R2, and I hope that isn't affecting my ability to fold.

    Edit: Checked, server shouldn't be related to folding problems.

    I uninstalled driver and folding, deleted all the pertinent folders, restarted, reinstalled 258.96 driver, restarted, installed GPU3 folding, copied folders, created appropriate shortcuts, used the config and additional options to add "-gpu N -forcegpu nvidia_g80" and set the machine IDs. Both are folding first WU, we'll see what happens in a while when they try to get round two. The odd thing is that it worked for a whole night the other day then stopped again. Sad day, it was.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    Second update: both of them are merrily clicking away on the second round of GPU3 WUs. I do wonder if they will survive a reboot. Any possibility that it would be related to just shutting down without pausing the clients?
  • _k_k P-Town, Texas
    edited August 2010
    Something simple to check, though painful if the clients don't restart, is to reboot the computer without shutting down the clients and then see whether they launch once everything comes back up. Hopefully they do. The important thing is whether they start back up from the last checkpoint or drop the WU they were working on. If they drop the WU and start a new one, that points to the clients not putting their work in the correct place or to an issue with their install.
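A rough way to compare progress before and after the reboot is sketched here. It assumes the log reports progress as "Completed N%" lines, as in the FAHlog excerpt quoted earlier in this thread. The function names are hypothetical, and the sketch only reports the two numbers rather than deciding whether the WU was dropped.

```python
import re

PROGRESS_RE = re.compile(r"Completed (\d+)%")


def completed_percents(log_lines):
    """Return every 'Completed N%' value found, in log order."""
    values = []
    for line in log_lines:
        match = PROGRESS_RE.search(line)
        if match:
            values.append(int(match.group(1)))
    return values


def progress_across_reboot(before_lines, after_lines):
    """Return (last percent before reboot, first percent after reboot).

    Either value is None when no progress line was logged.
    """
    before = completed_percents(before_lines)
    after = completed_percents(after_lines)
    return (before[-1] if before else None, after[0] if after else None)


if __name__ == "__main__":
    before = ["[04:28:23] Completed 96%", "[04:29:00] Completed 97%"]
    after = ["[05:00:00] Completed 98%"]
    print(progress_across_reboot(before, after))
```

If the first value after the reboot is far below the last value before it, the client probably started over instead of resuming from its checkpoint.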
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    So one of them stopped working and the other is still chugging along. Here is the log from the stopped one. I already tried restarting it (nothing but CoreStatus = 63 (99)), clearing the target folder (deleted work folder, queue.dat, myfolding.html, unitinfo.txt, the folding core ... basically, left client.cfg and .dlls), etc. I haven't restarted the whole computer yet, because I'd rather have one working than none :(

    Thoughts?

    Edit: and the second one is also down.
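The cleanup described above (remove everything except client.cfg and the .dll files) can be expressed as a small selection function. This is a sketch under that assumption only: it returns the names it would clear and deletes nothing, and the .dll file name in the example is made up.

```python
KEEP_NAMES = {"client.cfg"}   # configuration file kept in the thread's cleanup
KEEP_SUFFIXES = (".dll",)     # libraries kept in the thread's cleanup


def files_to_clear(names):
    """Return the entries that the cleanup described above would remove."""
    return [
        name
        for name in names
        if name.lower() not in KEEP_NAMES
        and not name.lower().endswith(KEEP_SUFFIXES)
    ]


if __name__ == "__main__":
    # "example.dll" is a hypothetical file name used only for illustration.
    folder = ["work", "queue.dat", "myfolding.html", "unitinfo.txt",
              "client.cfg", "example.dll"]
    print(files_to_clear(folder))
```

Reviewing the returned list before deleting anything by hand keeps the client configuration and machine ID intact.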
  • LeonardoLeonardo Wake up and smell the glaciers Eagle River, Alaska
    edited August 2010
    "FahCore_15"

    Your GPU(s) is failing to process a work unit with the GPU3 client. There are reports of others with 8800/9800 series cards having that problem.

    GPU3 is in open beta. You might prefer Folding with GPU2. I'm still using GPU2 and will not move to GPU3 until GPU2 is no longer available.

    Take a look at this page: http://foldingforum.org/viewtopic.php?f=59&t=14683&p=144648#p144648
  • _k_k P-Town, Texas
    edited August 2010
    ^Yup, this is kind of why I recommended that you roll back to GPU2 with the installs since you are having problems.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    I'm going to laugh if this was basically caused by me copying a subfolder instead of the "head" folder, though that still doesn't make sense.

    I took the advice from both of you: removal, Driver Sweeper, reinstall, then installed the console clients of GPU2. Both are working on their first WU now.
  • _k_k P-Town, Texas
    edited August 2010
    Tushon wrote:
    I'm going to laugh if this was basically caused by me copying a subfolder instead of the "head" folder, though that still doesn't make sense.

    Just to make sure there isn't any confusion with how to set up the folders.
  • TushonTushon I'm scared, Coach Alexandria, VA
    edited August 2010
    Yeah, I had separate folders before (within the folding@home folder), but I went and made sure to just copy the "top" level folders this time. No dice. After the first WU, both errored out with the 7A (122) status. I can upload the logs I have if that would be useful. If those logs don't have the end of the successful WU, I'll go through the pain of driver cleans again to get at least one WU complete.