SMP MISSING_WORK_FILES

_k_k P-Town, Texas Icrontian
edited January 2009 in Folding@Home
Once yesterday and again today while I was away, my SMP client stopped with MISSING_WORK_FILES errors, even though this happens in the middle of a run. So far I have just been deleting the work and restarting. I'm not sure what is going on, and I don't like the FF. Any guesses, or has anyone seen the same thing?

[13:51:57] Folding@Home Gromacs SMP Core
[13:51:57] Version 1.74 (March 10, 2007)
[13:51:57]
[13:51:57] Preparing to commence simulation
[13:51:57] - Ensuring status. Please wait.
[13:52:02] - Starting from initial work packet
[13:52:02]
[13:52:02] Project: 2665 (Run 2, Clone 948, Gen 82)
[13:52:02]
[13:52:02] Assembly optimizations on if available.
[13:52:02] Entering M.D.
[13:52:26] percent)
[13:52:27] - Starting from initial work packet
[13:52:27]
[13:52:27] Project: 2665 (Run 2, Clone 948, Gen 82)
[13:52:27]
[13:52:28] Entering M.D.
[13:52:34] Rejecting checkpoint
[13:52:36] Protein: HGG with glycosylations
[13:52:36] Writing local files
[13:52:42] Extra SSE boost OK.
[13:52:42] Writing local files
[13:52:43] Completed 0 out of 250000 steps (0 percent)
[14:05:38] Writing local files
[14:05:38] Completed 2500 out of 250000 steps (1 percent)
[14:18:31] Writing local files
[14:18:31] Completed 5000 out of 250000 steps (2 percent)
[14:31:38] Writing local files
[14:31:38] Completed 7500 out of 250000 steps (3 percent)
[14:44:38] Writing local files
[14:44:39] Completed 10000 out of 250000 steps (4 percent)
[14:57:35] Writing local files
[14:57:35] Completed 12500 out of 250000 steps (5 percent)
[15:10:32] Writing local files
[15:10:33] Completed 15000 out of 250000 steps (6 percent)
[15:23:34] Writing local files
[15:23:35] Completed 17500 out of 250000 steps (7 percent)
[15:36:29] Writing local files
[15:36:29] Completed 20000 out of 250000 steps (8 percent)
[15:49:26] Writing local files
[15:49:27] Completed 22500 out of 250000 steps (9 percent)
[16:02:21] Writing local files
[16:02:21] Completed 25000 out of 250000 steps (10 percent)
[16:15:17] Writing local files
[16:15:18] Completed 27500 out of 250000 steps (11 percent)
[16:28:14] Writing local files
[16:28:14] Completed 30000 out of 250000 steps (12 percent)
[16:41:11] Writing local files
[16:41:12] Completed 32500 out of 250000 steps (13 percent)
[16:54:10] Writing local files
[16:54:10] Completed 35000 out of 250000 steps (14 percent)
[17:07:09] Writing local files
[17:07:09] Completed 37500 out of 250000 steps (15 percent)
[17:08:50] Gromacs cannot continue further.
[17:08:50] Going to send back what have done.
[17:08:50] logfile size: 37505
[17:08:50] - Writing 38041 bytes of core data to disk...
[17:08:50] ... Done.
[17:10:50]
[17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
[17:10:50]
[17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
[17:10:54] CoreStatus = 7B (123)
[17:10:54] Client-core communications error: ERROR 0x7b
[17:10:54] This is a sign of more serious problems, shutting down.


--- Opening Log file [January 4 17:11:28 UTC]


# Windows SMP Console Edition #################################################
###############################################################################

Folding@Home Client Version 6.22 SMP Beta2

http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Program Files (x86)\Folding@Home Windows SMP Client V1.01
Executable: C:\Program Files (x86)\Folding@Home Windows SMP Client V1.01\Folding@home-Win32-x86.exe
Arguments: -smp

[17:11:28] - Ask before connecting: No
[17:11:28] - User name: _k_ (Team 93)
[17:11:28] - User ID: 5EEFEC2502C64B52
[17:11:28] - Machine ID: 1
[17:11:28]
[17:11:28] Loaded queue successfully.
[17:11:28]
[17:11:28] + Processing work unit
[17:11:28] Work type a1 not eligible for variable processors
[17:11:28] Core required: FahCore_a1.exe
[17:11:28] Core found.
[17:11:28] Using generic mpiexec calls
[17:11:28] Working on queue slot 02 [January 4 17:11:28 UTC]
[17:11:28] + Working ...
[17:11:28]
[17:11:28] *
*
[17:11:28] Folding@Home Gromacs SMP Core
[17:11:28] Version 1.74 (March 10, 2007)
[17:11:28]
[17:11:28] Preparing to commence simulation
[17:11:28] - Ensuring status. Please wait.
[17:11:45] - Looking at optimizations...
[17:11:45] - Working with standard loops on this execution.
[17:11:45] - Previous termination of core was improper.
[17:11:45] - Going to use standard loops.
[17:11:45] - Files status OK
[17:13:45]
[17:13:45] Folding@home Core Shutdown: MISSING_WORK_FILES
[17:13:45] Finalizing output
[17:13:48] CoreStatus = 1 (1)
[17:13:48] Client-core communications error: ERROR 0x1
[17:13:48] This is a sign of more serious problems, shutting down.

Folding@Home Client Shutdown at user request.

Folding@Home Client Shutdown.

Comments

  • mas0n howdy Icrontian
    edited January 2009
    I am willing to bet your OC is not 100% stable or you have bad RAM.

    http://fahwiki.net/index.php/CoreStatus_codes#7B

    Also,
    _k_ wrote:
    I don't like the FF
    What, why?
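A rough sketch for pulling the shutdown reasons and CoreStatus codes out of a log like the one above, so they can be looked up on the CoreStatus codes page. The function name `summarize_core_exits` and the regular expressions are my own and are only fitted to the lines shown in this thread; other client versions may format these lines differently.

```python
import re

# Patterns fitted to the log lines quoted in this thread (an assumption:
# other client or core versions may word these lines differently).
SHUTDOWN_RE = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Folding@home Core Shutdown: (\w+)")
STATUS_RE = re.compile(r"^\[(\d\d:\d\d:\d\d)\] CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_core_exits(log_text):
    """Return (time, kind, value) tuples for shutdown reasons and CoreStatus lines."""
    events = []
    for line in log_text.splitlines():
        line = line.strip()
        m = SHUTDOWN_RE.match(line)
        if m:
            events.append((m.group(1), "shutdown", m.group(2)))
            continue
        m = STATUS_RE.match(line)
        if m:
            # Keep the hex form, since that is how the wiki page indexes codes.
            events.append((m.group(1), "status", "0x" + m.group(2).upper()))
    return events


# A few lines copied from the log in the first post.
sample = """\
[17:08:50] Gromacs cannot continue further.
[17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
[17:10:54] CoreStatus = 7B (123)
[17:13:45] Folding@home Core Shutdown: MISSING_WORK_FILES
[17:13:48] CoreStatus = 1 (1)
"""

for event in summarize_core_exits(sample):
    print(*event)
```

On the sample, this lists the EARLY_UNIT_END at 17:10:50 followed by status 0x7B, then the MISSING_WORK_FILES at 17:13:45 followed by status 0x1, which makes it easier to see that the missing-files error only shows up on the restart after the first failure.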
  • _k_k P-Town, Texas Icrontian
    edited January 2009
    This is the same OC I have been running for months.
  • mas0n howdy Icrontian
    edited January 2009
    And?

    Run memtest.

    Since hardware may be at fault, it makes no sense to start troubleshooting elsewhere.
  • Snarkasm Madison, WI Icrontian
    edited January 2009
    Clearly the only solution is to buy a new computer. Get a nice shiny Dell, I hear those are reliable and good for folding.


    Disclaimer: Sarcasm, kids! Don't fold on - or for that matter, buy - Dells!
  • _k_k P-Town, Texas Icrontian
    edited January 2009
    Well, it ran OCCT for 6 hours and 40 minutes with no errors, and a non-bootable memtest doesn't produce errors either... this is dumb.

    Just reinstalled SMP.
  • mas0n howdy Icrontian
    edited January 2009
    1. Update to the 6.23 binary (not sure how I missed this).
    2. Make sure the directory you have SMP installed to is in your security software's exclusion list.
    3. Check Event Viewer for networking or disk errors.
    4. Disable Windows Automatic Updates.
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited January 2009
    disable Windows Automatic Updates
    Your advice or Pande Group's?

    There were two time periods when I had problems such as you detail, K. One was a networking problem and the other was hardware.

    1. Computer ran a stable overclock and folded SMP units near perfectly for months, then one day it started destroying SMP units. Cause: memory module gone bad. Removed bad module and SMP units immediately started processing properly again. Installed the RMA replacement and work units continued to complete normally.

    2. When running dual SMP clients on quad core machines, the clients frequently got the client-core communication error messages, amongst other SMP processing failures. Cause: normal home networking variances that Windows SMP can be sensitive to. Fix: assigned static IPs to all networked computers. Others have fixed the problem by installing Microsoft Loopback Adapter. It's a virtual adapter that's already contained in the Windows code; just needs to be installed.
  • mas0n howdy Icrontian
    edited January 2009
    Leonardo wrote:
    Your advice or Pande Group's?

    I've had SMP units give EUE at the same time as Windows Automatic Updates began installing updates on builds where I had forgotten to disable them. I've also seen this recommended on foldingforum.org, although I haven't seen it directly credited as a fix before (hence it being #4).
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited January 2009
    OK, thanks. More knowledge to tuck away and dispense when needed. I've always had auto updates either disabled or set to only notify me anyway, so I never had a problem with it.
  • _k_k P-Town, Texas Icrontian
    edited January 2009
    I increased vCore by two intervals and updated to the new 6.23, and I get the same issue. I am also getting a leftover FahCore_a1.exe when the window closes. I am getting really pissed off at how all of a sudden the SMP client decides it's not going to continue the way the two GPU2 clients do.
  • _k_k P-Town, Texas Icrontian
    edited January 2009
    OK, I updated to the 6.23 binary, and it's killing my proc more. The computer is back to stuttering when I type. It ran all night without the -smp argument and it was still alive, so I just switched it over to SMP mode and we will see.
  • mas0n howdy Icrontian
    edited January 2009
    If the issue is not resolved, I would drop the entire machine down to stock settings and test it out. If you continue to have issues at stock settings, I think it's time to take this to foldingforum.org and see what they have to say about it.

    Just a thought, have you re-run install.bat?
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited January 2009
    Just a thought, have you re-run install.bat?
    Yes. Next time the client bombs, stop all SMP processes, delete the core and re-run install.bat.
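The sequence above (stop the SMP processes, delete the core, re-run install.bat) could be written down as a dry-run checklist like the sketch below. It only builds and prints the steps and does not execute anything. The function name `recovery_plan` is made up for illustration, and it assumes that FahCore_a1.exe and install.bat both sit directly in the client's launch directory, which may not match your install.

```python
from pathlib import PureWindowsPath


def recovery_plan(install_dir,
                  core_name="FahCore_a1.exe",
                  client_name="Folding@home-Win32-x86.exe"):
    """Build (but do not run) the stop / delete core / reinstall steps.

    Assumption: the core and install.bat live directly in install_dir.
    """
    root = PureWindowsPath(install_dir)
    return [
        # Stop the client first so it cannot relaunch the core.
        ("stop process", client_name),
        # Then stop any leftover core processes.
        ("stop process", core_name),
        # Delete the core so the client fetches a fresh copy.
        ("delete file", str(root / core_name)),
        # Re-run the installer script that ships with the SMP client.
        ("run script", str(root / "install.bat")),
    ]


# Launch directory taken from the log in the first post.
launch_dir = r"C:\Program Files (x86)\Folding@Home Windows SMP Client V1.01"

for action, target in recovery_plan(launch_dir):
    print(f"{action}: {target}")
```

Keeping it as a printed plan rather than something that kills processes on its own is deliberate: it is safer to read through the targets first and then carry out each step by hand.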