SMP MISSING_WORK_FILES
_k
P-Town, Texas Icrontian
Once yesterday and again today while I was away, my SMP client stopped with MISSING_WORK_FILES errors, even though this happens in the middle of a run. I have just been deleting the work and restarting. I'm not sure what is causing it, and I don't like the FF. Any guesses, or has anyone seen the same thing?
[13:51:57] Folding@Home Gromacs SMP Core
[13:51:57] Version 1.74 (March 10, 2007)
[13:51:57]
[13:51:57] Preparing to commence simulation
[13:51:57] - Ensuring status. Please wait.
[13:52:02] - Starting from initial work packet
[13:52:02]
[13:52:02] Project: 2665 (Run 2, Clone 948, Gen 82)
[13:52:02]
[13:52:02] Assembly optimizations on if available.
[13:52:02] Entering M.D.
[13:52:26] percent)
[13:52:27] - Starting from initial work packet
[13:52:27]
[13:52:27] Project: 2665 (Run 2, Clone 948, Gen 82)
[13:52:27]
[13:52:28] Entering M.D.
[13:52:34] Rejecting checkpoint
[13:52:36] Protein: HGG with glycosylations
[13:52:36] Writing local files
[13:52:42] Extra SSE boost OK.
[13:52:42] Writing local files
[13:52:43] Completed 0 out of 250000 steps (0 percent)
[14:05:38] Writing local files
[14:05:38] Completed 2500 out of 250000 steps (1 percent)
[14:18:31] Writing local files
[14:18:31] Completed 5000 out of 250000 steps (2 percent)
[14:31:38] Writing local files
[14:31:38] Completed 7500 out of 250000 steps (3 percent)
[14:44:38] Writing local files
[14:44:39] Completed 10000 out of 250000 steps (4 percent)
[14:57:35] Writing local files
[14:57:35] Completed 12500 out of 250000 steps (5 percent)
[15:10:32] Writing local files
[15:10:33] Completed 15000 out of 250000 steps (6 percent)
[15:23:34] Writing local files
[15:23:35] Completed 17500 out of 250000 steps (7 percent)
[15:36:29] Writing local files
[15:36:29] Completed 20000 out of 250000 steps (8 percent)
[15:49:26] Writing local files
[15:49:27] Completed 22500 out of 250000 steps (9 percent)
[16:02:21] Writing local files
[16:02:21] Completed 25000 out of 250000 steps (10 percent)
[16:15:17] Writing local files
[16:15:18] Completed 27500 out of 250000 steps (11 percent)
[16:28:14] Writing local files
[16:28:14] Completed 30000 out of 250000 steps (12 percent)
[16:41:11] Writing local files
[16:41:12] Completed 32500 out of 250000 steps (13 percent)
[16:54:10] Writing local files
[16:54:10] Completed 35000 out of 250000 steps (14 percent)
[17:07:09] Writing local files
[17:07:09] Completed 37500 out of 250000 steps (15 percent)
[17:08:50] Gromacs cannot continue further.
[17:08:50] Going to send back what have done.
[17:08:50] logfile size: 37505
[17:08:50] - Writing 38041 bytes of core data to disk...
[17:08:50] ... Done.
[17:10:50]
[17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
[17:10:50]
[17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
[17:10:54] CoreStatus = 7B (123)
[17:10:54] Client-core communications error: ERROR 0x7b
[17:10:54] This is a sign of more serious problems, shutting down.
--- Opening Log file [January 4 17:11:28 UTC]
# Windows SMP Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.22 SMP Beta2
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: C:\Program Files (x86)\Folding@Home Windows SMP Client V1.01
Executable: C:\Program Files (x86)\Folding@Home Windows SMP Client V1.01\Folding@home-Win32-x86.exe
Arguments: -smp
[17:11:28] - Ask before connecting: No
[17:11:28] - User name: _k_ (Team 93)
[17:11:28] - User ID: 5EEFEC2502C64B52
[17:11:28] - Machine ID: 1
[17:11:28]
[17:11:28] Loaded queue successfully.
[17:11:28]
[17:11:28] + Processing work unit
[17:11:28] Work type a1 not eligible for variable processors
[17:11:28] Core required: FahCore_a1.exe
[17:11:28] Core found.
[17:11:28] Using generic mpiexec calls
[17:11:28] Working on queue slot 02 [January 4 17:11:28 UTC]
[17:11:28] + Working ...
[17:11:28]
[17:11:28] *
*
[17:11:28] Folding@Home Gromacs SMP Core
[17:11:28] Version 1.74 (March 10, 2007)
[17:11:28]
[17:11:28] Preparing to commence simulation
[17:11:28] - Ensuring status. Please wait.
[17:11:45] - Looking at optimizations...
[17:11:45] - Working with standard loops on this execution.
[17:11:45] - Previous termination of core was improper.
[17:11:45] - Going to use standard loops.
[17:11:45] - Files status OK
[17:13:45]
[17:13:45] Folding@home Core Shutdown: MISSING_WORK_FILES
[17:13:45] Finalizing output
[17:13:48] CoreStatus = 1 (1)
[17:13:48] Client-core communications error: ERROR 0x1
[17:13:48] This is a sign of more serious problems, shutting down.
Folding@Home Client Shutdown at user request.
Folding@Home Client Shutdown.
Comments
http://fahwiki.net/index.php/CoreStatus_codes#7B
Also, What, why?
Run memtest.
Since hardware may be at fault, it makes no sense to start troubleshooting anywhere else.
Disclaimer: Sarcasm, kids! Don't fold on - or for that matter, buy - Dells!
Just reinstalled SMP.
2. Make sure the directory you have SMP installed to is in your security software's exclusion list.
3. Check Event Viewer for networking or disk errors.
4. Disable Windows Automatic Updates.
There were two time periods when I had problems such as you detail, K. One was a networking problem and the other was hardware.
1. Computer ran a stable overclock and folded SMP units near perfectly for months, then one day it started destroying SMP units. Cause: memory module gone bad. Removed bad module and SMP units immediately started processing properly again. Installed the RMA replacement and work units continued to complete normally.
2. When running dual SMP clients on quad core machines, the clients frequently got the client-core communication error messages, among other SMP processing failures. Cause: normal home networking variances that Windows SMP can be sensitive to. Fix: assigned static IPs to all networked computers. Others have fixed the problem by installing Microsoft Loopback Adapter. It's a virtual adapter that is already included with Windows; it just needs to be installed.
I've had SMP units give EUE at the same time as Windows Automatic Updates began installing updates on builds where I had forgotten to disable them. I've also seen this recommended on foldingforum.org, although I haven't seen it directly credited as a fix before (hence it being #4).
Just a thought, have you re-run install.bat?
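If you want to check whether the failures line up with anything else (updates, reboots, a flaky module), it may help to pull just the shutdown reasons and CoreStatus codes out of the client log. Below is a rough Python sketch, not anything that ships with the client. The function name `scan_log` and the two regular expressions are my own assumptions, based only on the line formats visible in the log in the first post; other client versions may word these lines differently.

```python
import re

# Patterns assumed from the log lines in the first post, e.g.
#   [17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
#   [17:10:54] CoreStatus = 7B (123)
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def scan_log(text):
    """Return (shutdown_reasons, core_statuses) found in a client log.

    scan_log is a hypothetical helper, not part of the F@H client.
    core_statuses is a list of (hex_string, decimal_int) tuples.
    """
    reasons = []
    statuses = []
    for line in text.splitlines():
        match = SHUTDOWN_RE.search(line)
        if match:
            reasons.append(match.group(1))
        match = STATUS_RE.search(line)
        if match:
            statuses.append((match.group(1).upper(), int(match.group(2))))
    return reasons, statuses


# Four lines copied from the log above, used as sample input.
sample = """\
[17:10:50] Folding@home Core Shutdown: EARLY_UNIT_END
[17:10:54] CoreStatus = 7B (123)
[17:13:45] Folding@home Core Shutdown: MISSING_WORK_FILES
[17:13:48] CoreStatus = 1 (1)
"""

reasons, statuses = scan_log(sample)
print(reasons)
print(statuses)
```

To run it against a real log, read the file into a string and pass that to `scan_log` instead of `sample`. The codes themselves still have to be looked up on the CoreStatus codes page linked above; the sketch only collects them.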