FAH Early Unit End
nonstop301
51° 27' 24.87" N // 0° 11' 38.91" W Member
I'm running into some trouble when I receive assignments that involve p1499 or p3039.
With p1499 I get a repeating cycle of the following error:
[06:57:09] Loaded queue successfully.
[06:57:09] + Benchmarking ...
[06:57:11]
[06:57:11] + Processing work unit
[06:57:11] Core required: FahCore_78.exe
[06:57:11] Core found.
[06:57:11] Working on Unit 02 [January 19 06:57:11]
[06:57:11] + Working ...
[06:57:11]
[06:57:11] *
*
[06:57:11] Folding@Home Gromacs Core
[06:57:11] Version 1.90 (March 8, 2006)
[06:57:11]
[06:57:11] Preparing to commence simulation
[06:57:11] - Looking at optimizations...
[06:57:11] - Files status OK
[06:57:13] - Expanded 857201 -> 12361133 (decompressed 1442.0 percent)
[06:57:13]
[06:57:13] Project: 1499 (Run 598, Clone 0, Gen 8)
[06:57:13]
[06:57:14] Assembly optimizations on if available.
[06:57:14] Entering M.D.
[06:57:21] Protein: p1499_tet_1499
[06:57:21]
[06:57:21] Writing local files
[06:57:21] Gromacs error.
[06:57:21]
[06:57:21] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:57:25] CoreStatus = 79 (121)
[06:57:25] Client-core communications error: ERROR 0x79
[06:57:25] Deleting current work unit & continuing...
[06:57:45] - Preparing to get new work unit...
[06:57:45] + Attempting to get work packet
[06:57:45] - Connecting to assignment server
[06:57:46] - Successful: assigned to (171.64.122.134).
[06:57:46] + News From Folding@Home: Welcome to Folding@Home
[06:57:46] Loaded queue successfully.
[06:58:01] + Closed connections
[06:58:06]
[06:58:06] + Processing work unit
[06:58:06] Core required: FahCore_78.exe
[06:58:06] Core found.
[06:58:06] Working on Unit 03 [January 19 06:58:06]
[06:58:06] + Working ...
[06:58:06]
[06:58:06] *
*
[06:58:06] Folding@Home Gromacs Core
[06:58:06] Version 1.90 (March 8, 2006)
[06:58:06]
[06:58:06] Preparing to commence simulation
[06:58:06] - Looking at optimizations...
[06:58:06] - Created dyn
[06:58:06] - Files status OK
[06:58:09] - Expanded 857201 -> 12361133 (decompressed 1442.0 percent)
[06:58:09] - Starting from initial work packet
[06:58:09]
[06:58:09] Project: 1499 (Run 598, Clone 0, Gen 8)
[06:58:09]
[06:58:09] Assembly optimizations on if available.
[06:58:09] Entering M.D.
[06:58:16] Protein: p1499_tet_1499
[06:58:16]
[06:58:16] Writing local files
[06:58:16] Gromacs error.
[06:58:16]
[06:58:16] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:58:20] CoreStatus = 79 (121)
[06:58:20] Client-core communications error: ERROR 0x79
[06:58:20] Deleting current work unit & continuing...
And with p3039 I received the following 20% of the way into the Work Unit:
[17:25:00] Completed 1000000 out of 5000000 steps (20)
[17:37:48] Quit 101 - Fatal error: NaN detected: (ener[13])
[17:37:48]
[17:37:48] Simulation instability has been encountered. The run has entered a
[17:37:48] state from which no further progress can be made.
[17:37:48] This may be the correct result of the simulation, however if you
[17:37:48] often see other project units terminating early like this
[17:37:48] too, you may wish to check the stability of your computer (issues
[17:37:48] such as high temperature, overclocking, etc.).
[17:37:48] Going to send back what have done.
[17:37:48] logfile size: 32292
[17:37:48] - Writing 32855 bytes of core data to disk...
[17:37:48] ... Done.
[17:37:48]
[17:37:48] Folding@home Core Shutdown: EARLY_UNIT_END
[17:37:53] CoreStatus = 72 (114)
[17:37:53] Sending work to server
I have no idea why these errors occur, but I did complete a p2125 assignment in between, and I have now received another p2125.
I have overclocked the processor, but my first impression was that this could only have a positive effect when it comes to Folding At Home.
I have also tested the RAM with MemTest, and there were no errors after allowing it to run for over 3 hours.
Of the 9 Folding assignments I have completed thus far, only the two that involved p1499 and p3039 produced these errors. I'm thinking it's a case of further configuring the FAH console so that such errors are avoided, but I'm not sure what modifications are necessary, if any.
If you have any suggestions I would be most grateful.
Many thanks in advance for your help.
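For anyone comparing logs like the two above, here is a minimal Python sketch that pulls the shutdown name and CoreStatus code out of pasted log text. It assumes the line formats shown in this post; the function name `summarize` is made up for illustration. The first CoreStatus number appears to be hexadecimal and the one in parentheses its decimal value, which matches the `ERROR 0x79` line in the log.

```python
import re

# Patterns based only on the log lines quoted in this thread.
SHUTDOWN = re.compile(r"Folding@home Core Shutdown: (\w+)")
CORE_STATUS = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize(log_text):
    """Return (shutdown_name, hex_code, decimal_code) tuples in log order."""
    results = []
    pending = None  # most recent shutdown name not yet paired with a status
    for line in log_text.splitlines():
        shutdown = SHUTDOWN.search(line)
        if shutdown:
            pending = shutdown.group(1)
            continue
        status = CORE_STATUS.search(line)
        if status:
            results.append((pending, status.group(1), int(status.group(2))))
            pending = None
    return results


sample = """\
[06:57:21] Folding@home Core Shutdown: UNKNOWN_ERROR
[06:57:25] CoreStatus = 79 (121)
[17:37:48] Folding@home Core Shutdown: EARLY_UNIT_END
[17:37:53] CoreStatus = 72 (114)
"""

for name, hex_code, dec_code in summarize(sample):
    # int(hex_code, 16) converts the first number from hex for comparison
    print(f"{name}: 0x{hex_code} = {int(hex_code, 16)} (log says {dec_code})")
```

Run against the excerpts above, this separates the two failure types: the p1499 units end with UNKNOWN_ERROR (0x79) and are deleted, while the p3039 unit ends with EARLY_UNIT_END (0x72) and its partial result is sent back.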
Comments
To (try and) add to it: when one of my machines encounters multiple errors, I generally delete the core. It is located in the folder you are running FAH from and has a name of the form FahCore_##.exe. Shut the program down, delete the core(s), and when you restart, the client will automatically download a new one.
Do you think it's a matter of relaxing the memory timings alone, or also reducing the FSB frequency?
The current overclock was tested with Prime before I began Folding At Home, and it didn't produce any errors over a 10-hour period. However, I leave the Folding console running 24/7, so that might be pushing it to the limits when I receive the p1499s or p3039s.
I'll change the FAH Core as well, although I did try that when the p1499 error occurred, and I then got a subsequent error when it started on the p3039.
Heat isn't an issue for me either; the CPU and motherboard temperatures are well within the normal range.
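The delete-the-core step can be sketched in Python as below. This is only an illustration: the folder path is whatever directory you run FAH from, the function names are hypothetical, and the client should be shut down first. It defaults to a dry run so nothing is removed until you ask for it.

```python
from pathlib import Path


def find_cores(fah_dir):
    """List core executables named like FahCore_##.exe in the FAH folder."""
    return sorted(p for p in Path(fah_dir).glob("FahCore_*.exe") if p.is_file())


def delete_cores(fah_dir, dry_run=True):
    """Delete the core files (or only report them when dry_run is True).

    Returns the names of the files that were, or would be, removed.
    """
    names = []
    for core in find_cores(fah_dir):
        if not dry_run:
            core.unlink()
        names.append(core.name)
    return names
```

Calling `delete_cores(folder)` first shows what would be removed; `delete_cores(folder, dry_run=False)` actually deletes the files, after which the client should fetch a fresh core on restart, as described above.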
I will reduce the FSB frequency and loosen the memory timings slightly and see what effect it will have on any future more demanding Work Units I receive. I did notice that both the p1499 and p3039 required more RAM to carry out their respective tasks. Both the CPU and the RAM are overclocked on this computer at the moment.
This seems to be the case with me at the moment, Leonardo.
I will leave the current p2124 task to complete and then I'll lower the FSB frequency by a few MHz and hopefully the FAH Core will not run into any problems with respect to the overclock.
Thanks again for all the valuable information you provide :thumbup