Unstable WUs?

Medlock Miramar, Florida Member
edited September 2004 in Folding@Home
I noticed earlier today that my CPU was only running at ~50% (HT P4), so I thought it had finished a WU. I looked a little closer and it had only completed 60 frames, so I looked at the log, and...

[21:33:22] Completed 60000 out of 100000 steps (60)
[21:36:24] Timered checkpoint triggered.
.....
[21:54:37] Timered checkpoint triggered.
[21:56:44] Quit 101 - Fatal error:
[21:56:44] Step 60769, time 60.769 (ps) LINCS WARNING
[21:56:44] relative constraint deviation after LINCS:
[21:56:44] max 0.004592 (between atoms 45941 and 45943) rms 0.000040
[21:56:44]
[21:56:44] Simulation instability has been encountered. The run has entered a
[21:56:44] state from which no further progress can be made.
[21:56:44] If you often see other project units terminating early like this
[21:56:44] too, you may wish to check the stability of your computer (issues
[21:56:44] such as high temperature, overclocking, etc.).
[21:56:44] Going to send back what have done.
[21:56:44] logfile size: 68498
[21:56:44] - Writing 69182 bytes of core data to disk...
[21:56:44] ... Done.
[21:56:45]
[21:56:45] Folding@home Core Shutdown: EARLY_UNIT_END
[21:56:49] CoreStatus = 72 (114)
[21:56:49] Sending work to server

I'm not overclocking or anything. My RAM timings are only slightly lower than SPD: 2-3-3-6, and stock is 2.5-3-3-7. It was a large WU, so I'm thinking maybe it was my RAM. But the only time a WU has ever crashed on me was when I OC'd too far. I'm not now, so now I think it's the WU. BTW the WU is p130_1RYP_AAAA_UM, one of the large Gromacs ones.

EDIT: That same client has received another one. It goes for 139 points. If it crashes I'll post again...
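
For anyone who wants to check their own logs for the same pattern, here is a minimal sketch that scans client log text for the last completed step count, the step named in a LINCS warning, and the EARLY_UNIT_END shutdown line. The function name `summarize_log` and the regular expressions are assumptions based only on the log lines pasted above; other cores or client versions may format these lines differently.

```python
import re

# Patterns are guessed from the log excerpt in this thread, not from any
# official description of the client's log format.
STEP_RE = re.compile(r"Completed (\d+) out of (\d+) steps")
LINCS_RE = re.compile(r"Step (\d+), time [\d.]+ \(ps\)\s+LINCS WARNING")


def summarize_log(text):
    """Summarize one work unit's log text (hypothetical helper)."""
    summary = {
        "last_completed": None,  # last "Completed N out of M" step count
        "total_steps": None,     # M from the same line
        "failed_step": None,     # step reported with the LINCS warning
        "early_end": False,      # True if EARLY_UNIT_END appears
    }
    for line in text.splitlines():
        match = STEP_RE.search(line)
        if match:
            summary["last_completed"] = int(match.group(1))
            summary["total_steps"] = int(match.group(2))
        match = LINCS_RE.search(line)
        if match:
            summary["failed_step"] = int(match.group(1))
        if "EARLY_UNIT_END" in line:
            summary["early_end"] = True
    return summary


sample = """[21:33:22] Completed 60000 out of 100000 steps (60)
[21:56:44] Step 60769, time 60.769 (ps) LINCS WARNING
[21:56:45] Folding@home Core Shutdown: EARLY_UNIT_END"""

print(summarize_log(sample))
```

Run against a whole log file, this only reports the most recent matches, so a log holding several WUs would need to be split per unit first.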

Comments

  • edited September 2004
    Taken from http://folding.stanford.edu/news.html :

    Folding@Home News/weblog

    8/20/2004 New projects: P130x
    We have some new exciting projects just being released. They are unlike any project we've done before and "break" some of the normal FAH rules:

    They're a lot bigger than the normal FAH WUs in terms of the RAM they take (hundreds of MB) and the net transfer (~5MB). To keep them only in the hands of those who have the resources for them, they require the "big WU" switch in the v5 client, enough RAM, and enough net bandwidth.

    Since they take more resources, there are bonus points associated with them (right now, a 50% bonus over the standard benchmark value). This value may increase or change if needed.


    One other way that these WUs are different is that they are more likely to EARLY_END. Don't be surprised if this happens and don't worry: they are still scientifically important and clients get partial credit for the fraction completed.
    This is new ground for FAH, which is scientifically very exciting. I hope to have some important new results to report in January, once these WUs have run for a few months.
  • Medlock Miramar, Florida Member
    edited September 2004
    Thanks KF! It was the only one that crashed on me so far, so I was a lil worried. :)
  • edited September 2004
    Just keep an eye on it for now. If you don't get any more crashed WUs on projects other than 130, I wouldn't worry about it. Just chalk it up to the new project. :)

    KF
  • edited September 2004
    I would say that I've seen about 10% EARLY_UNIT_END messages with the p1301 WUs myself. The p1302s aren't quite as bad, though. Also, the p1301 and p1302 are really worth 182 points, even though EMIII says only 139 for the p1301s. That's because when the p1301 was released to -advmethods, Stanford was only giving them a 30% bonus in points, but a lot of folks complained and said they weren't going to run them, so Stanford bumped it up to a 50% bonus on all large WUs.
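
As a rough illustration of the "partial credit for the fraction completed" mentioned in the news post, here is a small sketch. It assumes credit scales linearly with the fraction of steps completed, which the news post does not actually spell out, and the helper names `fraction_completed` and `partial_credit` are made up for this example. The step counts come from the log in the first post and the 139-point value from the EDIT there.

```python
def fraction_completed(steps_done, total_steps):
    """Fraction of the work unit finished before it ended early."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return steps_done / total_steps


def partial_credit(full_points, steps_done, total_steps):
    """Estimated credit, ASSUMING a simple linear scale (not confirmed)."""
    return full_points * fraction_completed(steps_done, total_steps)


# The WU in the first post stopped at step 60769 of 100000.
estimate = partial_credit(139, 60769, 100000)
print(f"Estimated partial credit: {estimate:.1f} points")
```

The real figure depends on how the stats server computes it, so treat the result as a ballpark only.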