Simulation instability with F@H

Trogan London, UK
edited October 2005 in Folding@Home
This has been happening for a while now: a WU (mainly 600-pointers) completes halfway, then stops, and I get a new WU. I thought it would correct itself somehow but it hasn't :(

I'm attaching my FAHlog so you guys can take a look.

I'm not overclocking (don't know how to :p) and my comp is not overheating (well, no signs that it is) :)

What should I do? Thanks

Comments

  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    Try stopping the client, then deleting all the cores you find in your FAH folder. They are the files like FahCore_xx.exe (the "xx" part will be a number). Restart the client and it will automatically download new cores as needed.

    I've had this problem a time or two. Sometimes I managed to narrow it down to a specific problem, or fixed it with the method I detailed above. Other times the problem went away as mysteriously as it appeared, for no apparent reason. :confused:
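A minimal sketch of the core cleanup prof describes, for anyone who would rather script it. The filename pattern is an assumption based only on the names mentioned in this thread (FahCore_xx.exe, FahCore_7a), and `delete_cores` is a hypothetical helper, not part of the F@H client. Stop the client before running it.

```python
import re
from pathlib import Path

# Assumed pattern: "FahCore_" plus letters/digits, with an optional ".exe"
# (Windows may hide the extension). Based only on names seen in this thread.
CORE_PATTERN = re.compile(r"^FahCore_[0-9A-Za-z]+(\.exe)?$", re.IGNORECASE)


def is_core_file(name: str) -> bool:
    """Return True if the filename looks like a downloaded F@H core."""
    return CORE_PATTERN.match(name) is not None


def delete_cores(fah_dir: str, dry_run: bool = True) -> list:
    """List core files in fah_dir; delete them only when dry_run is False."""
    found = []
    for entry in sorted(Path(fah_dir).iterdir()):
        if entry.is_file() and is_core_file(entry.name):
            found.append(entry.name)
            if not dry_run:
                entry.unlink()
    return found
```

By default this only lists what it would delete; pass `dry_run=False` once the list looks right. The client should fetch fresh cores on the next start, as prof says above.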
  • Trogan London, UK
    edited October 2005
    Thanks prof. I did as you said, I'm hoping it works.

    I had a core named 'FahCore_7a'. I haven't seen that before.

    Thanks again
  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    ...I had a core named 'FahCore_7a'. I haven't seen that before...
    I believe that is one of the new-ish Amber cores. I have it on my computer as well.

    Good luck, man. :)
  • Trogan London, UK
    edited October 2005
    I'll see how things are during the next few days and let you know.

    Thanks
  • Qeldroma Arid ZoneAh Member
    edited October 2005
    Unfortunately I had a similar problem not that long ago and I wound up having to give my PC a "bath".

    If Prof's idea doesn't work out, you might try this:

    Clean me please!

    Your log looked hauntingly familiar and I don't overclock either.

    Hope that helps. (PS- I have an Amber loaded up now too - should be done soon)
  • Trogan London, UK
    edited October 2005
    Thanks Q :thumbsup:

    I'll see how things go first :)
  • csimon Acadiana Icrontian
    edited October 2005
    Dust bunnies are always the first suspect just because they're so common.
  • QCH Ancient Guru Chicago Area - USA Icrontian
    edited October 2005
    Trogan wrote:
    Thanks Q :thumbsup:

    I'll see how things go first :)
    You call me??? Oh that "Q"...
  • Trogan London, UK
    edited October 2005
    @csimon: I checked for dust bunnies and there are some :eek: What's the best way to get rid of them?

    @QCH: Sorry Q...QCH. From now on, you will be known as Q and Qeldroma as Qel :D
  • lemonlime Canada Member
    edited October 2005
    Wouldn't hurt to run some loops of Memtest86+ :thumbsup:
  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    ...I checked for dust bunnies and there are some :eek: What's the best way to get rid of them?...
    Shutting the computer down and vacuuming it out is a good start. You might want to use something like a Q-Tip (what is it with "Q's" in this thread? :vimp: ) to clean the dust off of fan blades and other tight spots.

    The place many people forget is the fin area under the HSF. They'll clean the top of the fan off, then ignore what's below it. The closely-spaced fins make for an efficient air filter, meaning that the whole thing eventually gets clogged with a layer of dust. I've seen cases where you could peel it off like a piece of felt. The best bet is to remove the fan, then clean the fins out directly. Depending on how far apart the fins are, you may be able to take a butter knife, wrap a layer of paper towel around it, then carefully scrape the fins clean. Hold the vacuum cleaner hose next to it to suck up the dislodged goop.

    If you haven't replaced the thermal compound under the HSF lately, this is a great time to do so. :)

    lemonlime wrote:
    Wouldn't hurt to run some loops of Memtest86+ :thumbsup:
    Definitely. :thumbsup:
  • Trogan London, UK
    edited October 2005
    OK...I think this is the time to tell the truth - I haven't lied, if that's what you're thinking :D.

    First off, I'll run memtest if I get a chance. The comp is always in use at some point, which doesn't leave enough time to run memtest. I may run it overnight and check it in the morning or something.

    Prof, I'll try your suggestions as best I can but I have never opened up my case (or any other case) and I've never worked with hardware before, which makes me a n00b :(. Like I said, I'll try your suggestions when I have time, but I'm worried in case I turn a working comp into a broken computer.

    Anyway, thanks :)
  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    ...The comp is always in use at some point, which doesn't leave enough time to run memtest. I may run it overnight and check it in the morning or something...
    That's the way I do it. :)
    Prof, I'll try your suggestions as best I can but I have never opened up my case (or any other case) and I've never worked with hardware before which makes me a n00b...I'm worried in case I turn a working comp into a broken computer...
    It's not as hard as you think. If you're really worried, see if you can get a friend with more experience to come over and show you what to do. Otherwise, just be careful and think things through before acting. I would not recommend removing and reinstalling your HSF to replace the thermal compound until you've had a chance to watch someone else do it. It's not hard, but it's also not a place where you can afford a mistake.

    As for the routine dusting and vacuuming, there's nothing to it. Just make sure when you start removing screws to open the case that you get the right ones. Pay attention to what you're doing, relax a little bit, and you'll be just fine. :)
  • Qeldroma Arid ZoneAh Member
    edited October 2005
    As Qel & Q Qontinue with the Questionable Qase of the Quirky Qomputer by Qleaning with Q-tips:

    If you're nervous about doing the CPU paste replacement job, you can try to just simply clean the HSF first without removing it and see if that helps (a Quid [haha!] says it can't hurt at all). This should be well within your ability.

    I know how you feel. I was in cold sweats with my first CPU / HSF seating. Prof's advice is excellent! If you get there, let us know. I can't think of a better place to ask for the help you need.

    (Sorry, but I Qouldn't resist!)
  • Trogan London, UK
    edited October 2005
    I'll have a good look inside the case to familiarise myself with what's inside. I will be extremely careful when hoovering - I'll be using a hoover since I don't have a vacuum. I'm quite excited now, going to try and learn something new :)

    Also, I remember reading in a few threads how people use a can of air. How does that work?

    Thanks prof
  • csimon Acadiana Icrontian
    edited October 2005
    If you're worried about it you can always just buy a can of compressed air and go at it with that. Like prof said though ...always turn it off and unplug the PSU first. Give it a few moments (at least 20 seconds), then open up the side panel and have a look inside.
    It looks very intimidating at first, but once you get used to it, it's not so bad.
    Look for the large heatsink/fan/CPU ...that's usually where most of it accumulates.
    Look for the largest chunks of dust and spray those areas out real well.
    Once you get all of that you can concentrate on the other areas with whatever air you have left.
    It doesn't have to be perfect once you learn the key areas to clean ...like CPU, GPU, RAM and PSU. You can also air out the drives, as that can help them live way longer as well.
    And that's about it really ...just be careful when the can starts to get really cold, and make sure it doesn't have much liquid shooting out of the end ...that's not really good if you get too much of it on the electronics.
    Lastly you can spray off the inside and outside of the case ...then wipe the outside and you'll have a nice clean shiny system.
    A Q-tip w/ rubbing alcohol is great for the nooks and crannies ...the alcohol dries real fast so it won't harm even the CPU if you give it a few seconds to dry.
    After it's all done, if you suspect any of the electronics are wet just allow them to dry ...if it's bad enough use a hair dryer.
    Use a lot of common sense ...it'll take you through the whole process!!!!
    If there are 3 things that computers don't like it's water, static electricity, and dust bunnies (which cause it not to cool so well).

    Good luck! :thumbsup:
  • Garg Purveyor of Lincoln Nightmares Icrontian
    edited October 2005
    If nothing else, get a can of compressed air, take off the side panel, and give everything a good squirt (paying extra attention to the heatsink and fans). It's super-easy. Easier than working with a vacuum, IMO.

    Be prepared to cough a bit, if the computer has been accumulating dust since its birth :thumbsup:

    edit: Csimon beat me to it.
  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    ExQQQQQQ's me, but Qel's last post was Quite Quaint.

    InQwedible. ;D
  • Trogan London, UK
    edited October 2005
    I appreciate everyone jumping in and giving advice that I'm certainly listening to :)


    So, what does a Qan of Qompressed air look like? And what is a Q-tip? :scratch:

    Thanks everyone
  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    ...what does a Qan of Qompressed air look like? And what is a Q-tip?...
    Q-Tip is a brand name for cotton swabs. In jolly old England they probably call them something else. (Queue-Tips?) :D
  • Trogan London, UK
    edited October 2005
    Now that makes sense :)

    The name 'Cotton Swab' is good enough :D

    I'll let you know how things go soon :thumbsup:
  • Trogan London, UK
    edited October 2005
    I just checked the FAHlog and to my annoyance I saw the log showing "Simulation instability". Plus, it keeps closing a connection once it's downloaded a WU.

    I'll post my FAHlog again. I'm guessing the Dust Bunnies are playing a part in this :(
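Rather than eyeballing the log each time, a short script can count how often the error shows up. This is only a sketch: it looks for the phrase "Simulation instability" quoted above and assumes nothing else about the log format. The file name `FAHlog.txt` in the usage note is also an assumption; point it at wherever your client writes its log.

```python
from pathlib import Path

# Phrase reported in this thread; matched case-insensitively.
MARKER = "simulation instability"


def count_instability(lines) -> int:
    """Count the lines that mention the instability marker."""
    return sum(1 for line in lines if MARKER in line.lower())


def scan_log(log_path: str) -> int:
    """Read a log file and return how many lines mention the marker."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    text = Path(log_path).read_text(errors="replace")
    return count_instability(text.splitlines())
```

Calling `scan_log("FAHlog.txt")` before and after a cleaning session gives a rough way to tell whether the early-ended units are becoming less frequent.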
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2005
    Seeing that your computer did successfully process and send the results of two complete units, it would appear you have a competent system. But I see what you're talking about with the multiple early-ended units towards the end of the log. It can be an indicator of a system overclocked beyond stability, of an overheating system, or simply of a work unit package that is flawed. I've had that happen before - too far overclocked, too warm, and just plain bad luck in faulty work units. If the early end happens twice, and you are confident that your system is stable and not overheated, you can just clear out the Folding "work" and "queue" folders and start with a freshly downloaded unit (after F@H restart, of course).
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2005
    Concerning the cotton swabs -

    If you use them in your ears first, then you can move to the computer components and lubricate as you clean. Dual use, gotta love it! :D
  • Trogan London, UK
    edited October 2005
    Leonardo wrote:
    ...you can just clear out the Folding "work" and "queue" folders and start with a freshly downloaded unit (after F@H restart, of course).
    I can see the "Work" Folder but I don't have a "Queue" Folder. Do you mean "Queue.dat"?
    Leonardo wrote:
    Concerning the cotton swabs -

    If you use them in your ears first, then you can move to the computer components and lubricate as you clean. Dual use, gotta love it! :D
    ;D


    Thanks Leo :thumbsup:
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2005
    Queue.dat it is. Some might argue that one should not remove that file, as that is what often contains your next assignment, and that one should not interfere with the scientific mapping that Stanford has set up for the current crop of protein molecules. But...chances are, the next protein model in that queue will be corrupted just like the one you wish to replace.

    Now, back to swabs...
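The reset Leonardo describes (empty the work folder, remove queue.dat, then restart the client) could be sketched like this. The names `work` and `queue.dat` are taken from the posts above and assumed to sit directly inside the F@H folder; `reset_work` is a hypothetical helper. As Leonardo notes, this throws away whatever assignment was queued, so stop the client first and only do it after repeated early ends.

```python
import shutil
from pathlib import Path


def reset_work(fah_dir: str, dry_run: bool = True) -> list:
    """List (and optionally remove) the work folder contents and queue.dat."""
    root = Path(fah_dir)
    removed = []

    # Empty the work folder but keep the folder itself.
    work = root / "work"
    if work.is_dir():
        for entry in sorted(work.iterdir()):
            removed.append(entry.relative_to(root).as_posix())
            if not dry_run:
                if entry.is_dir():
                    shutil.rmtree(entry)
                else:
                    entry.unlink()

    # Remove the queue file so the client starts from a fresh download.
    queue = root / "queue.dat"
    if queue.is_file():
        removed.append(queue.name)
        if not dry_run:
            queue.unlink()

    return removed
```

With the default `dry_run=True` nothing is deleted; the return value just shows what would go. Other files in the folder are left alone.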
  • Trogan London, UK
    edited October 2005
    Leo, done as suggested. Fingers Crossed :D
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2005
    If the same problem persists, you've got a hardware problem - overheating, low voltages, or something like that. Make sure you monitor your board's voltages. A low +12v could be causing the problem.
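To put a number on "low +12v": the usual ATX tolerance for the +12 V rail is ±5%, i.e. roughly 11.40 V to 12.60 V. The sketch below checks a reading against that window and reports how much headroom is left. The reading itself has to come from your BIOS or a monitoring utility; `rail_status` is a hypothetical helper, and the 5% default is an assumption you should check against your PSU's documentation.

```python
def rail_status(measured: float, nominal: float = 12.0, tolerance: float = 0.05):
    """Return (in_spec, headroom) for a measured rail voltage.

    headroom is the distance in volts to the nearest limit; it is
    negative when the reading is outside the tolerance window.
    """
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    in_spec = low <= measured <= high
    headroom = min(measured - low, high - measured)
    return in_spec, headroom


# Example with an illustrative reading (not taken from anyone's system):
ok, headroom = rail_status(11.45)
print(f"in spec: {ok}, headroom: {headroom:.2f} V")
```

A reading that is technically in spec but with only a few hundredths of a volt of headroom is worth watching under load, since software readings can sag further when the CPU is busy folding.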
  • profdlp The Holy City Of Westlake, Ohio
    edited October 2005
    Leonardo wrote:
    If the same problem persists, you've got a hardware problem - overheating, low voltages, or something like that. Make sure you monitor your board's voltages. A low +12v could be causing the problem.
    Now why didn't I think of that? :)

    Check your Vcore voltage, too. (That's the one which directly powers the CPU.) Many MB's misdetect the proper voltage for your processor, usually erring by setting it a smidgen too low. You might want to run CPU-Z to make sure things are properly set in the BIOS for your particular CPU. Tell us the name and model # of your MB, too. My board has an issue where it actually undervolts the processor to a lower voltage than what the BIOS setting shows. When I bump it up a notch it reports an overvoltage, however every second-party utility I've tried (SpeedFan, MBM5, etc) indicates that it is now running exactly on spec.
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited October 2005
    Prof, I was speaking from experience. I had a string of misfires on large work unit proteins this summer on my system 1 that lasted for about a month. At that time I was running a P4C 2.8@3.5 (CPU in system 3), with what I thought were manageable temperatures. I lowered the temperatures modestly by improving the case's airflow, but the 'early end' problems persisted. I boosted CPU core voltage without resolving the problem. Finally, I determined that it had to be the PSU. The 12v rail was within tolerance, but barely. I swapped out the PSU for a higher quality unit, Antec TruePower 430, and the instability went away without lowering the CPU clock.

    Suspect PSU is the Robantan, which is now in system 3. It's a decent PSU, just not robust enough for high overclocks coupled with two instances of Folding@Home (hyperthreading mode). I've just purchased another TruePower 430 from Mizugori for System 3. Looking forward to installing it. (Ha! Then I'll have another spare PSU for the next Folding box. But wait, that 'next' box hasn't been cleared with the wife unit yet. :confused: )