Intermittent system failure (Tyan Tiger MP)

drasnor Starship Operator Hawthorne, CA Icrontian
edited August 2003 in Hardware
Last Christmas I built a dual-processor computer for my sister, checked it out thoroughly, and ran burn-in tests before it left the door. Last month it developed a problem: the machine randomly and abruptly reboots itself for no apparent reason. It doesn't go through the Windows XP shutdown sequence; it just goes straight to a black screen followed by the POST display. The next time Windows boots, after she logs in, it displays the "Windows has recovered from a serious error." message. It does not ask to run chkdsk during boot. I'm working on getting the contents of the "Details..." tab from her, so that's coming.

The machine in question is a Tyan Tiger MP with dual Athlon MP2000+'s. It has 1024MB of RAM in 4 ECC PC2100 DIMMs. ECC on the RAM is set to ECC Scrub. The machine is housed in an Antec full tower with all the fan brackets populated and a 480W TruePower PSU. A couple of weeks ago, I took the machine in for an overhaul because it had begun to display this problem. I replaced the stock AMD fans with CoolerMaster HHC-001's, reseated the heat spreaders on her RAM, reseated the RAM, formatted the hard drive, reloaded Windows XP SP1, and installed the newest drivers for her Audigy, Linksys LAN card, and ATi FireGL 8800. Aside from the stock Windows software, it also has two instances of Folding@Home. I cannot duplicate her problem at my house, though I've seen it in action at hers.

I had her put it on a new surge suppressor, but that didn't work either. I have no idea what's wrong with it, though I feel like it's power-related. If anyone has a clue what's wrong with it, please share.

-drasnor :fold:

Comments

  • AuthorityAction Missouri Member
    edited August 2003
    How long will it run till it turns off? I'd try testing it piece by piece: put one stick of RAM in there and see if it works, and try a new PSU.

    What does your sis need with a dualie? ;D
  • drasnor Starship Operator Hawthorne, CA Icrontian
    edited August 2003
    AuthorityAction said
    How long will it run till it turns off?
    It's random. I've seen it be as short as 5 minutes, as long as a few hours.

    I tried running it in several different memory configurations at home and at her house. Everything worked fine at my house, but the system still crashed regardless of the number and placement of DIMMs at her house.
    AuthorityAction said
    What does your sis need with a dualie?
    Absolutely nothing. It's a 24/7 folding rig and it's loud, so she gets to house it and chat it up. :)

    -drasnor :fold:
  • AuthorityAction Missouri Member
    edited August 2003
    Sounds like something to do with power... maybe. I'd try a spare PSU and see if that works.

    Does she really need a gig of RAM to chat it up, lol? ;D

    I hope you get it working so those dualies will turn in some WU's :fold:
  • MediaMan Powered by loose parts.
    edited August 2003
    Go to www.simmtester.com and have her download the DOCMEMORY program. It is a self-executing program, so make sure she has a blank floppy in the drive; it will write the program to the floppy.

    Now reboot. The machine needs to be set to boot off the floppy first (if it isn't, change that in the BIOS).

    The PC will boot off the floppy and she'll have to run the diagnostics. There are two choices via keyboard, if memory serves: basic and advanced tests. Have her run the advanced test, which will take some time.

    See if it reports back bad memory errors.

    If it does then you have your problem.

    You may also just try setting BIOS to check errors instead of scrub. Scrubbing is really only useful in server environments.

    Hope this helps.
  • profdlp The Holy City Of Westlake, Ohio
    edited August 2003
    ...if it's fine at your house, but having problems at hers, I would suspect her AC power source. (Assuming the ambient temp is comparable in both locations).

    If possible, see if she can temporarily move the computer to a different location (on a different circuit). If you have access to a line conditioner you could try that. See if you can determine what else is on that circuit, and move as many things as possible (moving all would be nice) to a different circuit.

    You might also try one of those cheap line testers (any hardware store should have them - see picture) and see if she has an open or floating ground. Have her pay special attention to what else is happening at the moment it craps out, like an appliance coming on, etc. Also, check the voltage itself, I've seen 115VAC as low as 90V during summer months when power companies often practice what is known as a "rolling brownout". They drop the voltage in different areas at different times to reduce the overall load on the entire system. When I was a building engineer in Washington, DC we had to put phase protectors on all of our three-phase motors because of such trickery. The fact that your problem just started last month, when it got really hot in most areas, leads me to believe it might just be an abnormally heavy load on your power company, causing them to fiddle around with the current.
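
One way to act on the "check the voltage itself" advice is to jot down multimeter readings over a hot afternoon and see how many fall well below nominal. The sketch below is only an illustration: the 115 V nominal figure comes from the post above, while the 10% tolerance band and the sample readings are assumptions, not measurements from this machine.

```python
NOMINAL_VOLTS = 115.0   # nominal line voltage mentioned in the post
TOLERANCE = 0.10        # assumption: flag anything more than 10% off nominal


def classify(reading_volts):
    """Label one RMS voltage reading as 'low', 'high', or 'ok'."""
    low = NOMINAL_VOLTS * (1 - TOLERANCE)
    high = NOMINAL_VOLTS * (1 + TOLERANCE)
    if reading_volts < low:
        return "low"
    if reading_volts > high:
        return "high"
    return "ok"


def summarize(readings):
    """Count how many readings fall into each category."""
    counts = {"low": 0, "ok": 0, "high": 0}
    for volts in readings:
        counts[classify(volts)] += 1
    return counts


# Hypothetical readings taken by hand; replace with real measurements.
sample = [116.2, 114.8, 103.0, 90.0, 112.5]
print(summarize(sample))
```

If the "low" count climbs at the same times of day the reboots happen, that would support the brownout theory; if every reading is "ok", the wiring or ground is the next thing to look at.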
  • drasnor Starship Operator Hawthorne, CA Icrontian
    edited August 2003
    RAM failure wouldn't make a whole lot of sense, since the machine spends 3 minutes each boot verifying the integrity of 1GB of RAM. It has to zero it all out, write a bit to every location, and zero it again. Then again, just to be safe, I'll run SIMMtester next time I'm over there (soon).

    We live in Texas, so I'm inclined to believe you're right about the AC draw, plus her apartment complex is pretty decrepit and they probably skimped on the wiring. As far as temps go, my house is cooler than hers by about 10°F (72°F vs. 82°F), but the case/CPU temps are comparable now that I've installed those solid copper dual-heatpipe HHC-001's on her CPUs.

    Thank you much for the info. Also, do you think getting a UPS would help if it really is lousy power?

    -drasnor :fold:
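
The boot-time check described in the post above (zero everything, write to every location, zero again) can be sketched roughly as follows. This is a toy model that runs over a Python `bytearray` standing in for RAM; the function name and the 0xFF pattern are assumptions for illustration, and it is not what the BIOS or DOCMEMORY actually does internally.

```python
def pattern_test(buffer):
    """Zero the buffer, write a pattern, then zero again, verifying each pass.

    Returns a sorted list of offsets that failed to read back correctly.
    """
    bad = []
    size = len(buffer)

    # Pass 1: zero every location and confirm it reads back as zero.
    for i in range(size):
        buffer[i] = 0x00
    bad.extend(i for i in range(size) if buffer[i] != 0x00)

    # Pass 2: write a non-zero pattern everywhere and verify it.
    for i in range(size):
        buffer[i] = 0xFF
    bad.extend(i for i in range(size) if buffer[i] != 0xFF)

    # Pass 3: zero again and verify.
    for i in range(size):
        buffer[i] = 0x00
    bad.extend(i for i in range(size) if buffer[i] != 0x00)

    return sorted(set(bad))


ram = bytearray(1024)  # stand-in for real memory
print(pattern_test(ram))
```

A single quick pass like this only catches faults that happen to show up while it runs, so it may well miss an intermittent problem. That is one reason a longer dedicated diagnostic is still worth running even when the boot check passes.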
  • profdlp The Holy City Of Westlake, Ohio
    edited August 2003
    drasnor said
    ...do you think getting a UPS would help if it really is lousy power?

    -drasnor :fold:
    I think it would depend on what the particular UPS is designed to do. (You'll need more help than I can give you; I'm not an expert in that regard.) My layman's understanding is that some UPS units are basically just very simple battery backups, while others also act as line conditioners (regulating dips and spikes, plus taking out "noise"). It would come down to the sensitivity of the unit. If it is designed to take over in the event of a small voltage drop (say, when the line dips to around 105-110 volts), and the problem is indeed a brownout situation, then I think it would do it. If the unit doesn't cut in until the voltage drops drastically, then probably not. I'm sure someone with more experience with UPS units can give you a better answer.

    Good Luck!:)
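
The reasoning about UPS sensitivity boils down to comparing the sagging line voltage against the unit's low-voltage transfer point. Here is a minimal sketch of that comparison; the function name, both transfer points, and the 100 V brownout figure are hypothetical values chosen for illustration, so check the spec sheet of any real unit.

```python
def ups_would_cut_in(line_volts, transfer_volts):
    """Return True if the line voltage is at or below the UPS transfer point."""
    return line_volts <= transfer_volts


# Hypothetical transfer points; real values come from the unit's spec sheet.
sensitive_ups = 107.0   # inside the 105-110 V range mentioned above
coarse_ups = 85.0       # assumed unit that waits for a drastic drop

brownout = 100.0        # assumed sagging line voltage, in volts

for name, threshold in [("sensitive", sensitive_ups), ("coarse", coarse_ups)]:
    print(name, ups_would_cut_in(brownout, threshold))
```

With these made-up numbers the sensitive unit would switch to battery during the sag and the coarse one would not, which matches the "it depends on the unit" answer above.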