Memtest Errors - Advice Please

DexterDexter Vancouver, BC Canada
edited October 2004 in Hardware
Trying to work out some strange problems on a system here at the office.

First it had some random error messages at bootup about one or more registry files being damaged, which I recovered from a backup. I did some HD maintenance (chkdsk /f, etc.) but still had the trouble. Some googling pointed me to some registry cleaning tools, which still did not fix the problem.

I decided to reinstall the OS (XP Pro). After doing that, I had random problems with the video card (ATI Radeon 9200): it would lose the driver or fail to initialize the control panel. I tried changing video cards, then got an NTOSkernel error, which prevented the box from booting at all.

So, I then decided to change the hard drive and the IDE cable, thinking that maybe the problems were hard drive related. Seeing as different items were corrupting, that seemed a logical conclusion.

Or not....on the new HD, XP Pro failed on the install twice, in different spots.

In went the Memtest bootable CD. In 4 passes, the 512 MB of DDR400 had 4 errors: 2 errors on pass 2, and the exact same 2 on pass 3. Passes 1 and 5 were clean.

So, I swapped sticks, in went an identical stick of 512 MB DDR400. 4 Passes of Memtest...and on pass 3, I had the exact same 2 errors as above, in the exact same addresses.

This is where I need some input: if the exact same addresses show problems on 2 different sticks, am I seeing a motherboard or CPU based problem here? Is something in the L1/L2 cache, or the CPU, causing the same address to have a random error? Or, since the new stick is identical and was bought at the same time, so is likely from the same manufacturing batch, did I perhaps get 2 sticks of RAM from a bad batch?
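The question above comes down to whether the failing addresses follow the stick or stay with the slot. A minimal Python sketch of that comparison follows. The addresses are hypothetical placeholders (the post does not list the real ones), and the function names are made up for illustration. Overlap alone is only a hint, since two modules from one bad batch could in principle fail alike.

```python
# Hypothetical Memtest failure addresses; the post does not give the real ones.
stick_a_errors = {0x1A2B3C40, 0x1A2B3C44}
stick_b_errors = {0x1A2B3C40, 0x1A2B3C44}


def shared_failures(run_a, run_b):
    """Return the addresses that failed with both sticks in the same slot."""
    return sorted(run_a & run_b)


def suspect(run_a, run_b):
    """Rough guess at where to look next, based on address overlap alone."""
    if not run_a and not run_b:
        return "no errors recorded"
    if run_a & run_b:
        # Same addresses on different sticks: the fault probably stays with
        # the slot, board, CPU, or settings rather than with either module.
        return "slot, board, CPU, or BIOS settings"
    return "the individual stick(s)"


print([hex(a) for a in shared_failures(stick_a_errors, stick_b_errors)])
print(suspect(stick_a_errors, stick_b_errors))
```

If the two runs shared no addresses, the sketch would point back at the modules themselves instead.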

I am re-installing XP now on the new HD, with new RAM and the original video card. I'll see what happens. Any good input appreciated.

Dexter...

Comments

  • Straight_ManStraight_Man Geeky, in my own way Naples, FL Icrontian
    edited October 2004
    Um, check the DIMM socket for ANY dust, a paper chip, metal chip(s), Persian or Angora cat hair, dog hair, eyebrow hair, or any other tiny grunge, and then look at the contacts with a 3X magnifier. CAIG makes a PC Technicians Kit that has contact cleaners and a corrosion-preventer coating just for electronic contacts, plus some tiny swabs and some normalish ones. Slight corrosion or a tiny bit of grunge on only one finger contact can cause a same-location error across multiple DIMMs. I have seen one Persian cat hair foul the heck out of RAM access: not only did the hair have a static charge, it wrapped around one of the finger contacts. Removing it fixed the problem, which was driving me and the tech who brought it to me up the wall until I got out the small 3X, 4X, 7X magnifier and a flashlight and found the thing, one piece of dark-brown-tipped cat hair. Socket cleanliness and contact integrity check time, methinks, first thing. THEN maybe the chipset bridge that the RAM is controlled from, then possibly the CPU, which you can test by swapping it into another compatible box.

    Also, think about this: look in the BIOS and see if the RAM settings got lost. Bad timings can yield pattern errors in Memtest, though not likely just ONE of them. If the BIOS has the wrong RAM settings, check the BIOS time, R&R the CMOS battery, then clear and reprogram CMOS as needed. If it is timing and nothing is in the socket, or BOTH problems exist, then the combo of cleaning the socket contacts and making sure the BIOS has the right CPU and RAM timings might solve your issue. This happened to me on the SAME board that had the cat hair piece in the RAM socket, in exactly the WRONG place to allow the one finger/straight spring pin to contact the DIMM. One part of one module had errors in Memtest, but as with you, changing the DIMM did not fix it until the socket was cleaned AND the CMOS cell was R&R'd.

    As for XP, check the CD for tiny scratches, fingerprints, etc. also, just in case, OK? Maybe also use a laser head cleaning CD if needed, or swap in another CD drive long enough to get XP loaded, THEN run a laser head cleaning CD in the drive that is messing things up, in case it has a dirty laser lens.
  • DexterDexter Vancouver, BC Canada
    edited October 2004
    Thanks John.

    The new install had the same registry errors as above, so I am repeating it with a different install disk (forgot to mention I had tried that as well). The new CD has only ever been used once, with absolutely no scratches on it. And now that installation pass has failed during the XP install with an undefined error.... :scratch:

    I checked and air-canned the DIMM socket. It looks good to the naked eye; no magnifying glass here, but I can dig one up at home somewhere. All the BIOS timing and RAM settings are at their defaults (motherboard is an Abit VI7, by the way).

    I am switching to DIMM socket 2 now, just to see if there is any difference. But I'm starting to think I will be picking up a new motherboard tomorrow morning....this was supposed to be out the door to the customer today :mean:

    I have done 5 other systems in the past 2 weeks with identical hardware specs, and not had a problem with any of them. Very strange indeed..... :scratch:

    I'll update the thread when I know more.

    Dexter...
  • primesuspectprimesuspect Beepin n' Boopin Detroit, MI Icrontian
    edited October 2004
    Are you sure the ram and the motherboard are compatible? I've had this happen before, where sticks error in a certain board, but in another board they are just fine.
  • ThraxThrax 🐌 Austin, TX Icrontian
    edited October 2004
    I agree. It's possible that it's just an incompatibility, be it no-name brands with on-brand memory, or a certain board and a certain memory that just don't play well together. I recall my k7s5a hating Crucial memory; it wouldn't boot with it at all.
  • DexterDexter Vancouver, BC Canada
    edited October 2004
    Well, I suppose that is possible, but I have done 5 other systems in the past 2 weeks with identical hardware specs, and not had a problem with any of them.

    Just for a test, I installed the RAM (the new stick) into DIMM 2 instead of DIMM 1, and so far I have not had any problems. Maybe slot 1 was flaky somehow? I had to leave the system overnight, and have not installed any hardware drivers yet, so I'll do that and see if it remains stable. I'll update the thread later.

    Thanks guys.

    Dexter...
  • t1rhinot1rhino Toronto
    edited October 2004
    We have had that registry problem here at work on many computers. We think it was due to a MS update. After uninstalling the update, the errors went away.
    We told MS, and they are looking into it.
  • ThraxThrax 🐌 Austin, TX Icrontian
    edited October 2004
    Dexter wrote:
    Well, I suppose that is possible, but I have done 5 other systems in the past 2 weeks with identical hardware specs, and not had a problem with any of them.

    Just for a test, I installed the RAM (the new stick) into DIMM 2 instead of DIMM 1, and so far I have not had any problems. Maybe slot 1 was flaky somehow? I had to leave the system overnight, and have not installed any hardware drivers yet, so I'll do that and see if it remains stable. I'll update the thread later.

    Thanks guys.

    Dexter...

    I mean no offense when I say this, Dexter, but I figured you would be one of the last to jump to such a conclusion. Just because five previous systems exhibited similar characteristics doesn't mean the sixth computer is guaranteed to be free from flaws in design and manufacturing. It's very possible that the DIMM channel is flaky, and I think your testing even proved that.
  • DexterDexter Vancouver, BC Canada
    edited October 2004
    No offense taken. But my point is that if a certain combo of motherboard and RAM aren't happy playing together, I would think that I would have seen it on the other 5 systems as well.

    For instance, on your above mentioned k7s5a, if you built another k7s5a, and you put Crucial memory in it, would you expect it to work flawlessly? Or not, given the experience you had? So conversely, given that 5 of 6 systems worked perfectly with the mb / ram combo I have, I am disinclined to think it is an incompatibility problem.

    Today I installed everything, updated everything, and let it run, rebooted several times....no problems at all. And as changing the DIMM channel seems to have cured the trouble, I think that assumption is correct...I think that the combo of components is fine, but the DIMM channel has a fault. If the system exhibits problems again, then I'll be proven wrong.... ;)

    Thanks all for your input.


    Dexter...
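The swap testing described in this thread can be written down as a small table of (stick, slot) observations. The Python sketch below is only an illustration: the labels and the `uncleared` helper are hypothetical, and "problems observed" is taken loosely from the posts rather than from a full Memtest run in every configuration. A component counts as cleared only if it took part in at least one clean run.

```python
# (stick, slot) -> True if problems were observed in that configuration.
# Labels are made up for this sketch; outcomes follow the posts above.
observations = {
    ("original stick", "DIMM 1"): True,
    ("replacement stick", "DIMM 1"): True,
    ("replacement stick", "DIMM 2"): False,
}


def uncleared(observations, position):
    """List names never seen in a clean run (position 0 = stick, 1 = slot)."""
    seen = {key[position] for key in observations}
    cleared = {
        key[position] for key, failed in observations.items() if not failed
    }
    return sorted(seen - cleared)


print("sticks not yet cleared:", uncleared(observations, 0))
print("slots not yet cleared:", uncleared(observations, 1))
```

Under these assumptions DIMM 1 stays suspect, and the original stick is not yet cleared either, because it was only ever tried in the suspect slot. Running it in DIMM 2 would settle that.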