My continuing RAID woes!

edited July 2006 in Hardware
This continues from http://www.short-media.com/forum/showthread.php?p=405530 in the emergency section.

OK, so I recovered my RAID using TestDisk, as recommended by profdlp.

Then I completely started again: new format, reinstall.
Everything appeared to work. However, chkdsk kept finding loads of errors.

I adjusted the EXT-P2P setting from 30us to 1ms in the BIOS, and that appeared to help.
Then suddenly, after a normal shutdown, it failed to boot the next morning, simply stating 'unable to boot operating system'. Aaarrrrgggghhhh! WTF!

I ran TestDisk again; sometimes it found errors in the boot sector, other times it didn't.

Anyway, using TestDisk I managed to copy the backup boot sector over the normal one, and XP loaded again.

Any suggestions on how to fix this problem? I can't believe that RAID should be this unreliable.

I am using BIOS 27 with the latest 2006 Silicon Image BIOS 4.2.76 embedded. Could this be the cause?

Should I go back to using the standard Abit BIOS?



Abit NF7-S v 2.00
2x Seagate 200GB Barracuda 7200.9 drives.

Comments

  • edited July 2006
    I was hoping to report back that I had solved the problem and could share the fix with all of you with similar problems. Unfortunately, I have had to give up on RAID so that I can get a system working again.

    I last got RAID 0 working great on the KT7-RAID with lots of help from Tex (when I was more active on these forums, in the Icrontic days), but I never tried RAID on this board (NF7-S v2) as I couldn't afford 2 disks when I upgraded.

    It just does not work with my drives (Seagate Barracuda 7200.9 200GB).

    I have looked into everything. I have tried the latest SATA BIOS embedded in BIOS 27, and also the 4.2.50 version included in Abit's official BIOS 27 (as recommended by Silicon Image). I have set the EXT-P2P to 1ms in the BIOS.

    I have changed the version of the SATA drivers.
    I have set the jumpers on the drives to S150 mode, although Seagate say this shouldn't be necessary.

    I have replaced the SATA cables.

    Memtest checks out fine.

    SeaTools (Seagate's disk checking utility) reports both disks as 100% fine.

    Data corruption always occurs, and the system regularly becomes unbootable.

    TestDisk always manages to recover it, but it's just crazy.

    While it worked I got good speeds, but for a system which only lasts a few days, this is shocking.

    I am gutted. I only bought these new drives for the purpose of setting up RAID 0. Still, at least I have 2 excellent drives now.

    For me at least, RAID 0 is a big no-no. Now to re-install my system for the 6th time in the last 2 days! This one's for good.

    Thanks to those of you who offered advice.

    TestDisk is now a permanent part of my data recovery pack for emergencies.


    Kind Regards

    Chris
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited July 2006
    any suggestions how to fix this problem. i can't believe that the raid should be this unreliable.
    I CAN believe RAID is that unreliable. My answer to your problems is to ditch RAID 0... yesterday. It's just not worth the risk. Not what you wanted to hear, but probably the best advice you will get.

    Lammy, I was also in your RAID class on the Abit KT7/KT7A motherboards. Around three or four years ago I quit using RAID 0 and haven't looked back. It's really of limited utility, even when running at peak performance, at least in a PC running off an integrated chip.

    You will not find many, if any at all, of the PC RAID 0 boosters of a few years ago advocating for it now. In 1999 and 2000, RAID for home users was somewhat of a novelty and we enjoyed playing with it. That novelty has worn off.
  • edited July 2006
    Hi Leonardo,

    My problems have got weirder and weirder.

    I have scrapped the RAID idea. But when I re-installed XP on a single drive, I had THE SAME PROBLEM!!!!!!!!

    Chkdsk reported 'minor inconsistencies', which it said 'is not data corruption'. I hadn't seen that message before.
    It repaired them upon reboot, but afterwards they were still there. Another reboot later, the drive failed to boot: 'unable to find operating system'.

    WHAT!?

    The Seagate disk utility says both drives check out fine (it took several hours of checks each).

    So I unplugged the drive which had just buggered up, leaving one SATA drive connected. I tried to do a fresh WinXP install onto that, but partway through the system crashed!

    I've got it going again now, but this is SO weird!

    I'm beginning to think it is a motherboard problem, or just a basic incompatibility between the 3112 SATA controller and Seagate Barracuda 7200.9 200GB hard disks.

    I'll let you know if this finishes installing ok, and if I get any data corruption this time.

    WTF is going on!! Grrrr!
  • edited July 2006
    OK, a brand new install on my 2nd SATA drive, with nothing else installed.

    The first thing I did was run chkdsk.

    It found this 'minor inconsistencies' thing (see attached picture).

    This is exactly what happened on the other drive prior to total unbootability.

    Then, sure enough, 2 full proper shutdowns and reboots later:

    "Windows could not start because the following file is missing or corrupt:
    <Windows root>\system32\hal.dll.
    Please re-install a copy of the above file"

    I feel like throwing the whole thing in the bin!
  • profdlp The Holy City Of Westlake, Ohio
    edited July 2006
    The chkdsk results are typical of a drive which has suffered a few crashes/freezes which are not (usually) hardware related. The system chokes, then some of the stuff which was in the process of being rearranged on the drive gets misreported.

    What brand and model of PSU do you have, and what is the amperage rating on the +5V and +12V rails? Weird drive errors can be caused by low power. Fortunately, this is a simple fix and is usually not too expensive.

    Like Leo, I am a former RAID-aholic who has been clean now for over five years. Back in the day of 5400rpm drives and UDMA33 data transfer rates it showed a definite real-world performance boost, but with today's drives being so much faster in both regards it's not really worth the risk. The exception would be if you had a dedicated (and expensive) RAID controller which managed things in hardware. The onboard "hardware" RAID typically found on motherboards these days is really little more than a glorified "software" type of thing.
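
To put a rough number on "the risk": a RAID 0 stripe is lost if any one member drive fails, so the chance of losing the array grows with the number of drives. The sketch below is only an illustration. It assumes drives fail independently, and the 3% per-drive failure rate is a made-up figure, not a measured one.

```python
def raid0_failure_probability(per_drive_failure, drives):
    """Chance that a RAID 0 stripe loses data, given a per-drive failure chance.

    Assumption: drives fail independently, and any single failure
    destroys the whole array (RAID 0 has no redundancy).
    """
    if not 0.0 <= per_drive_failure <= 1.0:
        raise ValueError("per_drive_failure must be between 0 and 1")
    if drives < 1:
        raise ValueError("drives must be at least 1")
    # The array survives only if every drive survives.
    return 1.0 - (1.0 - per_drive_failure) ** drives


if __name__ == "__main__":
    HYPOTHETICAL_RATE = 0.03  # made-up per-drive failure chance over some period
    for n in (1, 2, 4):
        chance = raid0_failure_probability(HYPOTHETICAL_RATE, n)
        print(f"{n} drive(s): {chance:.2%} chance of losing the array")
```

With those assumed numbers, a two-drive stripe is roughly twice as likely to lose data as a single drive, which is the trade being made for the extra speed.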
  • Mt_Goat Head Cheezy Knob Pflugerville (north of Austin) Icrontian
    edited July 2006
    Have you run memtest lately? Have you tested the PSU yet? Both of these can contribute to the kind of things you have been experiencing. If those are OK then I would suggest going back to a BIOS revision of 22 or earlier. I was always sold on 17 myself.
  • edited July 2006
    OK, memtest checks everything out fine. I'm not overclocking.

    I've returned the new hard drives to the store. (-25% restocking fee grr)

    So I'm set up with my previous hard disk (Seagate Barracuda 7200.7). I know this one works fine.

    I'm still noticing weird problems which never used to exist.

    Chkdsk at the cmd prompt reports free space errors or the above 'minor inconsistencies'; however, if I run it again (never having used the /F option), the errors aren't detected. If I run chkdsk /F and schedule a check on reboot, the reboot chkdsk finds nothing wrong.
    The Windows check disk won't find anything wrong most of the time either.

    This drive works much better than the 2 new ones I bought.

    I'm starting to think the power supply may be at fault. It was on my list of things to replace a while ago, as the auto fan speed sensor fails, so I have to keep the fan set on high (manual switch: auto, high, low), otherwise it overheats.

    These are the readings from inside the bios.

    CPU Core voltage: 1.62V
    VCC Voltage (+2.5V): 2.63V
    I/O Voltage (+3.3V): 3.27V
    +5V: This one's weird; it fluctuates between a corrupt numeric "4.:0" and "4.98V"
    +12V: 11.74V
    -12V: -12.37V
    -5V: -5.15V
    3.3 Dual Voltage: 3.56V
    Standby(+5V): 5.38V


    All of these figures fluctuate quite a bit, but the +5V often displays the false numeric "4.:0" without the "

    How do these look?
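
As a rough sanity check on readings like these, each rail can be compared against a tolerance band. The sketch below assumes the commonly quoted ATX limits of ±5% on the positive rails and ±10% on the negative rails; the rail names and helper functions are made up for illustration, and BIOS sensor readings are approximate at best. The CPU core, VCC and 3.3 Dual readings are left out because no tolerance is assumed for them here.

```python
# Nominal voltage and allowed tolerance (as a fraction of nominal) per rail.
# Assumption: ATX-style limits of +/-5% (positive rails) and +/-10% (negative rails).
NOMINAL = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "-5V": (-5.0, 0.10),
    "+5VSB": (5.0, 0.05),
}

# BIOS readings quoted in the post above. For +5V the readable figure
# (4.98) is used, since "4.:0" cannot be parsed as a number.
READINGS = {
    "+3.3V": 3.27,
    "+5V": 4.98,
    "+12V": 11.74,
    "-12V": -12.37,
    "-5V": -5.15,
    "+5VSB": 5.38,
}


def deviation_percent(nominal, measured):
    """Signed deviation from nominal, as a percentage of the nominal magnitude."""
    return (measured - nominal) / abs(nominal) * 100.0


def out_of_tolerance(readings, nominal=NOMINAL):
    """Return the rails whose reading falls outside the assumed band."""
    bad = []
    for rail, measured in readings.items():
        nom, tol = nominal[rail]
        if abs(measured - nom) > abs(nom) * tol:
            bad.append(rail)
    return bad


if __name__ == "__main__":
    for rail, measured in READINGS.items():
        nom, _ = NOMINAL[rail]
        print(f"{rail}: {measured}V ({deviation_percent(nom, measured):+.1f}%)")
    print("Outside assumed tolerance:", out_of_tolerance(READINGS))
```

With the figures quoted above, only the standby +5V reading (5.38V) falls outside the assumed band. A check like this says nothing about how much a rail fluctuates, or about a reading that cannot be parsed at all.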
  • profdlp The Holy City Of Westlake, Ohio
    edited July 2006
    Lammypie wrote:
    ...All of these figures fluctuate quite a bit, but the +5V often displays the false numeric "4.:0" without the "...
    The +5V rail is what powers the logic board (control circuitry) for your hard drives. If it's gone wiggy on you - and it would appear that it has - you'll get the exact hard drive problems you've encountered.

    I'm always a bit hesitant to spend other people's money without mentioning possible alternative causes. Keep in mind that the weird voltage reading could be caused by an intermittent grounding of something that shouldn't be grounded (such as a pinched wire somewhere, or a MB standoff in the wrong place, etc), or even a glitch in the sensor itself which is falsely reporting a nonexistent problem.

    Can you borrow a PSU from someone to see if that makes a difference?
  • Leonardo Wake up and smell the glaciers Eagle River, Alaska Icrontian
    edited July 2006
    +5V: This ones weird it fluctuates from a corrupt numeric "4.:0" - "4.98V"
    Echo what Prof said. If that +5V reading is accurate, your PSU needs to be put out to pasture, for fan testing only. It could be the voltage regulators on the motherboard, but I really doubt that.
  • WuGgaRoO Not in the shower Icrontian
    edited July 2006
    Yup... sounds like a PSU problem... it's funny how important those things are.
  • Ryder Kalamazoo, Mi Icrontian
    edited July 2006
    Never believe what your BIOS is telling you, though... even though the other voltages have stable readings.

    Only once you have measured your rails with a DMM, and established what the BIOS normally says when things are fine, can you use it as a reference to determine when things have gone south.
    Long story short... before you bin the PSU, find a DMM to test it with.

    Although, going by your description, the PSU is a good place to start.
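
The baseline idea above can be sketched in a few lines: record what the BIOS shows while a DMM confirms the rails are healthy, then subtract that offset from later BIOS readings. The function names and the DMM figure below are hypothetical, and the sketch assumes the sensor error is a roughly constant offset, which may not hold on a real board.

```python
def sensor_offsets(bios_readings, dmm_readings):
    """Per-rail offset (BIOS minus DMM), taken while the PSU is known to be good."""
    return {rail: bios_readings[rail] - dmm_readings[rail] for rail in dmm_readings}


def corrected(bios_readings, offsets):
    """Apply the stored offsets to later BIOS readings; unknown rails pass through."""
    return {rail: value - offsets.get(rail, 0.0) for rail, value in bios_readings.items()}


if __name__ == "__main__":
    # Hypothetical baseline: the BIOS shows 11.74V while a DMM measures 12.02V.
    offsets = sensor_offsets({"+12V": 11.74}, {"+12V": 12.02})
    # A later BIOS reading of 11.60V would then suggest roughly 11.88V at the rail.
    print(corrected({"+12V": 11.60}, offsets))
```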
  • Mt_Goat Head Cheezy Knob Pflugerville (north of Austin) Icrontian
    edited July 2006
    Mt_Goat wrote:
    Have you run memtest lately? Have you tested the PSU yet? Both of these can contribute to the kind of things you have been experiencing. If those are OK then I would suggest going back to a BIOS revision of 22 or earlier. I was always sold on 17 myself.
    The PSU Nazi strikes again! :cool2:
  • edited July 2006
    OK, here goes. I don't have a DMM; however, Spinner gave me 3 of his spare, dodgy, 'testing only' PSUs.

    I ran tests on my PC with nothing connected other than mobo, and then again with everything connected.

    One thing I noticed was that my current Globalwin 420W makes a funny hissing noise when the hard drives are plugged in.

    That weird '4.:0' voltage reading on the +5V went away on all tests with the other PSUs; however, one of Spinner's dodgy PSUs showed it on a different rail.

    All of the other PSUs (300W, 430W & 450W) had much more voltage stability (less fluctuation), and none exhibited the weird hissing noise I got when the hard disks were plugged in.

    So while I'm only going on the mobo readings, which aren't accurate, that is good enough for me to tell that the voltage fluctuations I was getting are not normal, the hissing is not normal, and the '4.:0' reading probably means off the scale. It also rules out any grounding of the mobo. I'm going to post on the forums on the Abit site to see if any of the engineers will say at what point the voltage is no longer reported accurately and shows '4.:0'.

    So I've ordered a new PSU, an Antec Neo HE. It arrives tomorrow and I'll see if this solves all of my problems. I think it will.

    It will also be the most high-tech piece of kit in my system (great for upgrading when Vista is released).
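
One other possible reading of '4.:0', offered purely as a guess: in ASCII the colon is the character immediately after '9', so display code that builds each digit by adding to '0' would print ':' for a digit value of 10. The sketch below is a hypothetical formatter, not Abit's actual firmware, showing how a value just under 5V could come out as '4.:0' through a rounding slip.

```python
def naive_format(volts):
    """Hypothetical two-decimal formatter with a rounding bug.

    The fractional part is rounded to hundredths separately from the
    whole part, so a value just under the next whole volt rounds to
    100 hundredths and the tenths digit becomes 10.
    """
    whole = int(volts)
    frac = round((volts - whole) * 100)  # can reach 100 instead of carrying
    tenths, hundredths = divmod(frac, 10)
    # Adding a digit value to '0' gives ':' when the value is 10.
    return f"{whole}." + chr(ord("0") + tenths) + chr(ord("0") + hundredths)


if __name__ == "__main__":
    print(chr(ord("9") + 1))    # the character after '9' in ASCII
    print(naive_format(4.98))   # an ordinary reading
    print(naive_format(4.996))  # a reading that rounds up to 100 hundredths
```

If something like this were the cause, '4.:0' would stand for a value very close to 5.00V rather than an off-scale one, but that is an assumption that only the board's maker could confirm.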
  • profdlp The Holy City Of Westlake, Ohio
    edited July 2006
    I'd say you have tracked the problem down pretty thoroughly. Good luck with the new PSU. :)
  • edited July 2006
    Yep FIXED! Woohoo!
    I had a bit of a fright when I saw the -5V rail figure of "-69.9V" in the BIOS, until I worked out that the PSU doesn't supply -5V, as it's obsolete, and the NF7-S v2 I use, like all modern mobos, doesn't require it. PHEW!

    My new Antec Neo HE 550W has solved everything! It's the quietest I've come across, and it appears to have the least voltage fluctuation in the BIOS that I've ever seen.

    With the modular cable management system it uses, it is also the most elegant I have seen.

    I ran PCMark05 before and after, and there was a slight improvement after the new supply was fitted.


    Thanks for everyone's help in tracking this problem down.

    (maybe I'll reorder some new drives and try raid again - not likely!)
  • profdlp The Holy City Of Westlake, Ohio
    edited July 2006
    Glad it worked out for you. :D
This discussion has been closed.