New RAID Array.. Bad time :(

Park_7677 Missouri Member
edited February 2004 in Hardware
[sap story]
Well.. it hasn't even been a full week since I set up my first RAID.. and I'm sick of it.

I love the speeds.. no question about that. I tweaked it a bit and got 118MB/s reads and 92MB/s writes.. great stuff.. it's just that....I've lost so much. When people say "have a backup! you have 2x the chance of a drive failing and you lose everything!".. I understood that. I took precautions and kept a backup... I never thought stuff like what is happening now.. would happen.

Just out of nowhere.. something's corrupt. Nothing unusual happening at all.. it's just gone! Tonight, after an hour or so of BF1942.. I go to put on a movie and go to sleep.. but my "Video" folder with all my videos (duh ;)) is corrupt.

CHKDSK is running now... but I don't have any hopes for it. Two other small folders have gone south within the last 2 days too.. so I fear this one will fall with them.

I'm seriously thinking about just using the drives as normal. It's not that I have ultra sensitive data on the RAID.. it's just that I know soon enough I won't even have a working OS :(
[/end story]


Hardware/BIOS/Driver:
+ Hitachi Deskstar 2x 160GB w/ 8MB cache
+ SIL3112A SATA RAID (BIOS 4.2.43)
+ SATA RAID Driver v1.0.0.40

Software
WinXP SP1

I really want to keep the RAID because the speeds are awesome.. and I'm enjoying learning about them as I finally got them to play with. Does anyone have any suggestions or know anything about what's happening? I'd be willing to do anything needed to help you guys help me.. just let me know.

Thanks for any help.. :)

Comments

  • primesuspect Beepin n' Boopin Detroit, MI Icrontian
    edited February 2004
    That's weird stuff, especially if chkdsk isn't catching it.. I'd almost say faulty raid hardware or even partially failing drive. Run the hitachi drive test on those drives, individually. But if a raid is running, it just runs... I've never had "selective" glitches like that - it either works or fails bad.

    I suspect hardware, or cabling.
  • Park_7677 Missouri Member
    edited February 2004
    I'm going to run Hitachi DFT once CHKDSK is done. Thx Prime.
    Here is a log when I tested my C partition over the past couple of days...
    Checking file system on C:
    The type of the file system is NTFS.
    Volume label is # system.

    One of your disks needs to be checked for consistency. You
    may cancel the disk check, but it is strongly recommended
    that you continue.
    Windows will now check the disk.
    The resident attribute for attribute of type 0x1c3e498 and instance
    tag 0xb500 is incorrect. The attribute has value of length 0x1c3e498,
    and offset 0xb500. The attribute length is 0x138.
    Deleting corrupt attribute record (144, ǃ࢝)
    from file record segment 10378.
    The file name index present bit in file 0x288a should not be set.
    Correcting a minor error in file 10378.
    Cleaning up minor inconsistencies on the drive.
    CHKDSK is recovering lost files.
    Cleaning up 2 unused index entries from index $SII of file 0x9.
    Cleaning up 2 unused index entries from index $SDH of file 0x9.
    Cleaning up 2 unused security descriptors.
    Inserting data attribute into file 10378.
    CHKDSK is verifying file data (stage 4 of 5)...
    File data verification completed.
    CHKDSK is verifying free space (stage 5 of 5)...
    Free space verification is complete.
    Correcting errors in the master file table's (MFT) BITMAP attribute.
    Windows has made corrections to the file system.

    26218048 KB total disk space.
    6489140 KB in 26781 files.
    7992 KB in 1901 indexes.
    0 KB in bad sectors.
    100092 KB in use by the system.
    65536 KB occupied by the log file.
    19620824 KB available on disk.

    4096 bytes in each allocation unit.
    6554512 total allocation units on disk.
    4905206 allocation units available on disk.

    Internal Info:
    04 82 00 00 15 70 00 00 d8 97 00 00 00 00 00 00 .....p..........
    a7 00 00 00 00 00 00 00 b5 00 00 00 00 00 00 00 ................
    aa 0b ae 00 00 00 00 00 42 ea 2d 0a 00 00 00 00 ........B.-.....
    6c f7 b9 00 00 00 00 00 52 df 7e 6d 00 00 00 00 l.......R.~m....
    26 25 ea 65 00 00 00 00 42 42 86 e5 00 00 00 00 &%.e....BB......
    99 9e 36 00 00 00 00 00 9d 68 00 00 00 00 00 00 ..6......h......
    00 d0 10 8c 01 00 00 00 6d 07 00 00 00 00 00 00 ........m.......

    Windows has finished checking your disk.
    Please wait while your computer restarts.
    Checking file system on C:
    The type of the file system is NTFS.
    Volume label is # system.

    A disk check has been scheduled.
    Windows will now check the disk.
    Cleaning up minor inconsistencies on the drive.
    Cleaning up 6 unused index entries from index $SII of file 0x9.
    Cleaning up 6 unused index entries from index $SDH of file 0x9.
    Cleaning up 6 unused security descriptors.
    CHKDSK is verifying file data (stage 4 of 5)...
    File data verification completed.
    CHKDSK is verifying free space (stage 5 of 5)...
    Free space verification is complete.

    26218048 KB total disk space.
    6963772 KB in 25983 files.
    8328 KB in 1885 indexes.
    0 KB in bad sectors.
    100092 KB in use by the system.
    65536 KB occupied by the log file.
    19145856 KB available on disk.

    4096 bytes in each allocation unit.
    6554512 total allocation units on disk.
    4786464 allocation units available on disk.

    Internal Info:
    04 82 00 00 e7 6c 00 00 67 8f 00 00 00 00 00 00 .....l..g.......
    b2 00 00 00 00 00 00 00 ba 00 00 00 00 00 00 00 ................
    8e bd 9f 00 00 00 00 00 fc 26 0a 0a 00 00 00 00 .........&......
    50 a9 ab 00 00 00 00 00 9c 1e 7a 75 00 00 00 00 P.........zu....
    3c ce ef 64 00 00 00 00 ae 1a 36 ec 00 00 00 00 <..d......6.....
    99 9e 36 00 00 00 00 00 7f 65 00 00 00 00 00 00 ..6......e......
    00 f0 08 a9 01 00 00 00 5d 07 00 00 00 00 00 00 ........].......

    Windows has finished checking your disk.
    Please wait while your computer restarts.
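A quick cross-check of the two CHKDSK summaries above, as a small Python sketch. It only uses numbers copied from the logs; the helper names (`units_to_kb`, `summary_is_consistent`) are made up for illustration and are not part of any tool.

```python
# Figures copied from the two CHKDSK summaries in the post above.
CLUSTER_BYTES = 4096  # "4096 bytes in each allocation unit"

RUNS = [
    {"total_kb": 26218048, "free_kb": 19620824,
     "total_units": 6554512, "free_units": 4905206, "files": 26781},
    {"total_kb": 26218048, "free_kb": 19145856,
     "total_units": 6554512, "free_units": 4786464, "files": 25983},
]


def units_to_kb(units, cluster_bytes=CLUSTER_BYTES):
    """Convert a count of allocation units to KB."""
    return units * cluster_bytes // 1024


def summary_is_consistent(run):
    """True when the allocation-unit counts agree with the KB figures."""
    return (units_to_kb(run["total_units"]) == run["total_kb"]
            and units_to_kb(run["free_units"]) == run["free_kb"])


# Change in the number of files between the first and second run.
file_delta = RUNS[1]["files"] - RUNS[0]["files"]

if __name__ == "__main__":
    for index, run in enumerate(RUNS, start=1):
        print("run", index, "consistent:", summary_is_consistent(run))
    print("file count change:", file_delta)
```

Both summaries are internally consistent (allocation units times 4 KB matches the KB totals), and the file count is 798 lower in the second run. That by itself doesn't prove anything was lost, since normal use changes file counts too.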
  • primesuspect Beepin n' Boopin Detroit, MI Icrontian
    edited February 2004
    You have a failing drive, or a loose or bad cable.

    One thing I've noticed is that the SATA headers on mobos nowadays are pretty cheap and crappy. I accidentally snapped one right off with not much effort on an asus board just a few weeks ago :mad: ... Check the headers themselves to see if they are marginal or even maybe loose from the board itself.
  • TheSmJ
    edited February 2004
    I had issues just like you're having now with my TVPC when OCing the FSB too high. The PCI speed isn't locked on the motherboard, which in turn forced a hard drive to work faster than it was capable of.

    The affected drive had all sorts of I/O errors. Entire 100GB directories would disappear, and/or files within directories (sometimes all of them, other times files with names starting with H on down to Z, while files with names starting with A through G were still there). Furthermore, when new data was written to the drive, it would always read back as corrupt (movie files would have terrible errors, and EXE files would simply not function).

    While the disappearing directories and files would often re-appear after running chkdsk, the files written to the drive which became corrupt would be damaged beyond repair. I was lucky that the drive affected wasn't the system drive (just file archive) or else the OS would have surely needed to be re-installed again, and again, and again.

    After talking with Prime over AIM, I thought of lowering the FSB. This fixed the problem, and the drives have been entirely stable to this day.
  • Geeky1 University of the Pacific (Stockton, CA, USA)
    edited February 2004
    *in his best primesuspect voice*
    Have you run Memtest yet?


    :p:D
  • Park_7677 Missouri Member
    edited February 2004
    Well the folder is back.. don't know what's still in there or not.. but the dumped bad indexes are only 4KB.. so I don't think it's anything tooo bad.

    @SmJ:

    nForce 2 PCI is locked.. isn't it? I feel bad I have to question myself on that one... :nudge: I'm 99% sure it is. My FSB is only @ 200 anyway.. so I don't think that would be it.. who knows though..

    @Geeky1 w/ scary voice:

    Yeah.. memtest passes 12hrs @ 200FSB.

    --- here goes the Hitachi tool testing... :ninja:
  • Geeky1 University of the Pacific (Stockton, CA, USA)
    edited February 2004
    Yes, the nForce2 has fixed PCI/AGP speeds
  • Mt_Goat Head Cheezy Knob Pflugerville (north of Austin) Icrontian
    edited February 2004
    I'm with Prime on it being a loose cable #1 and failing drive #2, in order of likelihood.
  • Straight_Man Geeky, in my own way Naples, FL Icrontian
    edited February 2004
    Geeky1 wrote:
    Yes, the nForce2 has fixed PCI/AGP speeds

    Geeky AND Park, is the RAID chip embedded in the SB??? If so, RAID can be running at a speed based on a ratio to FSB and not locked.... while this would NOT affect a RAID card, embedded RAID could be affected.

    I KNOW, Shouldn't, but COULD.

    John D.
  • Park_7677 Missouri Member
    edited February 2004
    Hitachi DFT passed both drives in "Advanced Testing" --- so they're fine. w00t.

    Now, the cables being bad or loose is the best we've got to go on. The cables are over a year old.. I got them with an MSI KT4 Ultra-SR (SR = SerialRAID). I don't think that would matter in itself--but it doesn't rule out the possibility of them just being bad. I'll double check connections ASAP.

    Today we're having "The worst winter storm we've had in a decade".. the roads are already covered in ice.. so there's no way of getting new SATA cables anytime soon. So I can't very well replace for testing.

    John -- It's not embedded, but it is onboard. I don't think it should be running at any different speed than what my PCI bus is running. Anyone know if I could check the clock of the controller somehow?
  • Park_7677 Missouri Member
    edited February 2004
    Damn.. something else.

    50% of my compressed files are corrupt now.. ZIP/RAR/GZip.. the whole list. I'm not getting CRC32 mismatches.. I'm just getting a plain "File corrupt" message when I test/extract an archive. :rant: This also applies to setups that have compressed files in them.. such as 3DMark03. I've downloaded it ~5 times now (from different locations), and during the install a CAB is corrupt.

    Last night, I stayed up and backed up anything I needed. If needed, I'm ready to nuke the array/format/whatever to the HDDs. I got SuSE 9.0 LIVE so I can still use the internet (just no Win32 apps) and not have to use the HDDs for a Win install.

    Should I try different BIOS versions? I've got the newest NF7-S one.. with the newest SI BIOS... anything wrong with those?
  • edited February 2004
    Could also try lowering the FSB to stock for a while, as you have nothing more to lose...
  • Park_7677 Missouri Member
    edited February 2004
    Heh.. JUST did 1 minute ago. It's down to 166FSB.

    Now I need something that will compress and test archives non-stop.. like a benchmark... anyone know of one?
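For anyone after the same kind of non-stop compress-and-test loop, here is a minimal Python sketch, not a polished benchmark. It writes a gzip archive of random data into a directory on the drive under test, reads it back, and compares hashes. The function names (`stress_round`, `run_stress`) and the default sizes are assumptions for illustration. One caveat: a read straight after a write may be served from the OS cache rather than the disk, so a pass here is weaker evidence than a failure.

```python
import gzip
import hashlib
import os
import tempfile
import zlib


def stress_round(directory, size_bytes=1 << 20):
    """Write one gzip archive into `directory`, read it back, compare hashes.

    Returns True when the decompressed data matches what was written.
    """
    payload = os.urandom(size_bytes)
    expected = hashlib.sha256(payload).hexdigest()

    fd, path = tempfile.mkstemp(suffix=".gz", dir=directory)
    os.close(fd)
    try:
        with open(path, "wb") as raw:
            with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
                gz.write(payload)
            # Ask the OS to push the finished archive out to the device.
            raw.flush()
            os.fsync(raw.fileno())
        with gzip.open(path, "rb") as gz:
            actual = hashlib.sha256(gz.read()).hexdigest()
    except (OSError, EOFError, zlib.error):
        # A damaged archive usually fails to decompress at all.
        return False
    finally:
        os.remove(path)
    return actual == expected


def run_stress(directory, rounds=100, size_bytes=1 << 20):
    """Run several rounds and return how many of them failed."""
    failures = 0
    for _ in range(rounds):
        if not stress_round(directory, size_bytes):
            failures += 1
    return failures


if __name__ == "__main__":
    # Point this at a folder on the array being tested.
    print("failed rounds:", run_stress(tempfile.gettempdir(), rounds=10))
```

Using data sets larger than installed RAM, or re-checking after a reboot, makes it more likely the reads actually come from the drives.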
  • Park_7677 Missouri Member
    edited February 2004
    I may have found something...

    I made a GZip with like 5,000 small files. It's 40MB. Tested it on C: (where it was made) and no errors. I copy it to E: (just another partition on the RAID).. no errors. I check the drive for errors with Partition Magic.. 0 returned. I defrag with Diskeeper. Check disk again, 0 errors. BUT the GZip is corrupt. :thumbsdow

    I repeat the process, but have uninstalled Diskeeper and just used the Windows Defragger. GZip still works.. :cheers:

    Yes, I have defragged the RAID before.. which allows the suspicion that Diskeeper is behind some or all of my corruption. I will continue to do the procedure above on gigs and gigs of data... see if I can get it to choke just using the Windows defrag.
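The before/after procedure described above can be made repeatable over a whole folder with a short Python sketch. It records a SHA-256 for every file, and after the step under suspicion (a defrag, a copy to another partition) it lists the files whose contents changed or vanished. The function names (`file_digest`, `snapshot`, `changed_files`) are assumptions for illustration, and a re-read soon after a write may come from the OS cache rather than the disk.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file path (relative to `root`) to its SHA-256 digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = file_digest(full)
    return result


def changed_files(before, after):
    """Return sorted paths whose digest differs or that are now missing."""
    return sorted(path for path, digest in before.items()
                  if after.get(path) != digest)
```

Typical use: take `before = snapshot(folder)`, run the defrag, then check `changed_files(before, snapshot(folder))`. An empty list means every file still reads back identically.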

    Diskeeper = :shakehead
  • edited February 2004
    What exactly is Diskeeper? Does it just defrag, or is it some sort of all-around hard drive manager?

    If these errors only happen shortly after messing with Diskeeper, then I'd say it's the culprit for sure.
  • Geeky1 University of the Pacific (Stockton, CA, USA)
    edited February 2004
    iirc, partitioning a raid array defeats a lot of the purpose of having the raid array in the first place.
  • Park_7677 Missouri Member
    edited February 2004
    Geeky1 wrote:
    iirc, partitioning a raid array defeats a lot of the purpose of having the raid array in the first place.

    How so? I didn't know that.. nor do I notice anything (besides the corruption ;)).

    Got a link or anything about it?

    Here are some ATTOs:
  • Geeky1 University of the Pacific (Stockton, CA, USA)
    edited February 2004
    I've heard that partitioning a RAID array decreases performance. But I may be wrong. Have to ask Tex or something.
  • primesuspect Beepin n' Boopin Detroit, MI Icrontian
    edited February 2004
    It's weird tho, MS Defrag == same exact engine as diskeeper .. (MS licensed the tech from Executive Software for their defrag)...
  • Park_7677 Missouri Member
    edited February 2004
    I don't know.. but I swear it's Diskeeper killing my archives. I don't know about files & folders.. but we'll see.

    This is my 2nd time using Diskeeper and then trashing it after it killed files. The last time it turned my 40GB into RAW.. and no return. Maybe I should never use it again...
  • Park_7677 Missouri Member
    edited February 2004
    I'm going to change the P2P discard timing in the BIOS. It's set to 30 (us maybe?) and on Abit Forums people say to try 1ms. I don't know what this is.. but I've got nothing important to corrupt.

    I'll report back here...
  • Park_7677 Missouri Member
    edited February 2004
    4. Fixed SATA RAID-0 data corruption issue by adding a new option "EXT-P2P's Discard Time" in "integrated Peripherals". The default setting is "30 us"; which is recommended by NVidia. In case the problem is still there, try "1 ms" please.

    That's from the release notes for the ABIT NF7-S 2.0 BIOS v14. I've set it to 1ms.. and it works like a charm. I've finally been able to install 3DMark03 without a CRC32 mismatch error, and nothing else has died on me.

    I believe that was the problem. I'm going to try Diskeeper tomorrow to see if they can clear their name from my blacklist.. we'll see. Thanks everyone for helping out :smiles:

    Now, if anyone has problems with SATA RAID-0 .. we know to check EXT-P2P Discard Time in the BIOS. :mullet: The answer wasn't too far away this whole time ;D
  • TheSmJ
    edited February 2004
    But does it tank performance?
  • Park_7677 Missouri Member
    edited February 2004
    TheSmJ wrote:
    But does it tank performance?

    Nope, nothing noticeable. I thought maybe it did, so I benched it. Same scores, almost exact.

    Also, Diskeeper works fine again.

    "What exactly is Diskeeper?" -- It's a defragmenter with options and information. You can defrag for max performance or max free space. It also tells you everything you could ever want to know about your fragmented hard drive -- volume frags, file frags, directory frags, MFT frags... it's all there. They also claim it does a better job than MS Defrag and does it faster. It also has neat pictures to show performance increases from a defrag (below).
  • TheSmJ
    edited February 2004
    How does it compare to O&O Defrag?
  • Park_7677 Missouri Member
    edited February 2004
    TheSmJ wrote:
    How does it compare to O&O Defrag?

    I don't know about performance wise, but it looks like they're very similar GUI and information wise.