Raid5 Problem

edited April 2005 in Hardware
Hi,
I had a Raid 5 setup with four 200GB Maxtor drives. Last week one of them broke and I'm still waiting for a replacement. Yesterday the computer got really slow and I ran chkdsk. Because it took so long I went to bed, and when I looked at it this morning it was stuck at "checking free space" at about 35%. I hit the reset switch and the Promise S-150 SX4 controller reported the array as off-line. Now there is an array defined, but it reports two missing drives and one as free. I don't think any data was lost on the drives, but I need the controller to recognize the 3rd drive. What can I do? There was a lot of data on it and it really would suck to lose it all.
Thx for your help!

Jan

Comments

  • Tex Dallas/Ft. Worth
    edited March 2005
    I wouldn't touch a thing till you get a replacement. Unless you were running raid-5 with three drives and a hot spare as the fourth (which would explain the slowness, as it was rebuilding the array onto the hot spare, which would be very slow and take a long time), it needs a new drive to function properly and to rebuild the redundancy etc.

    Rebooting while it's running in that mode was very bad. I personally would never run raid-5 on a Promise controller, as they are not known for their robust raid-5 solutions.

    Go email Promise tech support RIGHT NOW and ask how to proceed. There are very few of those controllers in the field and probably fewer running raid-5, and if you want GOOD advice and not guesses, considering that the data is important, go to Promise for the answers on how to proceed.

    Raid-5 systems I have messed with (I am not a fan btw...) require the failed drive to be replaced to retain the array. If you were running a hot spare and those were big full IDE drives it could take 24 hours or more to rebuild the raid-5 when you replace the drive. Until it's replaced it's running in a waaaaayyyyy reduced performance mode. State of the art SCSI is a different picture. The drives are smaller and rebuild faster. As far as I am concerned you should not use raid-5 without an extra hot-swap spare, whether with SCSI, IDE or SATA. It's just wrong on many levels.

    Tex
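Tex's point that big drives rebuild slowly comes down to simple arithmetic: the controller has to rewrite the whole replacement drive, so the time is roughly drive size divided by sustained rebuild rate. Here is a rough sketch in Python. The function name and the rebuild rates are assumptions for illustration only; nothing in this thread states the actual rate of the Promise controller.

```python
def rebuild_hours(drive_gb: float, rebuild_mb_per_s: float) -> float:
    """Estimate hours to rebuild one drive at a sustained rate.

    Uses decimal units (1 GB = 1000 MB), as drive vendors do.
    """
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    seconds = drive_gb * 1000 / rebuild_mb_per_s
    return seconds / 3600


# Hypothetical rates: a slow rebuild on a busy array vs. a fast idle one.
for rate in (2.5, 10.0, 50.0):
    print(f"200 GB at {rate} MB/s: about {rebuild_hours(200, rate):.1f} hours")
```

At a sustained rate of only a few MB/s, a 200GB drive lands in the "24 hours or more" range Tex describes, which is why smaller drives rebuild so much faster.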
  • edited March 2005
    The problem is that I was running it with 3 drives in a critical status, because the replacement isn't here yet. There was no rebuilding going on at that time.
    I will try to contact Promise then.
  • edited April 2005
    Ok, I got the new drive today, put it in, and of course everything is gone from the array. So I reformatted and installed XP Pro. While restarting the first time the array went into critical mode again! I am getting really pissed off with this controller now, but what can I do? Get one from another company?
    Tex, you can probably give me the best advice on what to do now. Basically I have these four 200GB Maxtor SATA drives and a ****ty Promise controller. Since you're saying raid 5 is not the best choice, what is? And what controller should I choose?

    Thank you a lot!
    Jan
  • Garg Purveyor of Lincoln Nightmares Icrontian
    edited April 2005
    Wait to hear from Tex, but I would get a controller from another manufacturer. I've had bad experiences with Promise, and I know I'm not the only one.

    While you're at it, if you have the cash (these aren't cheap), this would be a good time to get a nice caching RAID controller to improve performance.
  • Shorty Manchester, UK Icrontian
    edited April 2005
    Heed Tex's words with regard to IDE RAID-5. Just don't do it. I'm surprised your PC wasn't running like a dog. RAID-5 on IDE without a dedicated XOR processor & cache memory runs drain slow :(

    I'm a fan of RAID5 if it's done right. The hot spare idea is essential. Waiting for another drive to arrive is bad and seriously runs the risk of the data loss you have experienced. I work in enterprise. If a RAID5 loses a disk, we have a hot spare and a cupboard full of spares!

    What can I suggest/bring to the table? Ditch the IDEs, get some SATA drives and a 3ware controller card with a dedicated processor & cache memory. It offloads the parity calculations from the CPU onto the card. It also brings "hot swap" capability. Just make sure you have a local dealer with available kit in case one spins off its coil... or a spare in a drawer!

    Incidentally, the cache memory also makes a monster difference. If your budget allows it, get SCSI! RAID-5 will still crush it hard but SCSI can take the strain :) I wish I could afford it!

    I'm sure naysayers will read this and say "oh that's rubbish, I run RAID-5 on 4 laptop drives blah blah blah". Maybe you do, but when the thread starter's situation hits you, it's not pretty.
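For anyone wondering what the "XOR processor" and "parity calculations" in Shorty's post actually do, here is a minimal sketch in Python. It is only an illustration of the XOR idea, not how the Promise or 3ware firmware is written; real RAID-5 also rotates the parity block across the drives, which is left out here. The block contents and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-drive array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# One drive dies (the one holding b"BBBB"). XOR-ing the surviving blocks
# with the parity block gives the missing block back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

This also shows why a degraded array is slow (every read of the missing block needs all the surviving drives plus the XOR work) and why an array that reports two missing drives cannot be reconstructed from parity alone.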
  • edited April 2005
    Thanks for the input Shorty, but I already have 4 SATA drives.
    I looked at the 3ware Escalade 8506-4LP controller but it's like 300 Euros and I'm not really sure if I want to spend that kind of money yet...
    I set up a Raid 0+1 now because I really need the redundancy, and I'll check my budget for the 3ware :)
  • Tex Dallas/Ft. Worth
    edited April 2005
    JanKir wrote:
    Thanks for the input Shorty, but I already have 4 SATA drives.
    I looked at the 3ware Escalade 8506-4LP controller but it's like 300 Euros and I'm not really sure if I want to spend that kind of money yet...
    I set up a Raid 0+1 now because I really need the redundancy, and I'll check my budget for the 3ware :)

    Raid 0+1, or even a raid-0 you back up to non-raided drives, would be my options.

    Remember, even redundant raid like raid-1 or raid 0+1 or raid-5 does not mean you can escape from backing up (as you just learned). It only protects from a complete drive failure. A controller failure... or a corrupted FS... or you delete crap by mistake or... Well geeez, the list goes on forever...

    None of those are protected against by ANY level of raid.

    You need to develop a BACKUP routine. Run raid-0 if you want, but BACK IT UP. Backing up to a non-raided drive, or even to another raid-0 array on a different raid controller, is the safest bet.

    My servers in the house here back themselves up to other servers across my gigabit LAN nightly, so my data always exists on at LEAST two separate computers, and crap like our Outlook mail folders back themselves up several times a day to multiple machines, as we would both sheet if we lost emails.

    Run a pair in raid-0 for the performance and BACK IT UP often to non-raided drives, is my recommendation.

    I am not just talking out my butt. I do this for a living and this is what I do to protect both my own data and my customers' data.

    Tex
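A backup routine like the one Tex describes can start very small. Below is a minimal sketch in Python, assuming the backup target is just another folder on a non-raided drive or on another machine's share. The function name `backup_changed` and the paths in the usage note are hypothetical, and this is not what Tex's servers actually run. It copies a file only when the target is missing or older, so repeated nightly runs move only what changed.

```python
import shutil
from pathlib import Path


def backup_changed(source: Path, dest: Path) -> list:
    """Copy files from source into dest when missing or older there.

    dest must not be located inside source. Returns the copied targets.
    """
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        target = dest / src.relative_to(source)
        # Skip files whose backup copy is already at least as new.
        if target.exists() and target.stat().st_mtime >= src.stat().st_mtime:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 keeps the modification time
        copied.append(target)
    return copied
```

Usage would look something like `backup_changed(Path("D:/data"), Path("E:/backup/data"))`, run from a nightly scheduled task. Note that it never deletes anything from the backup, so a file removed by mistake on the array still exists in the copy.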
  • Shorty Manchester, UK Icrontian
    edited April 2005
    ^^^^^^^^^^^^ Speaks the truth.

    I'll echo every word of that. Backup, backup and backup again.

    Get Acronis True Image. It supports a full backup & incremental (point in time) appended backups. I use that to back up my server network at home. Each machine runs it and backs up over the network to another. I know it's a pain when you have masses of data to shift around, but when you do a big buy of kit, plan for it :)

    The 3ware controllers are expensive but well worth it. They take the load off the CPU, whereas software RAID (especially RAID-5) is like jelly. That's why I don't use it at home (yet anyway). Investment is high when moving to a serious storage solution but once it's in place, you can relax a little :)
  • edited April 2005
    First of all, I think my problem is one of the SATA cables... The 0+1 array that I set up yesterday went into critical again after a couple of reboots, and it's always the same channel that just won't see the drive. It's not the channel or the drive, so I will buy a new set of cables tomorrow. I have a spare 80GB Seagate drive so I will use that as a backup, but I think a Raid 0 with 4 drives will be too risky, and a raid 0+1 will lose too much storage (50%), so will the performance be good on a 3ware and raid5? If it is, I might as well get the 3ware and leave the storage like that for a couple of years (maybe change the drives sometime for larger ones) ;)
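The storage trade-off Jan is weighing can be worked out directly. Here is a small sketch in Python using the textbook capacity rules for each level, assuming identical drives; the function name `usable_gb` is made up for the example.

```python
def usable_gb(level: str, drives: int, size_gb: int) -> int:
    """Usable capacity of an array of identical drives."""
    if level == "raid0":
        return drives * size_gb            # striping only, no redundancy
    if level == "raid0+1":
        if drives < 4 or drives % 2:
            raise ValueError("raid 0+1 needs an even number of drives, 4 or more")
        return drives // 2 * size_gb       # half the drives hold the mirror
    if level == "raid5":
        if drives < 3:
            raise ValueError("raid 5 needs at least 3 drives")
        return (drives - 1) * size_gb      # one drive's worth goes to parity
    raise ValueError(f"unknown level: {level}")


for level in ("raid0", "raid0+1", "raid5"):
    print(level, usable_gb(level, 4, 200), "GB")
# raid0 800 GB
# raid0+1 400 GB
# raid5 600 GB
```

So with four 200GB drives, raid 0+1 gives up 50% of the space as Jan says, while raid 5 gives up 25% and still survives one drive failure.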
  • Tex Dallas/Ft. Worth
    edited April 2005
    When you get your raid-5 back up again post some benchmarks for us.

    Tex
  • edited April 2005
    With the Promise controller?
    I can remember that the writes were pretty low, not higher than 40,000 in ATTO no matter what latency I tried, and the reads were about 80,000 max. In general not very fast, but acceptable to me if I get the safety of raid 5. Can you estimate the performance of the 3ware controller in a Raid5 with the 7200 rpm 200GB Maxtor drives?
  • Tex Dallas/Ft. Worth
    edited April 2005
    I think 40,000 for writes was awesome with that setup.
  • edited April 2005
    Ok I set it up to a Raid 5 array again, and here is an Atto Screen.
  • Tex Dallas/Ft. Worth
    edited April 2005
    fine atto for raid-5
  • edited April 2005
    Wow, Tex said my Atto is good!
    I guess my array works now; it has been stable for the last couple of days, so I will keep this controller for now. I will get a PCI-Express one when I do a complete rebuild, once I have saved up some money and my current system won't run everything anymore.
  • Gobbles Ventura California
    edited April 2005
    We use 3ware and we use Acronis.

    3ware pwns. I have the same controller you're talking about. I also use a lot, and I mean a CASE of 3ware 8006-2LP controllers a week.

    Acronis pwns. I've restored partitions that were 16 gig to 8 gig partitions and vice versa; Acronis does not even flinch while doing it. We use the enterprise version.
  • edited April 2005
    Hi Gobbles!
    So you are using the 3ware in a Raid 5 config with 4 Drives?
    Can you please post up an ATTO screen for me?

    Thanks a lot.
  • Tex Dallas/Ft. Worth
    edited April 2005
    It really doesn't matter what his ATTO is. The 3ware is a whole different level of raid controller. It's one of the few IDE raid controllers that are used in corporate America.

    They are much more robust and dependable, and that's really what raid-5 is all about.

    I wouldn't touch raid-5 on a Promise controller for all the tea in China. I would only even consider it on a 3ware for IDE raid-5.

    Consider the Promise a low-end toy for the enthusiast; the 3ware is a stable, mature, robust answer designed for mission critical applications and used by the business world.

    How can I put this?

    You own the equivalent of a hopped up Kia. Gobbles uses a Mercedes.

    Tex
  • Flintstone SE Florida
    edited April 2005
    Then, when we start talking serious, there are these:

    http://www.lsilogic.com/products/megaraid/index.html

    A little more money, but they're the real thing!!

    Flint
  • Kwitko Sheriff of Banning (Retired) By the thing near the stuff Icrontian
    edited April 2005
    After talking with Tex, I've become a big fan of RAID 10, especially for a DB server. I'm going to bench our new server with Atto. Ultra320 SCSI RAID 10 should put up some nice numbers.

    As for backups, we're using a Dell LTO2 tape unit (200GB native) with Veritas Backup Exec. I was thinking of getting an autoloader, but I need to take the previous day's backup offsite.
  • Flintstone SE Florida
    edited April 2005
    Kwitko,
    Depending on the controller and the amount of ram on the controller, as Tex reminded me, Atto's I/O test won't even touch the disks. It sits in the onboard ram on my controller and there really is no disk activity in the bench at all. When I had a slower controller with less ram on a slower bus, the same thing occurred. My new controller is on a PCI-X bus and I still can't bury the controller in I/O's. YET!

    Flint
  • Kwitko Sheriff of Banning (Retired) By the thing near the stuff Icrontian
    edited April 2005
    It's a Dell PERC controller with 256MB RAM. Hopefully that will contribute to good I/O scores.
  • Tex Dallas/Ft. Worth
    edited April 2005
    As Flintstone mentioned, with 256MB cache all you're testing really is the CPU/memory subsystem on the controller.

    The old Elite 1600s would hit about 140,000 on reads and 80,000 on writes with one or 100 drives in raid-0 when testing with ATTO. That's just as fast as you could get stuff out of the cache. Flintstone just switched to the LSI 320-2x like I run and it hits like 250,000 on writes and 300,000 on reads using ATTO because the onboard CPU and cache are so much faster. ATTO never kisses the disks at all. We have 512MB of PC3200 DDR and a much faster onboard CPU.

    Your Perc4 should be in between the old Elite and our 320-2x. They have made several Perc4 versions and the later ones are much faster. I would bet that with 256MB cache it's a 320-2. And it's still using the older SDRAM cache and not DDR like ours. You may have a newer model but I bet that's what it is.

    I bet your ATTOs hit like 200,000 on reads and 150,000 on writes due to the slower CPU/memory. It's faster than the older Perc3s but not as fast as the new 320-2x's or Perc4e's.

    Tex
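The cache effect Tex and Flintstone describe applies to any benchmark: if the total amount written fits inside the controller's RAM, the number you get is mostly cache speed. Here is a rough sketch in Python of a sequential write test where you pick the total size yourself. This is not ATTO and does not reproduce its numbers; the function name and defaults are assumptions for illustration, and a write-back controller cache or the OS can still absorb part of the load even with the `fsync` call.

```python
import os
import tempfile
import time


def write_throughput_mb_s(total_mb: int, block_kb: int = 64, directory=None) -> float:
    """Time a sequential write of total_mb megabytes and return MB/s."""
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed
```

Comparing a run at 32MB with a run several times larger than the controller cache (for example 1024MB against a 256MB cache) should show how much of the small-test score comes from the cache rather than the disks.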
  • Kwitko Sheriff of Banning (Retired) By the thing near the stuff Icrontian
    edited April 2005
    How should I set up the benchmark test?
  • Tex Dallas/Ft. Worth
    edited April 2005
    If you're using ATTO, change only the total length to 32MB.

    Tex
  • Kwitko Sheriff of Banning (Retired) By the thing near the stuff Icrontian
    edited April 2005
    Benchies! Tex, the controller is a PERC 4e.
  • Tex Dallas/Ft. Worth
    edited April 2005
    What's it in? I have a Perc4e sitting on the desk in front of me. Just don't have anything I can run it in. My DFI with PCI-e slots won't boot with it in. So many non-server boards right now are really only geared to support video cards in their PCI-e slots.

    I'm surprised it's as fast as it is. It should have been slightly faster than a 320-2x based on the memory subsystem and CPU. Do you have open PCI-e slots in that Dell server? If I can't find a board to run this thing in I would like to at least test it to know it works before I sell it.

    Tex
  • Kwitko Sheriff of Banning (Retired) By the thing near the stuff Icrontian
    edited April 2005
    I don't know if it has PCI-e, I'll have to check tomorrow. I'll also check the controller's RAM.
  • Tex Dallas/Ft. Worth
    edited April 2005
    The Perc4e is PCI-e, so it's got at least one (grin). That's why it's faster than mine.

    The RAM is 256MB of PC2700 CAS2. As I said... I have one sitting on my desk two feet from my face.

    Tex