Beyond RAID

“RAID’s various designs all involve two key design goals: increased data reliability or increased input/output performance.”

No. RAID 0 is literally not a redundant array. RAID = Redundant Array of Independent Disks… RAID 0 is an array, but with no redundancy and therefore no increased data reliability. In fact, RAID 0 is far less reliable, because as soon as any one drive fails, you lose the entire array.

RAID 0 - no hard drives lost to redundancy - the drives are just “joined” into an array.
RAID 1 - loses (normally) 1 hard drive for every pair.
  • No pair can lose both of its drives, or that data is gone.
RAID 10 - both RAID 1 and 0 - no mirrored pair can be lost entirely, or the whole array is lost.
RAID 5 loses, at MINIMUM, 1 hard drive for every array.
  • Rebuilds take forever, because every hard drive must be read in order to recover the information lost from the 1 failed drive.
  • No two hard drives can be lost, or all information is lost.
  • Worst possible performance for writing.
RAID 5 loses, at MAXIMUM, 1 hard drive for every three.
  • In this situation it is possible to lose multiple hard drives at once without losing all information, as long as no two hard drives from the same triplet are lost.
  • Rebuilds are much faster.
(The usable-capacity math for these layouts is sketched below.)
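A minimal Python sketch of that capacity math, assuming identical drives and the pair/triplet layouts described above. The RAID5x3 label for the “one parity drive per three” case is my own shorthand, not standard terminology:

```python
# Minimal sketch of the usable-capacity math above; assumes identical drives.

def usable_gb(level: str, drives: int, size_gb: int) -> int:
    """Usable capacity in GB for the layouts discussed in this thread."""
    if level == "RAID0":
        return drives * size_gb              # everything usable, zero redundancy
    if level in ("RAID1", "RAID10"):
        return (drives // 2) * size_gb       # one copy of each mirrored pair
    if level == "RAID5":
        return (drives - 1) * size_gb        # one drive's worth of parity, total
    if level == "RAID5x3":
        return (drives // 3) * 2 * size_gb   # 3-drive arrays: lose 1 of every 3
    raise ValueError(f"unknown level: {level}")

# Six 500 GB drives:
for level in ("RAID0", "RAID1", "RAID10", "RAID5", "RAID5x3"):
    print(level, usable_gb(level, 6, 500), "GB usable")
# RAID0 3000, RAID1/RAID10 1500, RAID5 2500, RAID5x3 2000
```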

Most other RAID levels are variants on this, although some are quite creative. I’ve always found hardware RAID faster in side-by-side comparisons.

mmm…

Thanks Jeff for the hardware porn. Nothing like a little hardware review to perk up a morning full of document authoring.

Forget RAID on the desktop.
Go with RAIPCs (Redundant Array of Inexpensive PCs).

With all of this fawning over RAID technology, everyone forgets that everything else is a serious weak link: the motherboard, the power supply, the DRAM, the controller cards, the OS, etc. In all the years I’ve been working with PCs I’ve had exactly 2 hard drives die on me. Compare that to the dozens of power supplies and motherboards that have died on me in the same time period, and you see where I’m coming from.

Take my advice for SOHOs: forget all of the massively expensive RAIDs with the 48-drive house furnace. Instead, buy 3 identical cheap PCs. Configure them like this:

#1. This is your on-site server.

#2. This is your on-site hot-backup. Set up everything to be exactly like #1 except the host name/IP address. Create an automatic data backup schedule (daily or even hourly; see the sketch below).

#3. This is your off-site backup. Just like #2 except its automatic backup schedule will probably be nightly or perhaps every other day.

If #1 fails for any reason, switch to #2. (Simply change the host name/IP address and reboot.) You’ll lose only the data that hadn’t made it to #2 yet, and there’s a good chance it’ll be non-critical data anyway.

If #2 or #3 fails for any reason, get it fixed/replaced ASAP.
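A minimal sketch of the #2 hot-backup pull, assuming rsync over ssh is available on both boxes. The hostname server1 and the /srv/data/ path are placeholders of mine, not anything from the post:

```python
# Hypothetical sketch: run on box #2 (hourly, via cron) to mirror box #1's
# data directory. Hostname and path below are placeholders.
import subprocess

def pull_backup(src_host: str, path: str = "/srv/data/") -> None:
    """Mirror `path` from the primary onto this box.

    -a preserves permissions/ownership/timestamps, -z compresses in
    transit, --delete keeps this copy an exact mirror of the source.
    """
    subprocess.run(
        ["rsync", "-az", "--delete", f"{src_host}:{path}", path],
        check=True,
    )

if __name__ == "__main__":
    pull_backup("server1")  # placeholder hostname for box #1
```

Box #3 would run the same thing on its nightly (or every-other-day) schedule.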

I use a mirrored RAID (RAID 1) on my desktop, mostly because several of the VMs are mission-critical and several people log into them remotely. (Who knew a Unix command prompt could be useful to a programmer?)

The only difficulty is that after a few weeks to months, when Windows inevitably locks up, I have to dirty-boot the system (incidentally, the VMs still work flawlessly - go figure!), and the RAID rebuild really slows things down for 4 or 5 hours.

p.s. My post script failed now that “orange” isn’t the captcha key.

Jeff, your earlier piece on RAID-on-the-desktop that you linked to is pretty silly. It would be insane to use RAID 0 with any data you cared about; this is common knowledge of the highest order. You run RAID 1 on your desktop for the same reason you run it in a server. My own desktop has two such arrays: a pair of WD Raptors for the system and apps, and a pair of 500 GB disks for data. I run these on an Adaptec 31205, which is a true hardware RAID controller, unlike the onboard crap endemic to newer motherboards.

“Due to the license of ZFS they couldn’t use the ZFS in linux.”

Joepie on May 27, 2009 1:53 AM

Due to the license of the Linux kernel, they couldn’t use ZFS in Linux.

Fixed that for you :wink:

ZFS is compatible with other free and open-source operating systems. It is already available in NetBSD, FreeBSD, OS X 10.5+ and, of course, OpenSolaris. On Linux it requires FUSE in order to get around some of the limitations of the GPL license used by the Linux kernel.

I tend to agree with several people’s assessment of ZFS. This is one cool technology, and the ability to use SSDs as cache sweetens the pot. I’ve been running ZFS on a home server for over three years (RAID-1 and RAID-Z). The best promotion I can give is how it quietly protected my data even through occasional write errors caused by a flaky SATA driver (subsequently fixed in a later OpenSolaris build). The darn thing didn’t even break a sweat.

Couple that with the native CIFS service that works seamlessly with ZFS; it is orders of magnitude faster and less resource-hungry than the Samba service it replaced.

As a comp-sci educated consultant-sysadmin type person I just want to say that articles like this, where hardened programmers learn about and share info on how computers really work, are a cause for great joy. :slight_smile:

(The number of conversations we have to have with folk who work in Java, Ruby, Python, whatever, where our replies start with “well, it doesn’t really work like that”, proceed to “what are you REALLY trying to do?”, and end with “oh, that’s easy”/“oh, that’s provably impossible”… :wink:)

ZFS is nice, but will ultimately die. Scott McNealy is an arrogant fool and he wrecked his company in particular and IT in general with his Java bullshit. Java was intended to be a viable alternative to Windows, but Gates/Microsoft beat him like a rented mule. Write Once, Run Anywhere turned into Write Once, Debug Everywhere. Just as there used to be an endless string of not-really-compatible Unixes (Alpha/Ultrix/Irix/SunOS/AIX/…), the same paradigm came into software. McNealy took his stockholders down the rabbit hole (all the while selling all the SEC allows every quarter). Oracle bought Sun for two reasons: 1) it was a fire sale, and 2) to save their “investment” in Java.

ZFS is already dead unless it goes to Linux/BSD/Windows.

McNealy, like all his predecessors, lost his own and his shareholders’ asses by trying to do both hardware and software. Only IBM wins that game. The list of those that died trying is long and distinguished: DEC, Prime, Apollo, Data General, Silicon Graphics, Intergraph. Must I go on? This is a business, not a party; the dot-com era is over. Get used to it.

There are only two operating systems, Windows and BSD; everything else is doomed to failure. Linux is a microkernel, nothing more.

Noah Yetter, you are wonderfully optimistic, or oblivious to the endless September.

Newbies will always be in great supply and will constantly be asking “what drive should I choose to do RAID 0 for my PC?” Some of the newbies like the word “stripe” better than “RAID 0”, but it is the same question, and I see it month after month after month on the tech sites I frequent.

So Jeff’s discovered ZFS and RAID. It seems like every time you poke your head out of the Windows development microcosm you find something cool that you like.

Keep reading and we’ll see Jeff discover the joys of a decent shell and POSIXish userland, Solaris zones, ssh, vi …

I’ll be coming back in a year and SO will be rewritten in Perl, running against Postgres on a FreeBSD box, with Jeff expounding the benefits of bash job control and truss.

Keep going Jeff, you’ll shed that abstracted MS “PC” developer world-view soon :wink:

We did tons of benchmarking/testing with various RAID levels and controllers across Progress and MSSQL database servers. RAID 5 for your data array saw a 30% decrease in performance on Progress boxes alone. RAID 10 for data, RAID 1 for logs, OS and TempDB was pretty optimal on an average sized system.
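For anyone wondering where that RAID 5 gap comes from: the classic rule of thumb is that a small random write costs four disk I/Os on RAID 5 (read old data, read old parity, write new data, write new parity) versus two on RAID 1/10. A rough sketch, with made-up drive numbers rather than anything from the benchmarks above:

```python
# Back-of-the-envelope RAID write-penalty math; the drive counts and
# per-drive IOPS below are illustrative, not benchmark results.

def effective_write_iops(drives: int, iops_per_drive: int, penalty: int) -> float:
    """Raw array IOPS divided by the per-write I/O cost of the RAID level."""
    return drives * iops_per_drive / penalty

# Classic penalties: RAID 0 = 1, RAID 1/10 = 2 (write both mirrors),
# RAID 5 = 4 (read data + read parity, then write both back).
for level, penalty in (("RAID 0", 1), ("RAID 10", 2), ("RAID 5", 4)):
    print(level, effective_write_iops(8, 150, penalty), "write IOPS")
# 8 drives at 150 IOPS each: 1200.0 / 600.0 / 300.0
```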

“Raid array”

Tee hee!

“Forget RAID on the desktop.
Go with RAIPCs (Redundant Array of Inexpensive PCs).”

Say that out loud and tell me which you’d rather go with: RAID or RAIPCs.

Wait. On second thought, don’t tell me. :slight_smile:

WTF are you all talking about? Get a decent SATA/SAS controller (LSI MegaRAID 8888ELP or 8880EM2, AKA Dell PERC 6) and there’s NO write penalty, and read numbers are also scary on wide RAID 5 arrays.

What would happen to RAID when disks move to solid-state flash memory?

I don’t see how this is coding related. Shouldn’t it belong on serverhorror.com?


@George

Ummmm, ZFS is already available in several of the BSDs…

As bnitz posts a couple before you,

“Due to the license of the Linux kernel, they couldn’t use ZFS in Linux.”

“ZFS is compatible with other free and open-source operating systems. It is already available in NetBSD, FreeBSD, OS X 10.5+ and, of course, OpenSolaris. On Linux it requires FUSE in order to get around some of the limitations of the GPL license used by the Linux kernel.”

I think ZFS will be around longer than you think :slight_smile:

In High School I sprayed some Raid on parsley and sold it to someone who smoked it.

Jeff,

Correction: RAID is Redundant Array of Independent Disks

@Phillip: “Congratulations on another pointless post.” - Congrats on a pointless reply. Posts like this are the equivalent of my co-worker coming up to me and saying “Hey! Have you heard of ZFS!? It’s pretty cool!”, which I very much like.