Beyond RAID

ZFS is compatible with some other Free and Open Source operating systems. It is already available in NetBSD, FreeBSD, OS X 10.5+ and, of course, OpenSolaris — but it is intentionally incompatible with any OS using the GPL.

ZFS uses the CDDL licence, which was deliberately designed to be incompatible with the GPL, and written to licence ZFS … there, fixed that for you

Tim: "What are your opinions on ecc memory?"
I’m not sure it’s worth the extra cost involved. If the price gap were half as much, I would seriously consider an upgrade to my current controller. I think the cost of ECC is really too high, and I’m not even sure it’s worth it in the first place, because I still don’t understand what ECC memory is supposed to protect me from.
A gamma ray striking that very same cell? Perhaps worth some extra cost — but still not worth the cost involved, considering the probability that this actually happens.
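For what it’s worth, ECC memory guards against exactly the kind of single-bit flips mentioned above: each stored word carries extra check bits that let the memory controller detect and correct a one-bit error on read. A minimal sketch of the idea in Python, using a Hamming(7,4) code (real ECC DIMMs use a wider SECDED code over 64-bit words, but the principle is the same):

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    code = [0] * 8  # 1-indexed; index 0 unused
    code[3], code[5], code[6], code[7] = data
    # Parity bits cover the positions whose binary index shares their bit.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]

def hamming74_correct(bits):
    """Correct a single flipped bit (if any) and return the 4 data bits."""
    code = [0] + list(bits)
    # The XOR of the indices of all set bits is 0 for a valid codeword;
    # after a single flip, it equals the position of the flipped bit.
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    if syndrome:
        code[syndrome] ^= 1  # repair the erroneous bit
    return [code[3], code[5], code[6], code[7]]

data = [1, 0, 1, 1]
word = hamming74_encode(data)
word[4] ^= 1                    # simulate a cosmic-ray bit flip
print(hamming74_correct(word))  # → [1, 0, 1, 1], error corrected
```

Without the check bits, that flipped bit would silently corrupt the stored value — which is the whole argument for ECC.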

After seeing the Sun monster on SATA disks, I would like to ask another thing:
What are your opinions on SCSI (all versions included)?

It is my opinion that SCSI is definitely overrated, and I take the Sun monster as confirmation.
I’d like to read your opinions on that.

Let me make clear that storage is the last of my concerns.

Jeff -

RAID 0 is essential in scientific applications where the data rates are enormous (astronomy, seismology, etc.). There seems to be no other option given the input data rates and the write speed of the currently available disks.

I’m amazed/surprised/saddened spinning platters are still with us. I’ve been looking forward to 3-dimensional solid-state memory for 30+ years. Still waiting…

  • Lepto

And if you’re really paranoid, you assume that you will lose more than one drive at any time. From my experience, once you blow out one drive in a RAID-5 setup the extra load rebuilding may push another drive over the cliff.

A study was done regarding the Mean Time To Failure (MTTF) for real-world environments. A fascinating read:
http://www.usenix.org/events/fast07/tech/schroeder.html

What is “Parity”? Put simply: XOR.

Assume the minimum three-disk setup:
Disk1 = 44 - 00101100
Disk2 = 68 - 01000100
Disk3 = parity information, the XOR of Disk1 and Disk2:
Disk3 = Disk1 XOR Disk2 = 44 XOR 68
Disk3 = 104 - 01101000

Recovering Disk1 with only Disk2 and Disk3 is accomplished by XORing them together:
Disk1 = Disk2 XOR Disk3 = 104 XOR 68
Disk1 = 44

And doing it with more disks:
Disk1 = 44
Disk2 = 68
Disk3 = 35
Disk4 = 92
Disk5 = PARITY (XOR of all) = 23

Assume Disk3 dies.
XOR the others:
Disk3 = 44 XOR 68 XOR 92 XOR 23
Disk3 = 35

Okay - there is a lot more to a real implementation. But even so, such a simple concept works remarkably well in practice.

You forgot to mention that the ordering of a ‘combined’ RAID system is important, so that RAID 10 is different from RAID 01. The difference is the order in which the operations are applied: RAID 10 mirrors pairs of disks and then stripes across the mirrors, while RAID 01 builds two stripes and then mirrors them.
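Using the usual convention (RAID 10 = a stripe of mirrors, RAID 01 = a mirror of stripes), the practical difference shows up when a second disk fails. A quick enumeration over four hypothetical disks:

```python
from itertools import combinations

disks = ["A1", "A2", "B1", "B2"]

def raid10_survives(failed):
    # Stripe of mirrors: mirror pairs (A1, A2) and (B1, B2).
    # Data survives as long as each pair still has one live disk.
    return not ({"A1", "A2"} <= failed or {"B1", "B2"} <= failed)

def raid01_survives(failed):
    # Mirror of stripes: stripe 1 = (A1, B1), stripe 2 = (A2, B2).
    # Losing ANY disk kills its whole stripe, so data survives only
    # while at least one complete stripe is intact.
    s1_ok = "A1" not in failed and "B1" not in failed
    s2_ok = "A2" not in failed and "B2" not in failed
    return s1_ok or s2_ok

two_failures = [set(c) for c in combinations(disks, 2)]
print(sum(raid10_survives(f) for f in two_failures))  # → 4 of 6 survived
print(sum(raid01_survives(f) for f in two_failures))  # → 2 of 6 survived
```

Same four disks, same capacity, but RAID 10 survives twice as many two-disk failure combinations — which is why the ordering matters.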

I’m in the process of having to help set up some systems like this as well though, and getting into the details of RAID when I’ve always ignored it in the past.

Funny… we have a lot of xw8600 workstations with 15k RAID 5 setups - and you are so foolish as to think it is not needed on desktops? Come out of your hole more often.

George said:

McNealy, like all his predecessors, lost his and his shareholders’
asses by trying to do hardware and software. Only IBM wins that
game. The list of those that died trying is long and distinguished:
DEC, Prime, Apollo, Data General, Silicon Graphics, Intergraph.
Must I go on?

What about Apple?

That would come in handy when I am playing with old computer parts. Configure old drives from Pentium 1s and 2s into a RAID. That would be fun some day for something to do. Man, some of those drives are pretty slow.

Jeff: You should read Jim Gray’s 1981 paper “The Transaction Concept: Virtues and Limitations”, and in particular the chapter “UPDATE IN PLACE: A poison apple?”.
The ideas behind ZFS, and WAFL before it, have been around for almost 30 years. This is nothing new.

http://research.microsoft.com/en-us/um/people/gray/papers/thetransactionconcept.pdf

Have you ever dug into what it takes to make these “dumb hunks of spinning rust?” Hard drive design and manufacturing is a real feat of engineering… control-systems, materials science, tribology, signal processing, low-level firmware… all manufactured in huge volumes and sold at tiny prices. Hard drives offer a variety of very tough challenges and satisfying careers for engineers. You should look into it - it’s fascinating stuff.

I used RAID-0 when I was coming up with a low-cost streaming platform for Digital Cinema. Otherwise I wouldn’t have gotten the 250 Mb/s throughput from disk to the projector.

As for ZFS fading away… That would be a shame on a number of different levels, especially since it’s a real breath of fresh air. Forget about all the times it already saved my data. Having the ability to throw new disks at it to increase my storage without dealing with partitions, mounts, logical volumes, etc. is a dream come true.

I believe the licensing could be moved to GPLv2, and work was done toward that goal at some point, but that won’t help, since Linus is hard-nosed against that too.

There is a point at which backup becomes unimportant with redundant disks in the broadest sense. Particularly so when you can eliminate all the single points of failure (such as having all your servers in one physical location.) For example, I doubt very much that Google backs up their database.

When would you consider offline backup to become redundant?

I can’t believe the number of people here posting opinions after making the disclaimer, “I’ve never actually had any hands-on experience with RAID”… but I guess that this is a developer-oriented blog, so it should figure that a lot of you don’t really care much about the hardware on which your apps run - you just expect the IT monkeys locked in the datacenter to make it work like magic.

Of course, you’ve walked into one of the biggest firestorms of IT engineering and datacenter design by posting your opinion, however green, on the matter. Better yet, you weighed in, inadvertently, on things like diskless backup and emerging commodity SATA RAID pool technology. (It may have been around in one form or another for a decade or more, but it is only now gaining momentum, and in my mind, remains largely unproven in the data-center, so, someone else can go first on adopting it - I’m sticking with traditional SANs for now).

Nice job.

@Philip - What are you talking about? You quoted the post accurately, said no, then reiterated many of its points.

One tip I read to decrease the chances of multiple drive failures is to buy hard drives from separate lots.

Who is your target audience? This stuff is basic review for anyone who went to college for anything related to hardware or software. Are there vast legions of Windows developers who are high-school educated and find this novel or informative? I’m really quite curious. Can anyone suggest a good programming blog by someone who would consider this type of article beneath their target audience? Because I want to be reading that instead.

Peter,

Should mention that during a two-drive outage, and recovery from same, RAID 6 performance will be terrible. Depending on the I/O load you need to service, you might not be able to run production, even if there is no data loss. Requiring two drives to fail to get to that state is a big improvement over RAID 5, of course.

As the disk system gets larger, same-mirror RAID 10 failures can be reduced by having the dead half of the mirror taken over by a hot spare. This doesn’t eliminate the problem, but it closes the window of vulnerability automatically and quickly.

Mike

I bought 3 hard drives for my home computer, but instead of setting them up in a RAID 5 (which has pretty bad performance) I set the first two drives up as a RAID 0, with daily backups of important data to the third drive. Knocked about a third off my load times in games when I benchmarked it (that AnandTech article linked to was smoking crack - read http://www.hardwaresecrets.com/article/394/6 for example), and it also helps to reduce stuttering in games when it’s paging in new data, which is one of the most important things in getting a fast game experience.

Built it in January 2005, and the drive benchmark for my RAID0 in Sandra is still faster than their uploaded results for SSDs.

“If you take four hard drives, stripe the two pairs, then mirror the two striped arrays – why, you just created yourself a magical RAID 10 concoction!”

No, that’s RAID 0+1. RAID 1+0 is a stripe of mirrors and RAID 0+1 is a mirror of stripes. Reread the Wikipedia article you linked to.