Beyond RAID

I've always been leery of RAID on the desktop. But on the server, RAID is a definite must:


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2009/05/beyond-raid.html

I’ve never had the pleasure of managing a RAID configuration myself, but ZFS really looks promising. The only concern I have is that all the checksumming ZFS does will probably add a lot of I/O overhead…
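From what I’ve read, the checksums are verified on every read anyway, and you can force a full verification pass on demand. A minimal sketch, assuming a pool named tank (hypothetical name):

```
# walk every allocated block in the pool and verify its checksum
zpool scrub tank

# report scrub progress and any checksum errors found
zpool status -v tank
```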

“…dumb hunks of spinning rust…”

Yeah, but they are smooth, SHINY hunks of spinning rust.

Nice article Jeff.

Congratulations on another pointless post.

ZFS is one of “those things” that came out of Sun that they will be very fondly remembered for, much the way I look back at DEC and the incredible stuff they produced.

  • Due to how ZFS handles parity data, expanding or shrinking volumes is cake. Delicious cake.
  • With ZFS you can actually put the intent log on separate disks. For instance, you could use SSDs for the log to cache writes, and let ZFS flush to the slower disks later, for a nice boost (see the sketch after this list).
  • You can have local read caches on SSDs as well.
  • NFS and iSCSI sharing are native.
  • Native LZJB and GZIP compression support, albeit CPU intensive.
  • Again, due to how parity data is handled, the RAID 5 write hole doesn’t exist. Lose power to a RAID 5 array mid-write and a stripe’s data and parity can be left inconsistent, silently corrupting data, because they are updated non-atomically.
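To make the log and cache bullets concrete, here’s a minimal sketch of creating such a pool, with all device names hypothetical:

```
# RAID-Z data vdev on four ordinary disks, a mirrored SSD pair
# for the intent log (write cache), and one SSD as a read cache
zpool create tank raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0 \
    log mirror c1t0d0 c1t1d0 \
    cache c1t2d0

# the built-in compression mentioned above
zfs set compression=lzjb tank

# native NFS sharing, no separate daemon configuration
zfs set sharenfs=on tank
```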

ZFS is storage porn. Additionally, the X4500 uses 3.5-inch form-factor disks. Once 2.5-inch form-factor disks become a lot more prevalent (I’m hoping 2010 will be the year OEMs really push to 2.5-inch), density will increase further. Even 1U pizza-box servers are pushing into 8-disk land, never mind storage boxen. :slight_smile:

Wow.
If by “fantastic, truly next-generation ideas” you mean “a complete knock-off of NetApp’s WAFL, which has been around for a decade and a half,” then yes, I suppose ZFS is praiseworthy.

Now that’s taking all that energy conservation talk for a spin!

We interviewed David Brittle of the ZFS team on FLOSS Weekly #58 (http://twit.tv/floss58). Good interview, check it out. Makes me wish Snow Leopard were here sooner, so I could boot off ZFS and say goodbye to quiet disk errors.

If you’re intrigued by post-RAID data redundancy systems, you might want to take a look at Isilon Systems (http://www.isilon.com/). I have a few friends who work there; they roll their own drivers and file systems, and the degree of redundancy they can provide across many terabytes of data is ridiculous (disk-level redundancy, system-level redundancy, etc., like RAID + SAN + smarts), combined with crazy things like the ability to grow and shrink storage volumes by adding more servers or drives. I don’t know how detailed their for-public-consumption docs are, but they’re here: http://www.isilon.com/products/OneFS.php

More than likely this kind of technology will eventually filter down to commodity computers (since it’s mostly just software).

Welcome to the world of real datacenters :slight_smile:

Creating really large storage (hundreds of TB) requires not only understanding the tradeoffs of each RAID level, but also understanding the system requirements for throughput, latency, and reliability.

http://design-to-last.com

ZFS rocks! I run Solaris on my home server and have been running ZFS for over 2 years (6x750 GB drives in RAID-Z2, i.e. RAID-Z with dual parity).
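For anyone curious, building a pool like that is a single command; a sketch, with hypothetical Solaris device names:

```
# six disks in one double-parity RAID-Z2 vdev:
# any two drives can fail without data loss
zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

# confirm layout and health
zpool status tank
```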

One very important feature of ZFS is snapshots: instant, time-consistent copies of a filesystem as it existed when the snapshot was taken. Unlike full copies, they take seconds to create and occupy only as much disk space as the live filesystem has since diverged, because unchanged files share the same disk blocks. You can take daily or even hourly snapshots, and they essentially make backups obsolete for the purpose of fixing human error (you can roll back to a previous snapshot, like a database transaction rollback, or copy files from the snapshot to the active filesystem). Once you’ve experienced a filesystem with snapshot capabilities, there’s no going back.
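The day-to-day workflow is about as simple as it sounds; a sketch, assuming a filesystem named tank/home (hypothetical):

```
# take a named, read-only snapshot (completes in about a second)
zfs snapshot tank/home@monday

# deleted the wrong file? copy it back out of the snapshot...
cp /tank/home/.zfs/snapshot/monday/resume.doc /tank/home/

# ...or roll the whole filesystem back, transaction-style
zfs rollback tank/home@monday
```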

And yes, ZFS has taken many concepts from WAFL (just reimplemented better), but then NetApp borrowed NFS from Sun; what’s sauce for the goose is sauce for the gander.

I hadn’t worked much at all with RAID, as I felt the benefits did not outweigh the risks on the desktop machines I usually build.

What risk? Just because RAID 0 is a dumb thing to do on a desktop machine doesn’t mean RAID as a whole is. RAID 1, 5, 6, 10, etcetera are all just fine for desktops, provided you can afford the extra disks, of course, and stay away from the awful fakeraid implementations found on most motherboards.
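On Linux, for instance, proper software RAID is a couple of commands with mdadm; a sketch assuming two spare partitions, /dev/sdb1 and /dev/sdc1 (hypothetical names):

```
# create a two-disk RAID 1 mirror
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

# put a filesystem on it and check the array's health
mkfs.ext3 /dev/md0
cat /proc/mdstat
```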

Why is RAID 0 even considered a RAID configuration? The “R” stands for redundant. What’s redundant about RAID 0 when a single disk failure takes out the whole array?

(Did the Orange captcha fail or did you want to do some good by using reCaptcha?)

Never really messed with any RAID configurations. I know the theory and all that, but what software implements it?

Or is it all hardware-based?

I’m pretty surprised that nobody has brought up that Apple is moving towards ZFS on both the desktop and the server in a big way.

Jeff,

You don’t believe in RAID for the desktop? Try striping a couple of VelociRaptors together and then tell me that. I’m running that setup and it’s insane compared to a single drive. You can do multiple things on a striped array without thrashing the disk the way you do with a single drive. It’s amazing.

I have always enjoyed the concepts behind RAID. On the desktop, RAID is generally used as a way to avoid proper backups, so I haven’t really spent the massive time expense to understand it. With servers, just like you said, RAID shines. Disk prices have fallen to the point that, after researching RAID for servers, I’m starting to consider it for my desktops as a crutch to help prevent data loss on hardware failure. Backups are important, but sometimes it’s impractical to back up hundreds of GB multiple times a day.

Back in the day, hard drives used to pause every 10 minutes or so to run recalibration or similar. If they still do, and do so asynchronously, that’s a performance problem. The result, as I recall, was that you couldn’t just stuff in any old commodity hard drives, only the ones blessed by the RAID supplier, which, you guessed it, weren’t so cheap.

Soon there will be a Linux equivalent of the ZFS file system:

BTRFS
http://btrfs.wiki.kernel.org/index.php/Main_Page

Due to ZFS’s license (the CDDL, which is incompatible with the GPL), it couldn’t be included in the Linux kernel.
For the time being there is a FUSE-based ZFS implementation for Linux to experiment with…
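Once Btrfs matures, the ZFS-style workflow should look roughly like this; a sketch using btrfs-progs syntax, with hypothetical device names:

```
# create a Btrfs filesystem that mirrors both data and metadata
# across two disks, RAID 1 style
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/data

# snapshots are built in, much like ZFS
btrfs subvolume snapshot /mnt/data /mnt/data/snap
```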