The Hot/Crazy Solid State Drive Scale

As an early advocate of solid state hard drives …

… I feel ethically and morally obligated to let you in on a dirty little secret I've discovered in the last two years of full time SSD ownership. Solid state hard drives fail. A lot. And not just any fail. I'm talking about catastrophic, oh-my-God-what-just-happened-to-all-my-data instant gigafail. It's not pretty.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2011/05/the-hot-crazy-solid-state-drive-scale.html

I use my SSD fully expecting it to fail. Just like I date crazy girls fully expecting them to stab me: Always have that backup plan!

My SSD simply holds my OS and apps, while my big mechanical drive (anything Western Digital Caviar Black) holds my docs. The projects that require serious I/O (like my localhost and 20 Drupal sites) run off the SSD but are, of course, version controlled through Git. That drive could nuke before I’m done with this post and I would be just fi

Hey thanks Jeff for posting…
I have been thinking about SSDs for a while now, and wondering what their reliability is like… given what you said, it certainly sounds fine to put up with the hassles for the enhanced speed. Considering most of what I do these days is backed up locally via NAS, to Dropbox, or in Git for dev work, it really isn’t like the bad old days (I remember buying an Amstrad PC-1512 - ok, I am originally from the UK, and that was the first real PC I could afford back then - it had a WD 10 MB HD that fitted into a card slot… wow that thing failed…).

So, my rather modest AMD based laptop should squeal with delight once I get an SSD for it… happy hardware is good hardware!

Let the madness begin!

Do you know why they are failing? Have you had any problems getting free replacements? Have the manufacturers said why the reliability is so bad?

It might be costly, but would you consider buying 2 and using RAID 1, so that if 1 fails, the other one will still be working?

Let’s do the math:

Average life of SSD = 227.375 days (based on Wills’ data)
Price of recommended SSD = $524.99

SSD tax = $2.31/day, ~$70/month, ~$843/year.
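For anyone who wants to rerun that arithmetic with their own numbers, here is a minimal sketch. The price and average life are the figures quoted above; the 30-day month, the 365-day year, and the function name `ssd_tax` are assumptions made for illustration.

```python
# Figures quoted in the comment above.
AVERAGE_LIFE_DAYS = 227.375
SSD_PRICE_USD = 524.99


def ssd_tax(price_usd: float, life_days: float) -> dict[str, float]:
    """Spread the purchase price evenly over the drive's expected life."""
    per_day = price_usd / life_days
    # Assumed conventions: 30 days per month, 365 days per year.
    return {"day": per_day, "month": per_day * 30, "year": per_day * 365}


tax = ssd_tax(SSD_PRICE_USD, AVERAGE_LIFE_DAYS)
for period, cost in tax.items():
    print(f"${cost:.2f} per {period}")
```

The monthly figure comes out a little under $70, so the "~$70/month" above is rounded up slightly.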

So really, SSDs are just the Netflix Chaos Monkey of your personal backup strategy :slight_smile:

A bit more math (wish I could edit the previous comment):

4 TB RAID 01 or 10 setup using WD Caviar Blacks = $640
1.2 TB RAID 01 or 10 setup using WD VelociRaptors = $1,000

Obviously this is only feasible for a desktop, but for $115+ more you get:

  1. 5 to 16 times the capacity
  2. Years, not months, of life
  3. Instant and automatic backups
  4. Comparable speed for most practical purposes

@Can Berk Güder, comparable speed? Sequential, sure, but nobody cares about that on SSDs except marketing weasels. What we care about is random 4K, which the Vertex can do 250MB/s incompressible, while HDDs struggle to do 1MB/s each (the high-density - 3TB - 7200 3.5" ones, smaller/older are far worse).

Oh and… try sticking those 3.5" drives into a notebook. You could fit two VelociRaptors, but notebooks don’t supply 12V to the drives, so they wouldn’t get power.

@Can Berk Güder, anyone saying that SSDs and HDs offer comparable performance just hasn’t used an SSD.

@Mircea and @Poromenos

Honestly, I’ve used neither (SSDs or a RAID setup). I’m not denying SSDs are fast, nor am I trying to bash them. I’ve also pointed out that this setup wasn’t feasible for a laptop.

The point I’m trying to make is, I think Jeff is looking at this from a very strange angle. When I look at the data, I say “SSDs are blazing fast but they fail like crazy.” Jeff says “they fail like crazy, but I still love them, so let’s buy more SSDs.”

I’ve been keeping an eye on SSD prices (which are much higher here than the US) for a long time now, but I think I’ll wait a bit longer after reading this post. I just can’t afford to replace a $1k drive every 8 months.

@Apps 5575 RAID mirroring (RAID 1). Who mentioned an SSD array?

I am curious. If SSDs fail so often, why do manufacturers list MTBF and warranty similar to HDDs?

From OCZ Vertex 3 240GB:

MTBF - 2 million hours
3 year warranty

I agree with your comments, I fitted out a 6 year old ThinkPad X60 with a Vertex 2 last month and it now feels faster than my Quad Core i5 desktop w/ 6 GB of RAM. Sigh.

I wanted to ask if anyone knows much about the SMART monitoring on SSDs, and whether it can predict these failures. If they are failing due to running out of “good” flash to spare, then presumably such a prediction should be really accurate - a monitoring program should be able to watch SMART prefailure stats and pop up “Buy a new SSD in the next 6 weeks, or your data is toast!”. If they’re failing for some other reason then I guess all bets are off, though.
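To make the idea concrete, here is a rough sketch of what such a monitoring check might look like. It assumes the attribute table layout printed by `smartctl -A` from smartmontools; the sample rows, the attribute names in them, and the `margin` parameter are all made up for illustration, and real drives report different attributes depending on the vendor. It would only help with the "running out of spare flash" case, not with a controller that dies without warning.

```python
import re

# Hypothetical excerpt in the table layout printed by `smartctl -A`.
# The rows and numbers are invented for illustration, not from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       5432
233 Media_Wearout_Indicator 0x0033   012   012   010    Pre-fail  Always       -       0
"""

# Captures: id, name, normalized value, worst value, threshold, type.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)"
)


def prefail_warnings(report: str, margin: int = 5) -> list[str]:
    """Names of Pre-fail attributes within `margin` points of their threshold."""
    warnings = []
    for line in report.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header or unrelated line
        _id, name, value, _worst, thresh, kind = match.groups()
        if kind == "Pre-fail" and int(value) - int(thresh) <= margin:
            warnings.append(name)
    return warnings


print(prefail_warnings(SAMPLE))
```

In a real setup the `report` string would come from running `smartctl -A` against the drive on a schedule, and the right `margin` is a guess that would need tuning per drive.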

@Can Berk Güder, RAID is for high availability and/or performance, it’s not a backup. High availability allows your server to keep serving while one of its disks is dead. It does not protect you from “oops, I deleted all my work!”. A backup should.

“it’d be like going back to dial-up internet or 13" CRTs or single button mice. Over my dead body, man!”

http://whitewhine.com

Just sayin’

Shouldn’t all of those drives have been under warranty when they failed?

I have a 64 GB Patriot SSD that’s three years old and still going strong. It came with a ten year warranty which seems pretty incredible. I wonder what their replacement strategy is in nine years.

Anyway, now I am paranoid and off to double check my backups.

I’d be willing to bet the listed failure rates, while higher than platter HDDs, are not as high as this sample set would lead us to believe. I have two SSDs (Supertalent and Intel) with a total running time of about 1150 days with no problems. That’s not to say they won’t fail tomorrow, but I keep my backups up to date so I am not really afraid of the possibility.

@FooBarTwit I assumed what I meant by “backup” would be pretty obvious from the context.

The MTBF of any drive is clearly not 15 days. That’s plainly just a warranty issue.

I can confirm your experiences; I built a PC for my mother and couldn’t understand what was wrong when it wouldn’t start. Windows wouldn’t boot, and neither would any of the safe modes. Pure dead.

I was stumped because I thought, “It’s only been in here less than a year; it can’t be the SSD.” The worst part, as you mentioned, is that it was a total catastrophic failure like I’ve never seen with HDDs. I managed to recover the data using boot tools, but couldn’t write to the drive, perform certain diagnostics, format or erase partitions. It was disastrous.

OCZ replaced it without hesitation (Vertex 40GB), but it taught me a lesson about the real reliability of SSDs. Naturally, I have still used SSDs in my last two laptops!