Hard Drive Temperatures: Be Afraid

I recently had a noisy fan failure in my ASUS Vento 3600 case. The particular fan that failed was the 80mm fan in the front panel, which is responsible for circulating air by the hard drives in the front of the case. I disconnected it while I considered my options. There's not a lot of airflow by the hard drives in this case. I've actually had a hard drive failure in this system, which I strongly suspect was due to leaving the front fan disconnected.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2006/12/hard-drive-temperatures-be-afraid.html

I’m running dual hard drives as a RAID-1 mirror and it doesn’t look like DTemp can read their SMART attributes. Anyone know of good software that can read the temperature from drives in a RAID set?

Oy. The WD in my Shuttle is currently idling at 46 C, but of course being a Shuttle there’s no room for another fan. Thanks for costing me a night’s sleep, Jeff!

My HD was running hot and I got a $12 HD fan which attaches to the drive. It brought the temperature down from 113F to about 70F. I was told fans can alter the air flow, but this seems to be working.

Also, I use a free temp monitor from www.rsdsoft.com.

You should revise your physics a little. Every bit of electrical energy that goes into your PC is converted to heat. Conversation of energy in action. This includes hard drives. If a drive takes 12W of electricity, it emits 12W of heat; if it takes 70W of electricity, then 70W of heat.

I agree about keeping disks cool though. It really is important. Silicon can take the beating, disks can’t.

That is, conservation of energy, not conversation of. Although the latter seems interesting too.

I’ve found that moving to 2.5" drives and Seagate had a great impact on the reliability of my data storage. The HD temperatures are below 40 degrees (my system has no active cooling) mainly due to reduced friction and power consumption of about 2-5W.

At home I have 8 fans running in my custom case, monitored by a beefy front panel controller. Quite often my hard drives get up to temp, but I never thought of this as a problem. I normally only crank the fans when I hear my video card fans going nuts from the temp. This, of course, causes a big puff of dust to fly out the back of my case (I’ve got a couple of 120mm fans at the back that often tidy the dust out of my case).

Smooshing CPU, memory, graphics cards, etc. and etc. with the hard drives all into 1 case may have worked in 1985, but with today’s hardware I wonder if it’s time to go back to the concept that this stuff should be in separate containers.

Good post Jeff. I noticed this was a problem on any hard drive that I put inside an external USB/Firewire enclosure. What none of the manufacturers of those external enclosures tell you is that these things turn into little ovens for hard drives. Pay the extra money to get an external hard drive built by the manufacturer (like a Seagate or WD) rather than rig one yourself with an old HD and a cheap enclosure. Also make sure you always turn off the external HD when it’s not in use.

Thanks a lot Jeff. It hasn’t occurred to me that hard drives are that sensitive to temperature, but maybe that’s because I’ve never had a hard drive fail on me.

Sorry, but that’s not how it works in the hardware world. If manufacturer guarantees the hard drive for 55 degrees Celsius, it is OK to run it at or close to that. Just like if a capacitor is guaranteed to withstand 250V, you can run it at 250V full time. Those folks are conservative, anything else is an open invitation for a lawsuit.

In fact, Apple doesn’t even begin to cool the hard drive actively in their iMacs until it reaches around 50-52 degrees C. They also don’t actively cool the processor until it gets pretty darn hot. Why do you think their computers are so quiet? Latest Intel chips are rated at 100 degrees Celsius. Unless you have to cool them before that (e.g. so that the laptop doesn’t burn your lap), it’s pointless to apply much cooling to them until they hit 80-90 degrees and when they do, even minimal amounts of airflow will do a great deal of cooling since temperature differential is so large.

Heck, I even use the “fancontrol” utility on my Linux server in the closet. It spins the fans really, really slowly until it actually needs to cool things and then it starts ramping them up, also very slowly. I can’t hear the box working.
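The ramping behaviour described in the comment above can be illustrated with a simple linear fan curve. This is only a sketch of the general idea, not how fancontrol itself is implemented; the function name `fan_pwm` and every threshold below are hypothetical values chosen for illustration.

```python
def fan_pwm(temp_c, min_temp=40.0, max_temp=60.0, min_pwm=60, max_pwm=255):
    """Map a temperature in Celsius to a PWM duty value (0-255).

    All thresholds are assumed example values, not recommendations.
    """
    if temp_c <= min_temp:
        # Cool enough: keep the fan at its slowest (quietest) setting.
        return min_pwm
    if temp_c >= max_temp:
        # Hot: run the fan flat out.
        return max_pwm
    # In between: ramp linearly from min_pwm to max_pwm.
    fraction = (temp_c - min_temp) / (max_temp - min_temp)
    return round(min_pwm + fraction * (max_pwm - min_pwm))


if __name__ == "__main__":
    for t in (30, 45, 50, 55, 65):
        print(f"{t} C -> PWM {fan_pwm(t)}")
```

With a curve like this, the fan stays at its quiet minimum for most of the time and only speeds up once the temperature crosses the lower threshold.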

“If manufacturer guarantees the hard drive for 55 degrees Celsius, it is OK to run it at or close to that. Just like if a capacitor is guaranteed to withstand 250V, you can run it at 250V full time. Those folks are conservative, anything else is an open invitation for a lawsuit.”

I have no problem with people running their CPUs at 70c, or their video card at 90c. CPUs, video cards, and motherboards are slabs of silicon and wire. Hard drives are mechanical. They can’t be treated the same way.

I value my data more than almost anything else on my system. Being a little conservative on the hard drive front is worth it to me. But it’s your data, do what you want.

“Most manufacturers rate CPUs up to 70C, and GPUs commonly rate to 90C and beyond.”

This is already changing and will even more in the future; GPUs have traditionally been rated somewhere below “won’t melt the chip”, rather than the more conservative “gives good data”, since traditionally they were input-only and gamers could tolerate glitches. That won’t work as we hit an era of GPU-coprocessor.

Totally orthogonal to your point, I know. :wink:

DMB, hard drive MTBF doubles for every 5-8 degrees cooler that a drive operates. This is well known and Google digs up tons of tests. If you don’t mind having your drives eaten every 2 years in exchange for less noise, that’s your choice, but reducing heat can extend a drive’s life well past the warranty period.
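To see what that rule of thumb would imply if it held, here is a small sketch. It takes the commenter's "doubles every 5-8 degrees" claim at face value without verifying it; the function name, the 50 C reference point, and the default step are assumptions made for illustration.

```python
def relative_mtbf(temp_c, ref_temp_c=50.0, doubling_step_c=8.0):
    """MTBF relative to a drive at ref_temp_c, under the claimed rule.

    Assumes MTBF doubles for every doubling_step_c degrees of cooling.
    """
    return 2.0 ** ((ref_temp_c - temp_c) / doubling_step_c)


if __name__ == "__main__":
    for step in (5.0, 8.0):
        for temp in (50, 45, 40, 35):
            factor = relative_mtbf(temp, doubling_step_c=step)
            print(f"step {step} C, drive at {temp} C: {factor:.2f}x")
```

Under this assumed model, a drive running 10 degrees cooler than the reference would have somewhere between roughly 2.4x and 4x the MTBF, depending on which end of the 5-8 degree range you pick.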

“maybe that’s because I’ve never had a hard drive fail on me”

I’ve had a few fail, even though I end up completely replacing all my hardware in two years most of the time. You must be getting lucky, or maybe you use fewer machines than I do… I have a computer addiction. :wink:

On this machine, I have a very strong suspicion that the high operating temperatures (when the front fan was disconnected in order to reduce noise) contributed to the drive’s demise, exactly as predicted by the Xbit labs quote. It failed in a subtle way, too, with bad and unrecoverable sectors. It was actually a giant hassle just to figure out what was going wrong, which made it arguably more painful than an outright failure.

This is actually great information. I had never thought to check my HD temp. I’m probably guilty of losing my data to the likes of my HD overheating. Thanks for the info.

I too would love a way to get drive temps out of a RAID setup, but DTemp is useful for solo drives anyway - I’ll happily take a couple of numbers in the tray that tell me something useful over the other garbage that I keep having to remove (as Raymond blogged about again today). I just wish it could give me drive letters too, as I have a pair of identical drives…

Foxyshadis, it’d be interesting to find out such stats and how they were measured. The lowest quoted hard drive MTBF I’ve seen is 300000 hours. That’s 34 years if you run it 24x7. That is why MTBF by itself means nothing. Now MTBF of a large number of drives is another story, but even then you literally need to run hundreds of drives for years under controlled conditions to calculate statistically accurate MTBF numbers. Somehow I doubt your Google hits will withstand a rigorous analysis. :slight_smile:
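The 34-year figure is just the MTBF divided by the hours in a year. A quick sketch of that arithmetic follows, along with what the same MTBF would mean as a yearly failure chance. The second function assumes a constant failure rate (an exponential model), which is a simplification introduced here and not something claimed in the thread.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours of 24x7 operation


def mtbf_years(mtbf_hours):
    """Convert an MTBF quoted in hours to years of continuous running."""
    return mtbf_hours / HOURS_PER_YEAR


def annual_failure_probability(mtbf_hours):
    """Chance that a drive fails within one year of 24x7 use.

    Assumes a constant failure rate, which real drives need not follow.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


if __name__ == "__main__":
    print(f"{mtbf_years(300_000):.1f} years")
    print(f"{annual_failure_probability(300_000):.1%} per year")
```

Read this way, a 300,000 hour MTBF is less a promise that one drive lasts 34 years and more a statement that, in a large population, a few percent of drives would be expected to fail each year.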

Look at it this way. If what you say were true, Apple would not run their hard drives at 50 degrees Celsius, and Seagate would not quote 55 degrees as max continuous operating temperature for their hard drives either. Replacing most of them under warranty is not economical.

Interesting, seems odd that hard drives don’t come with any kind of passive cooling, a grill on the back, etc…

None of you know what MTBF is, do you?
An MTBF of 10mil years doesn’t mean a drive won’t fail within 10mil years.

I can test MTBF like this:
1000 drives, run for 1000hrs. Total run time = 1mil hrs.
Say 2 drives are dead. Whoopee! MTBF is 500k hrs! Joy!

No. 2 drives DIED after ONLY 1000hrs!

See how useful MTBF is? There’s not even a standard way of deriving it.
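The arithmetic in the comment above can be written out as a short sketch. It follows the comment's own simplification of counting every drive as running for the full test period; the function names are made up for illustration, and this is not a standard procedure for deriving MTBF.

```python
def estimate_mtbf(drives, hours_each, failures):
    """Total accumulated drive-hours divided by the number of failures."""
    total_hours = drives * hours_each
    return total_hours / failures


def fraction_failed(drives, failures):
    """Share of the test population that died during the test."""
    return failures / drives


if __name__ == "__main__":
    mtbf = estimate_mtbf(drives=1000, hours_each=1000, failures=2)
    share = fraction_failed(drives=1000, failures=2)
    print(f"MTBF estimate: {mtbf:,.0f} hours")
    print(f"Drives that failed within 1000 hours: {share:.1%}")
```

Both numbers describe the same test: a 500,000 hour MTBF and 0.2% of drives dead inside six weeks.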