Friday, December 14, 2007

Hard drives suck

In my day job, we go through a lot of hard drives. It seems now that no matter how good a drive manufacturer is, drives still show up DOA, die early, die not-so-early, etc. This is the reason I like Seagate so much. While I fully expect their drives to start dying naturally at 18 months or so (they don't always, but that's when I really start keeping an eye on them), at least with their 5-year warranty I know I can get replacement drives without constantly spending a lot of money.

Anyway, a while back I got rather tired of replacing drives that were DOA or that died after less than 30 days of service, so I found that burning them in with a combination of SMART self-tests and badblocks seemed to weed out the early failures rather well. I wrote this script to run those tests for me. All it needs is smartctl and badblocks (which are probably available on most live Linux CDs, and definitely on Fedora and Knoppix). It will work on any IDE or SATA drive. With a little work, it could probably handle SCSI drives, ATA drives on 3ware controllers, etc.
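
For anyone who just wants the general shape of a burn-in like this, here is a rough Python sketch. It is not the script mentioned above. The function names (burn_in_plan, run_plan), the order of the steps, and the choice of flags are my own assumptions for illustration; the only things taken from the post are the two tools, smartctl and badblocks. Note that badblocks -w is a destructive write test and erases the drive, so the runner defaults to a dry run that only prints the commands.

```python
import subprocess


def burn_in_plan(device):
    """Return an ordered list of burn-in commands (argv lists) for `device`.

    The sequence is an illustrative assumption, not the original script.
    """
    return [
        # Start a short SMART self-test (runs in the drive's background).
        ["smartctl", "-t", "short", device],
        # Destructive write-mode test with progress and verbose output.
        # WARNING: this ERASES everything on the drive.
        ["badblocks", "-wsv", device],
        # Start an extended (long) SMART self-test.
        ["smartctl", "-t", "long", device],
        # Print the SMART self-test log.
        ["smartctl", "-l", "selftest", device],
        # Print the drive's overall health assessment.
        ["smartctl", "-H", device],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    smartctl -t returns as soon as the test has been started, so a real
    burn-in would have to wait for each self-test to finish before moving
    on. That waiting logic is deliberately left out of this sketch.
    """
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            # smartctl's exit status is a bitmask, so don't raise on nonzero.
            subprocess.run(argv, check=False)
```

Calling run_plan(burn_in_plan("/dev/sdX")) just prints the five commands, where /dev/sdX is a placeholder for the drive under test. Passing dry_run=False actually runs them, which needs root and will wipe the drive.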

3 comments:

joshuadf said...

Very nice! Now if only I could use it with Dell PERC (aka Adaptec aacraid or LSI MegaRAID) controllers which are less friendly to open source.

Steven Pritchard said...

You could try "smartctl -d sat -a" on the device. It might work.

Unfortunately (or maybe fortunately for me) I don't have one of those controllers to test with...

LeeT said...

Has anyone thought of putting this into the Fedora Rescue disk part ... Might be a good addition to testing or helping out with failing systems, much like memtest and so on ...

Thanks
LeeT