Wot no Samsung Drives?
Isn't everything they do/make supposed to be the bees-knees?
aka The Samsung can do no wrong mantra.
The folks at backup-as-a-service outfit Backblaze have again done something astounding: they've not only looked at log files but analysed them, this time to figure out which vendor's disk drives offer the longest working life. There's a scary-looking histogram in BackBlaze's post detailing the results that shows Seagate drives …
> I have never had a problem with Seagate or Hitachi drives.
Gawd, I have.
Seagate used to be fine, until they bought Maxtor. Then the Maxtor "quality issues" seemed to dominate the entire output...
I've not bought a Samsung drive since Seagate bought that business - I really, really hope they haven't done the same...
I've seen hard drives of all brands fail. Not many, and my observations are anecdotal, not systemic. Which is why this BackBlaze study is so useful to the industry. They are publishing systemic results.
Not completely controlled, but systemic. For instance, it could be that there was a manufacturing problem with one batch of drives to produce the 120% failure rate on that one group of Seagate drives that stands out in the results. But over time those should smooth out. And more importantly, it is real world data. If I were them, and I saw that sort of failure rate with a particular brand of drive, I'd probably put it on my Do Not Buy list, sort of like the HP Laserjet 1100 that tended to multi-feed in an office environment after about 2 years.
My little home ZFS server (16TB and counting) uses these drives:
1 Hitachi HDS5C3020ALA632 ML6OA580
10 SAMSUNG HD154UI 1AG01118
1 SAMSUNG HD204UI 1AQ10001
1 ST31500341AS CC1H (Seagate)
Currently, the only drive with any issue at all is the Seagate...
The 2TB Samsung was to replace an identical failed Seagate ST31500341AS...
Shame you can't get Samsung anymore - no great speed, but very quiet and very reliable in my experience.
Everyone knows that Seagate have been going down the pan for quite a few years; however, the WD drives didn't do brilliantly either, and they didn't have similar enough numbers of them to make a proper comparison.
I moved to WD from Seagate some 5-10 years ago. Looks like I might have to look at Hitachi more closely and hope WD don't get the technology transfer the wrong way which is so often the case.
Doesn't help Enterprise much. Well, no, it doesn't directly. But it doesn't inspire confidence in Seagate as a manufacturer since problems appear across the range of consumer disks.
And as both enterprise and consumer, I know what to look for when I buy disks for home.
Showing my age, but I miss Maxtor :(
I've got 8*WD20EARS drives* (WD Green 2 Tb) in 4*2 bay Netgear ReadyNAS Duo v1 boxes. Each box is set up with RAID 1 (simple mirror) configuration.
In three years I've had one mechanism fail (one of the oldest). The ReadyNAS alerted me the drive was failing before any serious consequences. Popped the drive out, dropped in a new one and the NAS sync'd up the new mechanism overnight with no data loss. WD support sent out a replacement mech before I returned the faulty one - and made sure the model number matched too, because the NAS is a bit prissy about supported drives.
* I have about 18 WD mechs elsewhere in my home systems and this remains the only failure in 7 years, but these 8 seem most relevant to the OP.
All disks fail sooner or later. I always mirror disks in machines, so my main concern is whether the two disks fail close together in time. That is why I try to install disks that are unlikely to have the same lifetime - preferably made by different manufacturers, or at least from different batches.
As the old saying goes, there are two types of disk drive:
- those that have failed
- those that haven't failed yet.
When you stop to consider what we're asking these things to do, with areal density in the Tb/in^2 range, Ferrari-engine rotational speed and gnat's-crochet head flying height, all for under $100, it's a wonder that they work at all.
Yev from Backblaze here -> Absolutely correct. It's not a matter of if, but when. We designed our storage pods and backup system to account for hard drive losses, so we don't fear them. But drive deaths are inevitable. The main concern for us is cost, so we try to strike a fine balance between cheap drives and low failure rates. It's a shame those Hitachis are hard to find now.
Using desktop drives in a 24/7 environment will kill them. Some might fare better than others, but frankly if you put your (or your customers') data on desktop drives then you're just asking for trouble.
And yes RAID can mitigate the data loss but it also exacerbates the failures as drives in a RAID array have a lot more work to do than those which are standalone.
So given that the drives are being misused I'm not sure what use these numbers are (although all the values calculated to 4sf are very nice to look at).
No it is not. As someone who has been (ab)using desktop drives in a 24/7 configuration both at home and in business for 17 years now, I can say that most of them are perfectly fine if:
1. You cool them properly
2. Your case does not rattle them to death - suspension mounts, etc are recommended.
1 is what makes the difference between enterprise and consumer - most enterprise drives are more resistant. 2 matters just as much for both types.
I have some MAIDs which started their life as RAIDs (I usually retire disks from RAID into media MAIDs after 3 years) that are 7 years old now, alive and kicking (Hitachi DST and Maxtor, all of them, by the way). Notable exception - the damn "duff Cirrus Logic" Maxtor batch about 10 years ago. That was a total disaster regardless of enterprise vs consumer.
Ever since I started putting proper cooling on my hard drive cages I have had only one drive degrade (not even fail). It was, surprise, surprise, a WD EADS. 24x7 abuse of Samsung, Maxtor (excepting the duff Cirrus Logic batch), Seagate, Hitachi - never had a problem with any of them (provided they were cooled properly). Granted, as I do not do "IT proper" any more, my sample sizes are no longer big enough to yield proper statistics. However, for whatever it's worth, those £30 spent on a Silverstone cooled drive cage (or Icy Dock) are money well spent.
> 1. You cool them properly
I always believed that, but there was an interesting data set from Google a few years back showing that, at least in the configuration they use, it was the hotter drives that kept going. The cooler ones failed earlier.
I have no idea if this is representative, nor do I have any explanation for it. But it made me think, anyway...
My experience reflects the same broad pattern, with Hitachi being very, very reliable and Seagate failing early.
I've had issues with Seagate to the point where I reached a 300% failure rate (three RMAs per drive, in a year) on some of their 1TB drives.
Building a very large array with "enterprise grade" drives is horrendously expensive. And often the internals are no better than consumer grade - just a few firmware tweaks.
My experience is similar, with nearly 300% failure rate during the warranty period on Seagates.
WD and Samsung exhibit more worrying "features", though, such as seemingly either lying about their reallocated sectors or reusing them; both possibilities are bad. (Observed by the pending sector counts disappearing on overwrite but reallocated sector counts not increasing from 0.)
Hitachis being least unreliable of the lot also tallies up with my own experience, although that is over low hundreds of drives rather than many thousands as per the study in the article.
Pending sectors are just that: PENDING. Does not mean they are waiting to be reallocated. It means they require further testing to see if they *should* be reallocated. If the firmware tests those sectors and finds them acceptable they are returned for use.
If you have frequent pending sectors being returned to the heap you may have a vibration issue where the drive is suffering "off track writes" making the data marginal to read back. Take a look at your fan mounts and thermal controls in the servers.
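For anyone who wants to watch those two counters on their own drives, here's a minimal sketch assuming smartmontools is installed; the device name and the sample attribute values are illustrative, not from the study:

```shell
# Pull the raw value of one SMART attribute out of `smartctl -A` output
# supplied on stdin. $1 = attribute name as smartctl prints it.
smart_raw() {
    awk -v name="$1" '$2 == name { print $NF }'
}

# Typical use (needs root; /dev/sda is a placeholder):
#   smartctl -A /dev/sda | smart_raw Current_Pending_Sector
#   smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct
```

A pending count that rises and falls while the reallocated count stays flat is exactly the pattern described above: marginal reads being retested and returned to use, not bad media being remapped.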
" you have frequent pending sectors being returned to the heap you may have a vibration issue where the drive is suffering "off track writes" making the data marginal to read back. Take a look at your fan mounts and thermal controls in the servers."
Seconded. I moved my home server's drives from all being crammed into one server case (20 drives) to proper cushioned mounts and regular events like pending errors (plus a bunch of other problems) all just "went away".
Yes, Seagate's and Western Digital's infant mortality rates are fantastically BAD.
When these drive makers went to a 1 year warranty it was a clear signal that they were making crap drives.
And those 1.5 TB drives were very bad. I had purchased two WD WD15EARS drives and one was DOA. The replacement I kept as a spare for my 4.5 TB RAID 5 external eSATA array (4 x 1.5 TB); it was tested during its warranty period and found to have become defective. An RMA on an RMA - not good at all.
It appears that in trying to make the drives more energy efficient, the constant stopping and starting of the motors and read/write heads just wears them out faster. I also wonder what effect being in a racked environment has on a drive, as I assume there is no shock sensor or way of compensating for vibration from all the other devices.
All from the same batch or purchased over time? From the way you answered, it sounds like at least two batches, but I'd like to confirm.
And thanks to you and your company for not only gathering but publishing the data. I suspect you hope to get fewer drive defects from the manufacturers, but as a consumer of drives I find it very helpful.
I believe that was from one batch, but I do not know for sure. The reason I'm guessing one is that each of our pods contains 45 drives, so the sample size of 51 leads me to believe it was one pod's worth.
And thank you for the kudos! We love being open. If we can get the HDD manufacturers to produce better products, and get some exposure for ourselves as a backup company out there, it's a win-win-win all around (third win being consumers who purchase these drives as well).
I believe the pod3 design doesn't provide enough vibration isolation (interdrive and drive-chassis) and the top clamp actually exacerbates issues by tightly coupling the drive to the connector. Gravity is more than sufficient to hold drives in place and Backblaze has the only drawer which uses a clamp.
Landing cushions plus floating sockets might make a big difference to failure rates.
The reported rates are far higher than I've seen with either Xyratex (F5404 Sumo) or Nexsan (SATABeast) drawers, and we thrash our units pretty hard.
I know this isn't the smoking gun, but since both Seagate and WD introduced their low-power idle features, I started to notice drives failing, and noticed the load/unload counts on many of them racking up fast (especially when used in Linux servers). I know many people got bitten by the WD Green drives when they were first used in NAS boxes.
As a result, on any workstation or server I build I turn the feature off at boot with hdparm, or use wdidle; since then the failure rates have fallen to next to nothing, with most just being drives dying of old age at the 4+ year mark.
Why the hell can't they both make the drives with a jumper to disable the green feature...?
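For what it's worth, the boot-time fix mentioned above can be as simple as the following; /dev/sdX is a placeholder, and not every drive honours every setting:

```shell
# Disable aggressive power management on a drive at boot (e.g. from a
# boot script). /dev/sdX is a placeholder; support varies by model.
hdparm -B 255 /dev/sdX   # turn off Advanced Power Management (head parking)
hdparm -S 0 /dev/sdX     # turn off the standby (spin-down) timer

# WD Greens keep their idle timer in firmware, so they need WD's wdidle3
# utility (or the third-party idle3ctl on Linux) instead of hdparm.
```

These settings don't persist across power cycles on all drives, which is why running them at every boot is the usual workaround.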
Also, since Seagate started implementing its shingle technology, I've seen a couple of workstations suffer corruption when booting at seriously cold temperatures. This isn't something I would expect any drive to do, and I always advise clients to get the room above 15°C before turning on anything electronic (since condensation forms below 10-13°C). But in both cases where I've seen it happen, there were other systems close by with older (non-shingled) drives that managed to boot without issue time and time again. In both cases the drives had problems until they were given a couple of full formats (I'm guessing to re-align all the shingles properly).
I hope Seagate manage to get the production costs and bugs worked out with HAMR drives and start shipping, so they can then drop the stupid shingle rubbish. It's not like they have much faith in it themselves by all accounts, as they dropped the warranty period as soon as drives with shingling started to ship (or at least that's my guess).
Like others have said, after the report I'm going to seriously consider switching to Hitachi (something I've never considered before) - this report's a real eye-opener.
Bought 3x 3TB Seagate ST3000DM001 in Sept 2012. All three failed within 12 months and were replaced under warranty. 2 of the replacements have now failed and as the warranty is only honoured from the original purchase date they are now just expensive bricks. £200 I could have just set fire to and saved some time.
My five six-year-old 750GB Samsungs are all fine.
The Sale of Goods Act says there's no limitation on warranty. If the replacement failed within a short time, it might be worth waving the Act and the words 'Trading Standards' at either Seagate (if you went through their warranty process, your contract is now with them) or at the retailer if you returned the drives to them.
..... as I never used the low-power eco WDs (always bet on black) until I threw a 2TB Red in the NAS box. The WD Blacks in my main system have been hammered to hell over the years thanks to Steam, but have only had issues when they were moved to older boxes after being replaced. I just think WDs don't like being moved.
Extremely surprising were the Hitachi results, as in the early 2000s we simply referred to them as Deathstars when they were still IBM-owned. No wonder WD snapped them up.
I've always gone with consumer drives unless you need a small+quick array, in which case you probably go for SAS or flash these days. The I in RAID stands for inexpensive and using "enterprise" drives would double the cost.
Generally there's no difference, apart from when WD infamously changed the TLER flag on their consumer drives mid-way through a batch of drives a few years ago and people wondered why half their arrays kept spontaneously re-building.
BEWARE: "Enterprise" on a drive's name means "intended for use in raid arrays".
Specifically, they won't try to recover sectors as hard as a standalone drive does. (7 second timeout vs 180 second timeout).
This is a good thing in a raid and a bad thing if flying solo (The TLER setting moves a drive from "standalone" to "enterprise" mode)
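On drives that expose the setting, you can inspect or flip that timeout yourself; a sketch using smartctl's SCT ERC support (timeouts are in tenths of a second, /dev/sdX is a placeholder):

```shell
# Query and set SCT Error Recovery Control - the mechanism TLER configures.
smartctl -l scterc /dev/sdX          # show current read/write ERC timeouts
smartctl -l scterc,70,70 /dev/sdX    # 7s read / 7s write: RAID-friendly
smartctl -l scterc,0,0 /dev/sdX      # disable ERC: full standalone recovery
```

Many consumer drives simply reject the set command, which is exactly the market segmentation being complained about here.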
I've personally seen the following types of drives fail within a week of being installed, with no warning:
ie, every manufacturer. I can also think of someone who will swear blind that X manufacturer is rubbish and that only Y can be trusted, for every combination of X and Y (or just read the comments above).
The plural of anecdote is not data.
My home VM server has a Seagate ST1500DM003 1.5TB disc as its OS and VM-shared data drive, a Samsung HD154UI 1.5TB disc for my NFS/SMB file server, and a 6TB RAID5 array across 3x WD30EFRX 3TB discs (the type Backblaze highlighted as the one they wished they had a lot more of).
Needless to say, it's the OS and VM-shared disc I'm most worried about...
I've seen Hitachis fail with the nasty Click of Death and seen Seagates degrade in RAID 1, so I went WD for all my spinning storage:
1 80GB WD Blue was used for years in a netbook, now in an IODD; no problems still.
1 250GB WD Blue on 24/7 in a netbook for 2 to 3 years.
Six 2TB Greens in a ZFS NAS, one died, after over a year, with heavy use, big deal.
Six 3TB Reds in a ZFS NAS, one died in a month, replaced in warranty; no new failures in a year, with much heavier use!
I won't touch anything Hitachi ever again, Seagate showed that they lost the plot, Samsung are not good enough for spinning disks, but I like their SSDs in RAID1.
I expect all disks to fail, both spinning and SSD, so plan accordingly, preferably ZFS based, because you get more redundancy and much earlier warning of a failing disk.
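The plan-for-failure approach above boils down to something like the following on a ZFS box; the pool name and device paths are placeholders, and mixing manufacturers in a mirror echoes the earlier point about avoiding identical lifetimes:

```shell
# Mirror two drives from different manufacturers so they are unlikely
# to fail at the same time (pool name and device IDs are placeholders).
zpool create tank mirror \
    /dev/disk/by-id/ata-HGST_example_serial \
    /dev/disk/by-id/ata-WDC_example_serial

zpool status tank    # reports checksum errors and degraded disks early
zpool scrub tank     # periodic scrub surfaces latent bad sectors before
                     # a second failure can turn them into data loss
```

Using /dev/disk/by-id paths rather than /dev/sdX keeps the pool intact when drives get shuffled between ports.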
Do people forget the electromechanical nature of drives?
Do users treat them appropriately? In the correct environment? Cooling? (not too cool by the way!)
Do they handle them correctly? ESD wise? Shock wise? Storage wise?
Do they use the right drive in the right application?
Based on the sob stories above (actually every time a HDD piece is published on el reg), the answer would have to be NO.
HDD technology is pretty amazing and no wonder it sometimes fails, especially when they are mistreated/misused, but I suppose users would never blame themselves for any drive failures, ever.
And that list in the article is very misleading - a statistically invalid dataset masquerading as useful information. You only have to look at the storage boxes the blogger is recommending, and the inappropriate drives he is using in such chassis, to see the source is invalid.
El Reg should have spotted how poor the article is a mile away.
Not just "everything fails"
But also "a courier can kill even the most well-packaged device".
Seriously: just about every drive I've had fail prematurely arrived with some indication on the box that it had been handled roughly in transit(*). I reckon distributors would do themselves quite a few favours by applying shock sensors to the outside of the packaging.
(*) Sometimes it's the supplier's fault. The package which arrived with 20 drives clanking around loose inside the box was returned unopened.
My observations over the last few years at work: (laptop drives mostly)
-You couldn't pay me to use a WD "Blue Label" drive--we've had a ridiculous failure rate.
-Seagate's "Momentus Thin" is a royal P.O.S., second in failures only to the WD Blues.
-Hitachi and Toshiba have been pretty reliable.
-We have very few Samsung drives, but I can only remember one ever failing--I was disappointed when Seagate bought their HD business.
-As far as really primordial drives go, very old Maxtors seem to run forever. Not so much the newer Maxtors (and the brand is gone now anyway, assimilated by Seagate).
Glad to see Hitachi got their shit together--they used to be pretty bad.