There has been much ado of late about 2.5-inch, small form-factor (SFF) hard disk drives and how they are better than the larger 3.5-inch drives. Does size matter? Several storage array vendors have started using SFF drives in their arrays; the VSP from Hitachi Data Systems is one such array. HP has OEMed this from HDS's parent …
It's not the size but how you use it that matters
The argument that there is no long-term need for 2.5-inch drives is interesting, but one could equally argue that there is no long-term need for 3.5-inch drives.
There is no inherent technical difference between a 3.5-inch and a 2.5-inch disk drive: they use the same media and mechanical design, which shows in the fact that performance numbers are pretty comparable between the two form factors at similar rotational speeds. The biggest difference is power. A 2.5-inch platter has less physical mass to move, and the smaller form factor generally means fewer platters, so it quickly becomes obvious why a 2.5-inch drive consumes about half the power of a 3.5-inch one.
Right now there is a 3x to 4x capacity difference between a 3.5-inch disk drive and a 2.5-inch drive (2TB vs 600GB). But this ignores the fact that, thanks to the smaller form factor, you can fit twice the number of 2.5-inch drives in the same space as a 3.5-inch drive. That brings the comparison closer to 1.2TB vs 2TB in the same amount of rack space, with a similar overall power and cooling footprint but lower idle and active power/cooling ratings for the 2.5-inch drives.
And with 2x spindles you get 2x interface connections, 2x queues, 2x read/write mechanisms, and potentially 2x the IOPs and MB/s.
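The density and spindle arithmetic above can be checked with a quick sketch. The drive capacities are the round figures from the article; the per-spindle IOPS number is an assumed ballpark for similar rotational speeds, not a measured value:

```python
# Back-of-the-envelope comparison: one 3.5" bay vs the ~two 2.5" drives
# that fit in roughly the same rack space (article's round figures).
cap_35_gb = 2000          # one 2TB 3.5" drive
cap_25_gb = 600           # one 600GB 2.5" drive
iops_per_spindle = 150    # assumed ballpark at similar rotational speed

drives_25_per_bay = 2     # ~2x 2.5" drives per 3.5" bay's worth of space

cap_per_bay_25 = cap_25_gb * drives_25_per_bay          # 1200 GB vs 2000 GB
iops_per_bay_25 = iops_per_spindle * drives_25_per_bay  # 300 vs 150 IOPS

print(f'3.5" bay:  {cap_35_gb} GB, ~{iops_per_spindle} IOPS')
print(f'2.5" pair: {cap_per_bay_25} GB, ~{iops_per_bay_25} IOPS')
```

So per unit of rack space you give up roughly 40% of the capacity but double the spindles, queues, and potential IOPS, which is the trade the article describes.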
Smaller drives also mean that when a disk does fail there is less data to recover, so rebuild times are reduced. This should all stay relative with this year's release of 900GB 2.5-inch drives and 3TB 3.5-inch drives.
One could reasonably argue that, outside of large media repositories and virtual tape libraries, the larger 3.5-inch disk drives may no longer be relevant.
An interesting comment is that HDDs fail more than SSDs because they are mechanical in nature; I would say this has not been definitively shown to be the case yet. A recent story posted here on The Register reported that analysis of one reseller's return rates showed SSD return rates were about the same as those of HDDs. Part of this is of course attributable to the relative maturities of the two technologies, and SSD reliability is bound to improve.
My personal experience with HDD failures, especially in enterprise-level storage arrays, is that frequently the disk that has been failed by the array is actually still quite serviceable, and I have redeployed many of them to other, less demanding situations without any issues. I suspect the main reason for the high failure rate in storage arrays is that in a RAID stripe or volume group one slow drive can affect the performance of the other drives, and storage vendors will fail these drives for performance-balancing reasons.
Another point to keep in mind with SSDs is that, due to their considerably higher cost, there is currently a tendency to employ them as single disks without any RAID level assigned (making them effectively RAID-0), or to employ them in the most demanding, business-critical environments. This means that when an SSD does fail it is likely to have a higher-than-average impact on operations, so one needs to set proper expectations and put recovery plans in place when using SSDs.
This is actually a pretty good time for storage administrators, as there are plenty of choices and options to meet all kinds of requirements. The hard part is that they need to determine the capacity, performance, and cost requirements in order to make an optimal decision about what to purchase and where to deploy it.
"An interesting comment is that HDDs fail more than SSDs because they are mechanical in nature; I would say this has not been definitively shown to be the case yet. A recent story posted here on The Register reported that analysis of one reseller's return rates showed SSD return rates were about the same as those of HDDs. Part of this is of course attributable to the relative maturities of the two technologies, and SSD reliability is bound to improve."
I *think* that article was regarding consumer HDDs and SSDs (haha, unless there are folks other than poor Trevor having to slap WD 2TB Green drives into servers - I have never seen one in a data center at least). You're spot on regarding the relative immaturity of SSDs, but there is a significant difference between consumer (multi-level cell) and enterprise (single-level cell) flash. Of course, that may or may not imply a difference in reliability between consumer and enterprise SSDs - who knows? I'd be curious to see more on it at least.
ssd failure report
Yep, it was a consumer-oriented report as I recall, but then it is very hard to get any sort of enterprise-level data, since those products are delivered via large companies such as HP, IBM, and EMC, and they tend to keep details of that nature in house. Likewise, trying to get details on return rates out of the suppliers to these companies is equally difficult.
While the exact details for consumer and enterprise products are dramatically different, the overall trends in pricing and reliability are similar enough to draw rough comparisons.
Say what you want about Western Digital drives in general - and I've no good words for the Velociraptors - but I'll be damned if the RE4 Green Power 2TB drives aren't solid gear. Slow as sin...but they store a great many bits very reliably.
Bad (terrible!) idea as primary storage. Not remotely half bad as archival storage or in a MAID.
Listen to this man.
"My personal experience with HDD failures, especially in enterprise-level storage arrays, is that frequently the disk that has been failed by the array is actually still quite serviceable, and I have redeployed many of them to other, less demanding situations without any issues. I suspect the main reason for the high failure rate in storage arrays is that in a RAID stripe or volume group one slow drive can affect the performance of the other drives, and storage vendors will fail these drives for performance-balancing reasons."
Everyone listen to this man: he knows of what he speaks. This quote is Truth spoken freely. I should also point out that in many cases disks which consistently fail out of a RAID will pass vendor diagnostics, as they are mechanically and electrically sound... they have simply remapped critical sectors as failed, such that they are that msec slower than all the other disks in the array.
This indeed is why the TLER bug on the Velociraptors is such a pain: there is nothing wrong with the drives themselves... but they stop responding due to a firmware issue, which ends up dropping a perfectly good drive from the array.
I don't think the Velociraptor TLER behaviour is a bug; it's a design choice. RAID hardware can manage errors but cannot tolerate letting the drive run its own retry processing to recover marginal data. Slow drives are certainly problematic: they are on the edge of failure because they have to retry and retry (as many as 200 times) before giving up. TLER limits the retry time, expecting the controller hardware to respond and handle the error instead.
You have to know your workload. Some use cases are more random, some more sequential, some read heavy, some write heavy, and some (like your average file server) aren't really storage-performance bound at all.
The problem is trying to apply a one-size-fits-all approach, like one client of mine who went around yanking local drives from servers and migrating everything but the OS to SAN. Don't get me wrong, it's a sexy proposition for management... and it sounds nice and simple, but you run into big problems when your standard approach/solution doesn't fit a particular use case.
You get what you pay for, after all, but I'd add one factor that was missed in the blurb... when determining a storage architecture you need to assess price, performance, reliability *AND* manageability, because these days you can't have a real storage conversation without discussing SAN and NAS (sorry if that opens a can o' worms and derails the conversation here... but without it this is merely an academic discussion).
I, for one, am completely enthused about the progress in the SSD market, but there is a significant (huge) gap not only in price per GB but in performance per GB. Everyone understands the price issue (last I checked, both Dell and HP were listing SSDs at $10-20+/GB), but the performance density (overkill) pretty much kills the business case. SSD manufacturers LOVE touting the performance numbers, but at the end of the day there are, IME, few use cases that actually need, for example (ref: Micron P300 SLC), 16k IOPS for 200GB of data - an average of 80 IOPS/GB - whereas some of the best SAS HDDs I've seen (which are, for the record, 3.5" and not 2.5") top out at maybe 180 IOPS (which, IIRC, works out to around 0.9 IOPS/GB). To put it another way: for the particular use case I'm most familiar with, we look for roughly 1.8TB of data storage and 1200 IOPS *for a partition*, which puts our performance-density requirement at around 0.7 IOPS/GB, and we're considered IOPS hogs by most of the storage guys we talk to. Keep in mind, however, that I'm referencing the best HDD performance we could find (again, IIRC, a ~200GB, 3.5" SAS drive) - average HDD performance will be significantly lower.
Sorry if that began to ramble, but the point is that even though SSD prices have come down significantly, HDD prices have been dropping too, and for most situations the CBA just doesn't make sense. For SSDs to really hit the mainstream, the rest of the server hardware and software stack needs to be able to exploit them and give demonstrable benefits (more workload or seats per server, for example), but today's reality (for me at least) is that removing the HDD performance bottleneck (at a 10x+ premium, I might add) will not get me an equivalent improvement in server performance or workload capacity.
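The performance-density arithmetic above can be sketched in a few lines. The IOPS and capacity figures are the ones quoted in the comment, not benchmarks:

```python
# IOPS per GB for the figures quoted above.
def iops_per_gb(iops, gb):
    return iops / gb

ssd_density = iops_per_gb(16_000, 200)   # Micron P300-class SLC: 80.0
hdd_density = iops_per_gb(180, 200)      # best-case ~200GB SAS HDD: 0.9
workload    = iops_per_gb(1200, 1800)    # 1200 IOPS over 1.8TB: ~0.67

print(f"SSD {ssd_density:.1f}, HDD {hdd_density:.2f}, "
      f"workload needs {workload:.2f} IOPS/GB")
```

Even this "IOPS hog" workload needs around a hundredth of the SSD's performance density, which is the overkill argument in numbers.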
Anyway, here are my rules of thumb:
SSD for performance at any cost, not quite there for most use cases (yet?)
SAS 3.5" for the best overall combination of performance and reliability (i.e. most enterprise situations)
SAS 2.5" for when you don't need the incremental performance improvement of 3.5" and/or are trying to cram more spindles/capacity into a particular space (i.e. a perfectly acceptable alternative to SAS 3.5" but not my default)
SATA when you need to go cheap at the expense of some performance and reliability
SAN/NAS when you know you'll need to expand on a somewhat regular basis or at least have a significant risk of it (although I know some midrange storage folks that swear it can be killer for heavy sequential loads)
That's my $0.02 at least.
Actually, it's about reliability
It's all about reliability per TB. 3.5in disks are getting big enough that the probability of failure per TB is pushing them towards unviability. 2.5in disks are smaller (duh!), so given a fixed error rate (and the unrecoverable error rate is the same for 3.5in and 2.5in disks), the smaller disks will suffer fewer sector failures per drive.
This means that 2.5in disks can be used more effectively and safely in RAID 1/5/6.
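To make the reliability-per-TB point concrete: with a fixed unrecoverable read error (URE) rate, the chance of hitting at least one bad sector while reading a whole drive end to end (as a RAID rebuild must) grows with capacity. A minimal sketch, assuming a commonly quoted enterprise spec of 1 error per 10^15 bits read and independent errors:

```python
import math

URE_RATE = 1e-15   # assumed spec: unrecoverable errors per bit read

def p_unrecoverable(capacity_tb):
    """Probability of at least one URE when reading the whole drive once."""
    bits = capacity_tb * 1e12 * 8              # decimal TB -> bits
    # 1 - (1 - URE_RATE)**bits, computed stably for tiny rates
    return -math.expm1(bits * math.log1p(-URE_RATE))

for tb in (0.6, 2.0, 3.0):   # 600GB 2.5" vs 2TB and 3TB 3.5" drives
    print(f"{tb:>3} TB: {p_unrecoverable(tb):.1%} chance of a URE per full read")
```

Under those assumptions the risk per full read roughly triples from about 0.5% for a 600GB drive to about 2.4% for a 3TB one, so the bigger the drive, the more likely a rebuild trips over a bad sector.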
Same arguments back in the 5.25" vs. 3.5" days...
And with the same result. 3.5" drives will be pretty scarce in a few years, not just in storage arrays but in PCs you buy and in consumer electronics. 2.5" drives take up less volume per gigabyte and use less power per gigabyte, and those are the only figures of merit that matter anymore for rotating media.
IOPS in rotating media are now pretty much irrelevant in the face of the orders of magnitude increase you get from SSDs, versus the tiny gains you get from spinning faster, using more power to get faster seeks, or using more efficient interfaces like SAS or FC.
I expect that by the next generation of arrays we'll see only two types of storage: SSDs for the obvious I/O benefits, and 2.5" SATA drives for bulk storage. SAS and FC interfaces will disappear. 7200 rpm will be as fast as it gets, and I wouldn't be surprised if we even step back to 5400 rpm for slightly better density and power characteristics. After all, all that hot data is going to be living on the SSDs.
SAS or SATA?
I pretty much agree, except with the observation about SATA becoming dominant. Fibre Channel for disks is dead, and none of the current drive manufacturers have any new drives coming out with FC interfaces. That does not mean FC host interfaces are dead: being able to run 125-metre cables (or longer with long-wave SFPs) over a pretty mature switching environment is still very nice to have. That leaves us with SAS and SATA. The cost difference between the two technologies is pretty minimal, but technology- and feature-wise SATA just doesn't compare well with SAS. I expect SATA to stick around for a bit, but if the drive vendors narrow the cost delta between a SAS and a SATA interface much more, I would expect them to drop SATA as not worth the hassle of maintaining two different interfaces, one of which offers no performance or feature advantage and little to no cost advantage.
Personally, I am hoping for a new storage technology to come to the front to replace both of them and there are a lot of interesting ones in development at this time.
2.5 vs 3.5 showdown
For the individual user the main consideration is generally price rather than efficiency. For the business, the equation has to be tempered by reliability and timeliness. Thus something time-tested is often the best choice for the business because of its reliability, while the choice of the consumer is more limited.
To come up with an overall equation of usefulness seems a great but unlikely aspiration.
I'm sorry but I fail to see the point of this article. (Hence the flame for El Reg. Sorry El Reg.)
3.5" vs 2.5" form factor? Hmmm let me think...
The benefit of a 3.5" disk is density per drive: I can get a cheap 2TB drive (3TB drives are popping up). So for systems where I'm limited by the number of drives per box and don't care about the number of spindles... 3.5" SATA drives make sense. (Read: more $$$ per TB when you buy 2.5" drives.)
Then the author points out... SSDs are the fastest thing out there. Funny how they fit them into 2.5" drive devices. But then the author points out the obvious: SSDs are *expensive* and, at large scale, cost prohibitive.
So what's a drive array maker to do? Hmmm. Combine 2.5" hard drives, and 2.5" SSDs in the same chassis? Wow! Simply Brilliant. Definitely worth writing an article on... Take two devices that have the same form factor and put them in the same drive array so we can offer limited fast storage for the hot stuff and cheaper (slower) access for the rest of the kit.
For database stuff it's RAID 10, not RAID 5, so if a drive fails you pop one out and put a new one in. It's not as 'costly' to repair a RAID 10 disk failure as it is to repair RAID 5. ;-)
So I have to ask... how much was the El Reg reporter paid to write a fluff piece on storage arrays? Not by El Reg, but by Hitachi?
Sorry El Reg, boring and bad writing.
But what do I know? It's not like I support database systems that use large arrays, or clusters of 'commodity' hardware in Hadoop/HBase environments... Oh wait, I do.