Storage vendors have been besieging the large business market with solid state drive offerings for years, but cost and capacity restrictions have mostly kept them at the gate. Only recently has the technology advanced enough to make SSD gear a plausible replacement for traditional disk storage. Take BitMicro for instance, …
10 grand for a 1.6TB? Or more?
Interesting - fits straight into an ILM strategy
So these disks will be crazy fast, which will mean 3 tiers of storage: SSD, FC traditional disk and ATA traditional disk. The s/w exists to move data between these tiers, so you would prob only keep 10% of your data on the new SSD.
Seems like as long as these drives are reasonably priced (as in no more than 4x traditional price) and the MTBF figures are good (would expect them to exceed traditional spinning disk, not being mechanical) they would be adopted by the storage vendors.
If you have to ask, you can't afford it....
Someone sell them to the BBC so I can get next week's Mighty Boosh
Price and latency figures
95% of the performance problems I see on large databases these days are I/O related.
As for 400 IOs a second off of traditional disks - well maybe if you are doing sequential I/Os, but for random-access type OLTP then 150 is nearer the mark. However, the real gain is not the number of I/Os - it's reduced latency, and it's especially reduced latency on random reads (Enterprise class arrays cache writes, and techniques like rolling up writes at the backend and striping can balance the traffic).
What I'm hoping to hear is that random read times are reduced by at least 90% to sub-millisecond and hopefully better. Large businesses and organisations struggle with transactional performance on large CRM, Billing and similar systems.
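For what it's worth, here is a rough back-of-envelope sketch of those figures in Python. It assumes a disk serving one request at a time, so the average service time is simply 1/IOPS. That is a simplification (real arrays queue and cache), and the function names are made up for illustration.

```python
def service_time_ms(iops: float) -> float:
    """Average time per I/O in milliseconds, assuming one request at a time."""
    return 1000.0 / iops


def reduced_latency_ms(latency_ms: float, reduction: float) -> float:
    """Latency left after cutting it by `reduction` (0.90 means a 90% cut)."""
    return latency_ms * (1.0 - reduction)


# Figures quoted in the comment above.
random_ms = service_time_ms(150)       # random-access OLTP estimate
sequential_ms = service_time_ms(400)   # the 400 I/Os per second claim
target_ms = reduced_latency_ms(random_ms, 0.90)

print(f"150 IOPS -> {random_ms:.2f} ms per I/O")
print(f"400 IOPS -> {sequential_ms:.2f} ms per I/O")
print(f"90% reduction on random reads -> {target_ms:.2f} ms")
```

Under that assumption, 150 random I/Os a second works out to roughly 6.7 ms each, so a 90% cut is what it takes to get under a millisecond.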
I'm not sure that the video-streaming thing makes a lot of sense as a market though - that sort of requirement fits large disks fairly well, and something that is overwhelmingly dominated by reads works very well with large volatile caches and duplication.
On the cost thing it would be nice to know how this compares with array-based cache. Arrays with 100s of GB of cache (plus battery backup) are freely available, if not exactly common or cheap (and it uses a lot of power).
One of the biggest problems with this new generation of SSDs is that the arrays that these might fit wouldn't be able to cope with large numbers of these devices - my experience of a lot of Enterprise arrays at the top end is that the throughput is often a lot less than the back-end disks are able to achieve. This sort of stuff is going to open up possibilities, but will also present the likes of EMC, HDS & IBM with major engineering challenges.
Not so sure on the MTBF figures meself. These are flash memory based yes? AFAIK there's still a relatively low (compared to traditional storage) cycle limit on flash cells before they start to degrade to failure (was about 10,000 cycles).
If you went for a tiered solution with the most frequently updated data being on the SSDs (and, presumably, being moved around and optimised by the storage management system), you might find them failing all too quickly.
The point of major service streaming is that you're streaming from the same storage device to 10 different destinations (each 5 minutes apart, say) and you need to sweep the heads backwards and forwards like bloody crazy (caching is unlikely to be useful, since the sort of server in question will have multiple terabytes attached and multiple massive files will be streamed in parallel) - so being able to stream from random access devices will not only be faster but also cause much less "wear" on them.
EMC certainly (it's the only one that I've really played with) seem to have raw throughput issues - they are great for random stuff compared to normal JBOD solely as a result of their cache size/use.
Random read times should be the same as sequential - to a chunk of memory, any two locations are almost equally accessible. OLTP and data warehousing should both fly if they've got SSD on the back end.
I'd also agree that the cost of the individual devices is relatively unimportant. The overhead of the fully redundant & managed chassis/framework to hold the devices is most likely to dwarf the cost of these - obviously until specific prices are quoted, that's just a likelihood, not a certainty.
In response to the read/write issue - I believe that it's up to about 100,000 read/write cycles now for most things. Which for the standard usage that people are looking at will exceed the MTBF of disks ;)
On the video feed, if these things are cost effective then you would have a point. However, the one Enterprise-standard flash-based SSD that I know of and have seen tested was over £150 per GB. Those sorts of costs are higher than volatile memory (storing video files in main memory is much more efficient than pumping them through an I/O system). With efficient read-ahead algorithms, a single hard drive will easily be able to support five or more High Def video feeds (assuming efficient compression). For read-only data like that it is easy just to duplicate the data over multiple drives. That's very different to a transactional database where you need consistency in a highly dynamic environment.
With even 15K Enterprise drives costing less than £1.50 per GB, the SSDs will have a hard time cost-justifying something that is 100 times the unit cost if it's just a matter of running multiple drives for video feeds. Of course there are other costs like power, but even then the numbers don't look great. Also 230MBps isn't that high a rate for sequential access - it doesn't take many HDs to match that. By the time you get into several tens of High-Def video feeds, drive duplication will be required even for this option (by which time of course using system memory makes a lot of sense).
Of course we don't yet have pricing and there is a market for greatly improved transactional performance. It's certainly the future for high-performance storage, and that will no doubt include video streaming, but it will depend on price. The difference with the transactional database area is that there are technical limits imposed by the mechanical nature of the drives which cannot readily be overcome.
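A quick sketch of the sums behind that, using the two prices quoted above. The per-drive sequential rate is NOT from this thread: 80 MB/s is a hypothetical placeholder, so swap in a real figure for your drives.

```python
import math

SSD_GBP_PER_GB = 150.0   # the tested flash SSD price quoted above
HDD_GBP_PER_GB = 1.50    # upper bound quoted above for 15K Enterprise drives
SSD_SEQ_MBPS = 230.0     # sequential rate quoted above

# ASSUMPTION: hypothetical sustained sequential rate of one hard drive.
HDD_SEQ_MBPS = 80.0


def cost_ratio(ssd_price: float, hdd_price: float) -> float:
    """How many times more the SSD costs per GB."""
    return ssd_price / hdd_price


def drives_to_match(target_mbps: float, per_drive_mbps: float) -> int:
    """Whole number of drives needed to reach the target sequential rate."""
    return math.ceil(target_mbps / per_drive_mbps)


print(f"SSD costs {cost_ratio(SSD_GBP_PER_GB, HDD_GBP_PER_GB):.0f}x per GB")
print(f"Drives to match {SSD_SEQ_MBPS:.0f} MB/s: "
      f"{drives_to_match(SSD_SEQ_MBPS, HDD_SEQ_MBPS)}")
```

With the placeholder rate, a handful of striped drives covers the sequential throughput, which is the point being made: the SSD's advantage is in random access, not streaming.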
Seems like some kind of threaded, reply-to functionality might be useful.
Why 10,000 cycles is a lot
There's some magic called "wear leveling". The short version is that flash controllers can do a good job of spreading writes evenly across the cells, so if the memory lasts for around 10,000 write cycles, you can write almost 10,000 times the disk capacity to the device before you get into lifespan failures. For most people, a 1.6 TB drive won't see 16 petabytes of data written to it very quickly. (I suppose it would take 5 years of writing 100 MB/s nonstop.)
Hopefully this isn't more "MTBF"-style theory that doesn't hold up in real life.
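The endurance arithmetic above can be checked with a few lines of Python. This is only a sketch of the ideal case: it assumes perfect wear leveling and no write amplification, both of which are optimistic, and it uses decimal units (1 TB = 10^12 bytes).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def total_writable_bytes(capacity_bytes: float, cycles: int) -> float:
    """Upper bound on bytes written before cells wear out (ideal leveling)."""
    return capacity_bytes * cycles


def years_to_wear_out(capacity_bytes: float, cycles: int,
                      write_rate_bytes_per_s: float) -> float:
    """Years of nonstop writing at the given rate needed to hit that bound."""
    total = total_writable_bytes(capacity_bytes, cycles)
    return total / write_rate_bytes_per_s / SECONDS_PER_YEAR


capacity = 1.6e12    # 1.6 TB drive
write_rate = 100e6   # 100 MB/s, nonstop

petabytes = total_writable_bytes(capacity, 10_000) / 1e15
years = years_to_wear_out(capacity, 10_000, write_rate)
print(f"{petabytes:.0f} PB writable, about {years:.1f} years at 100 MB/s")

# The 100,000-cycle figure mentioned earlier in the thread scales this by ten.
print(f"{years_to_wear_out(capacity, 100_000, write_rate):.0f} years "
      f"at 100,000 cycles")
```

So the "16 petabytes" and "5 years" figures hold up on paper; whether real controllers get close to the ideal is a separate question.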