If you are currently in the market for a new storage array, take a moment to reflect on the significance. Your new storage array is probably the last one you buy in the form that we currently think of as 'a storage array'. The turning point, of course, is flash memory, but that technology is just so 'one dot oh' at the moment. …
Is James Morie new here? I almost gagged when I read:
"The game has changed forever, and everything must be re-evaluated. Ultimately, this is not just a drop-in replacement of any kind - it is a paradigm shift"
Please, someone tell Mr. Morie to tone down his hyperbole a bit when he's on El Reg. If we wanted such over-the-top teamspeak, we'd trawl and troll CNet.
Nice to see someone else pondering...
...the imponderables. I'm currently constructing a personal test-bed to experiment with different configurations here. The two aspects I'll be looking at, aside from sheer, raw MBps and IOps, will be what they've taken to calling BI (still the same thing I've been doing for forty years) and different aspects of virtualization (almost as long). Aside from the expense, my idea of fun.
I rather suspect that eventually we may see, for instance, drop-in cards or even modules that the motherboard will treat as a type of RAM, really slooow RAM in some cases. [Still, even slow RAM with a fast processor and large cache(s) can kick ass.] I think I'm going to have to come up with my own system OS and software here since, in my vision, I see Flash being the operating memory, Level 3 cache in the form of normal DRAM (the motherboard, processor, and RAM operate at the same data rate), Level 2 and Level 1 as normal. Basically, there is no "storage" per se except in the sense of backing up, porting, or uploading elsewhere. Or you might just say there is no separation between memory and storage.
Not so fast
I do a lot of data recovery work and I would always advocate magnetic media if you want to have a chance of data recovery in the event of a failure. The MTBF of hard drives is better, and even if everything is dead there is a decent chance of recovering the data from the platters in a clean room.
The best use of this type of storage is for caching and buffering.
Comparison to Violin Memory 3200 is fair since it is Flash/SSD-based storage, but comparison to TMS RamSan 440 is a bit unfair, since this is basically SDRAM storage (and thus very fast) with on-the-fly backup to Flash. And it only goes to 512GB per box.
Whilst I would certainly agree that the current generation of storage arrays simply isn't up to getting the best out of the capabilities of an SSD, as the internals and interconnects are simply not powerful enough, I think there's good reason why storage arrays will not disappear and why a revolution in datacentre interconnect is not going to happen any time soon.
The issue is around shared storage (rather than dedicated, which can be attached direct to server I/O buses). Sharing storage across multiple servers, whether or not it is by a shared array, requires some form of network interconnect. It doesn't matter if that sharing is peer-to-peer or via a central array. Something has to connect the storage services to the client systems. That network has to be capable of working over moderately long distances, perhaps 100 metres or more; it has to be highly resilient, capable of dynamic reconfiguration without disrupting service, able to handle switching and routing of requests, and, most importantly, it must be widely supported. Indeed at some levels, it has to be virtually ubiquitous. It also has to be very highly scalable.
The fact is that there aren't too many technologies to choose from for a high performance shared network. You can forget wireless. It's far too slow and unpredictable. There are a few short distance interconnect technologies, such as FireWire, USB, SATA and so on that are moderately fast, but really not significantly more so than the main data centre interconnects which are, of course, Ethernet and Fibrechannel, Ficon and, of course, the converged Ethernet/FC standards (with a nod to Infiniband). There have been a few proprietary cluster interconnect technologies, but these all have very limited support with nothing like open standards.
On those protocols above, we have real networks in the 10Gbps region available, with movement towards 100Gbps. In truth, there aren't many things out there which can properly support data rates like the latter. 100Gbps may only be the combined throughput of a dozen enterprise SSDs, but there's precious little out there that could deal with that amount of data so quickly as a single data source (aggregate bandwidth is another issue entirely).
I would argue that for most mainstream data centre applications, if the I/O latency dropped from the typical 6-8ms you might see for a random read on an array with enterprise disks to the roughly 0.5ms you might expect to see with SSDs and a well designed modern I/O network and server technology, then that better-than-ten-fold improvement in latency will shift the bottleneck somewhere else. That's quite probably in processing, lock contentions or any number of other places.
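Taking the two latency figures quoted above at face value, the size of the improvement is easy to check. The snippet below is only a back-of-the-envelope sketch; the variable and function names are made up for illustration and the inputs are the comment's own rough estimates, not measurements.

```python
# Latency figures quoted in the comment above, in milliseconds.
disk_ms_low, disk_ms_high = 6.0, 8.0   # random read on enterprise disks
ssd_ms = 0.5                           # SSD behind a well designed I/O network


def speedup(before_ms, after_ms):
    """Return how many times shorter the latency becomes."""
    return before_ms / after_ms


low = speedup(disk_ms_low, ssd_ms)
high = speedup(disk_ms_high, ssd_ms)
print(f"latency improvement: {low:.0f}x to {high:.0f}x")
```

That is a little over one order of magnitude, which lines up with the point made further down that an order-of-magnitude increase in array capability would be ample for most data centre applications.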
Now there is no doubt that storage array suppliers are going to have to do something about the total throughput capability of their arrays and the way they handle storage so that they aren't the bottleneck (even top end arrays can struggle once they get into IOP rates in the several 100s of thousands a second, or data rates in the 10 Gigabyte per second region). However, I think for most data centre apps, an increase in capability of an order of magnitude would be ample, and we aren't in the territory where there would be much benefit in increases of 100 times that.
Of course there will always be the hyper-scale configs, like Google or (maybe) clouds, but those are also exercises in software engineering and very few organisations have the need or resources for that.
I for one would be very surprised if, in 10 years' time, the data centre storage interconnect didn't still rely on evolved versions of Ethernet and FibreChannel connecting to centralised storage facilities which may be implemented a bit differently, but are still recognisable arrays. Datacentres evolve, they don't suddenly move to a new generation. The old has to work with the new. Arrays of some sort are here to stay.
I believe that there is a different issue here that may be changed by solid state memory.
My thoughts are that it is an addressing issue. Currently, if you think about it, data in current persistent media is accessed via a filesystem, indirected to some form of adapter, across some form of interlink one or more times, then to a disk.
All of these levels provide addressing information of one kind or another, that may or may not be abstracted one or more times. This is required because of the inherent limitations on the size of disks, the number of disks per device bus, and the number of devices and interlinks available. Over and over, this has to be re-worked as disk sizes reach the next barrier. This is expensive, time consuming and slows down what can be done.
With solid state memory, it is in theory possible to implement a block or even a byte address space as large as the size of your address. Let's allocate 256 bit addressing, giving a 10 to the 77th power space, which should be enough for anybody (famous repeated last words, maybe make it 512 bits). We don't have to make this all physically addressable immediately. Expose this as a global address space to ALL of your systems. Call this a Storage Bus Address (SBA - I claim trademark and any copyright and patent rights over the name and concepts). Allow SBA virtual mapping so that you can expose parts of your global filestore to individual systems, and maybe allow slow interconnects to use fewer address lines.
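To put a number on the 256-bit figure and to make the "virtual mapping" idea a bit more concrete, here is a rough sketch. Nothing like this exists as a real interface: `Window`, `global_base`, `local_bits` and `to_global` are hypothetical names invented purely for illustration, and the example window sizes are arbitrary.

```python
from dataclasses import dataclass

SBA_BITS = 256                 # address width proposed in the comment
SBA_SPACE = 2 ** SBA_BITS      # number of addressable units in the global space


@dataclass(frozen=True)
class Window:
    """Hypothetical view of one slice of the global space given to one system."""
    global_base: int   # where the slice starts in the global SBA space
    size: int          # length of the slice in addressable units
    local_bits: int    # address lines the (possibly slow) interconnect carries

    def to_global(self, local_addr: int) -> int:
        """Translate a system-local address into a global SBA address."""
        if not 0 <= local_addr < self.size:
            raise ValueError("address outside this system's window")
        if local_addr >= 2 ** self.local_bits:
            raise ValueError("address needs more lines than the link provides")
        return self.global_base + local_addr


print(f"{SBA_SPACE:.3e}")      # size of a 256-bit space in scientific notation

# Arbitrary example: a 2**40-unit window reached over a 48-line interconnect.
w = Window(global_base=2 ** 200, size=2 ** 40, local_bits=48)
print(hex(w.to_global(0x1000)))
```

A 256-bit space works out to about 1.16 x 10^77 addressable units, so the "10 to the 77th power" figure holds. The window object is just a base-plus-offset translation with bounds checks; a real design would also need the access control and mirroring described in the rest of the comment.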
Put the resilience in the managing device (two or three times mirror with multi-bit error correction), make the memory hot-swappable in manageable chunks. Add secure page or 'chunk (of address space)' level access security using a global name space and cryptographic keys to protect one system's data from another. Add in some geographical mirroring at any level you like for protection.
Once you have done this, you can abstract the interconnect between your servers in any way you like, provided that you maintain the access semantics. Make it closely coupled (at internal bus speeds), or distance coupled depending on the access speed you require.
Change all the OS's to implement this large space addressing for their persistent store (it's easier with some, like Plan 9 and IBM i, than others), initially as a filesystem, but ultimately as a flat address space in later incarnations of the OS. This could even be added into the processor address space, but I think that would require more changes in system and OS design.
I think that the revolution will come when persistent storage is addressed like this, and it could be done fairly easily, but would require industry agreement. This may be what prevents it.
This is me blue-sky dreaming, but I don't see why it can't happen.
Exactly. This is part of the point I was trying to make in very few words in the article - the big change needs to be in hardware AND software.
It is amazing how many ill informed people think that flash is the future.
The underlying architecture is no less flawed than rotational disk.
In the disk array world there is a whole lot of technology that is required to make sure your data is safe and sound and this, for the foreseeable future, will not go away and will in most cases be the bottleneck. As much as memory technology improves, its ability to meet growing customer data requirements will always lag behind. The cost per gig will also make broad adoption difficult.
Flash based storage has a place within a tiered environment but is a long, long way from ruling the datacenter.
The point solutions from TMS and others simply lack the pedigree and many other features that people need in the real world.
Deity be praised! The end of disk media as we know it!
First predicted in the mid 1970's.
I don't make a living at this sort of thing but I'll bet the HDD mfgs can lop a fair bit off the price of the balance of system (all the drive electronics) and equilibrium will be restored.
People have been predicting flash price parity for years and it's no closer now than it was 5 years ago. The only place where it even drops below an order of magnitude cost premium is in enterprise and that's just because the EMC's and NetApp's charge 5x what they buy the drives for. The reality is far more likely to be tier 0 flash backed by cheap bit buckets.
Latency will kill SAN, or the DAS strikes back
The latency of a SSD is 0.2ms, and will improve when we stop using legacy interfaces like SATA or SAS and adopt memory-interconnect ones like PCIe, and when NAND is replaced by faster technologies like FeRAM or phase-change.
At these speeds, the latency of the FC/FCoE/whatever switch, and queueing delay for an array controller, will dominate the latency of the devices themselves. SAN arrays will find themselves at a considerable performance disadvantage. Ultimately, storage will migrate back to be directly attached to the CPU, as close as possible to high-bandwidth, low-latency memory interconnects, and the database software will have to handle the distribution and manageability formerly offered by the SAN. In other words, the likes of FusionIO will destroy the high-performance storage market for existing SAN vendors. No wonder 3Par was in a hurry to sell itself, as they had the weakest SSD story of all the major vendors.
SAN arrays will still exist, but will be relegated to the role tape libraries occupy in the data center - cheap but slow bulk storage for backup and archiving purposes. I hope we have rid ourselves of the Rube Goldberg contraption that virtual tape libraries are by then.
Mechanical storage companies.
They will be toast no matter what. Now they have the advantage of a very difficult-to-manufacture machine with specialized processes and manufacturing.
Meanwhile just about anyone can buy memory chips and a controller and put them on a circuit board. Their advantage will be completely gone.
Cost, cost, cost
With 500GB manufactured for less than $10 and installed in a computer-readable device (the drive) for less than $40, they are far from dead. Buy several plus a controller chip and put a TB on your own board for $100, and then go buy a TB of solid state. The drive is just a fancy chip with some stuff inside. They could put it in an IC package for all I care.
New storage paradigm ...
I think that the future storage will be of two types :
1 - Direct into PCI storage modules, allowing the speed of a nearly direct system bus interface, as a substitute for actual disk drives. After all, it is a waste of resources having to serialize your data requests, send them on a wire, get them decoded to a memory, send the data back, and then have it copied to another memory.
Just interface the flash memory directly. Your processor will see its lower nGb of dram, and then a huge space of nGb of directly addressable flash memory. Get rid of the layers between them.
(Of course the devil is in the details)
2 - Add ons, of lower speed and/or bigger capacity and/or lower cost, as external modules, plugged in a bus a la usb or sd cards, and removable.
Especially the 1B idea of making solid-state storage into a form of directly-addressable NVRAM. Since there are already forms of memory-mapped file addressing, this would work as a logical extension.
Especially in 64-bit computing, you could probably set aside a chunk of address space and declare it to be memory-mapped NVRAM (perhaps the region defined as those having bit #32 on--it is currently in a no-allocate zone to allow for more streamlined memory implementation) like you do for the hardware.
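For anyone who hasn't used it, the existing memory-mapped file mechanism mentioned above looks roughly like this. The sketch uses Python's standard `mmap` module with an ordinary temporary file standing in for the storage; the file name is arbitrary, and this shows today's file-backed mapping, not the directly-addressable NVRAM being proposed.

```python
import mmap
import os
import tempfile

# Create a small backing file standing in for a region of persistent storage.
path = os.path.join(tempfile.mkdtemp(), "nvram_standin.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * mmap.PAGESIZE)

# Map the file and update it with plain slice assignment, no write() calls.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), mmap.PAGESIZE) as region:
        region[0:5] = b"hello"
        region.flush()          # push the change through to the backing file

# Re-read through the normal file API to confirm the bytes persisted.
with open(path, "rb") as f:
    persisted = f.read(5)
print(persisted)
```

The program treats the mapped region like an in-memory byte array while the OS takes care of getting it to the file, which is the behaviour the NVRAM idea would extend down to the hardware.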
It's a long while since we saw doubling every 18 months. Which is what Moore's law used to be quoted as.
We are close to the limits of chippery; a bigger disk or more platters is a very much cheaper solution than more chips or a bigger chip.
I don't think HDDs are going away. Especially if they figure out the next step in density, which looks more solvable than 14nm Flash.
Links have been too slow for a good while now, at least ten years; this is why we have link aggregation. If you only need failover redundancy, stick in two fibres; if you need more bandwidth, stick in a load of links and use multipathing software (good multipathing software, mind) and you can extend your bandwidth/IOPS pretty much as you require. Obviously you'll need sufficient disk at the back end.
@jackofshadows, sounds like you are reinventing some of the Cray's OS -- from what I've heard the EARLY Cray OS didn't even support virtual memory (because it'd slow things down a few percent), and once customers demanded it, they went full out -- disk, RAM, and cache (if any -- some Crays just had cache-speed RAM...) were one large pool, the least recently used stuff was just stuck into RAM and cache.
What will happen? I really don't even want to try to predict it. Attaching storage directly to the PCI Express bus, and eliminating filesystems and such, would improve performance, but there is a lot of pressure to make things somewhat compatible (i.e. hook the SSD up to a cable or enclosure, and keep treating it as a disk-like device). These are in direct opposition and so I don't know what the result will be -- often times the best or fastest is sacrificed because the market just doesn't really want it.
Disks are dead (but not just yet)
Having worked in the disk industry for over 40 years I am a dedicated believer that disks are the low cost storage. It is all about cost per bit. (That shows my age - how about cost per G?). The connection issue could be addressed with either technology, solid state or mechanical drive. The problem is not the lowering cost of the IC, the problem is that the disk technology is at a cross-roads. The magnetic bit is so small that the old mechanical methods of staying on track, including the improved embedded servo techniques, double actuator heads, etc. need the next major technological step to increase density. The current disk industry argument is - should they move to HAMR (a Seagate term for Heat Assisted Magnetic Recording) or should they move to a "patterned media". Both are being explored by WD, Seagate, HGST, Toshiba, Samsung, etc. It currently appears that "patterned media" is winning the technology fight (several forms but for this comment, track or bit pattern does not matter).
What does matter is that the disk industry, in the implementation of the new technique, will have to adopt semiconductor lithography techniques, and that is disastrous news. Disks are produced by the millions at an extremely low cost per unit. Density increases are made in steps. First the head, then the media, then the channel, then the servo, and around you go. When the disk companies are forced to adopt semiconductor techniques the density jumps will be made just as they are in semiconductors and the costs will sky-rocket. Each generation will require a new lithography and a complete development, no more incremental improvements.
As the disk production costs move up and the semiconductor costs move down we will see a tipping point in the famous "cost per (insert favorite memory unit)" equation and semiconductors will replace spinning disks. The LOL factor is - just as predicted back in 1970.
More Moore's Law
Actually Gordon predicted a doubling EVERY YEAR, "for the next ten years" back in the mid-60's. I have witnessed the "Law" being stretched to 18 months and now 24 months depending on the whim of the author. Perhaps it is time that we return to the original prediction but state it as 18 months = Moore's Law 1.5 and 24 months = Moore's Law 2.0, etc., etc. At least that will give us a benchmark rather than leaving it to how the writer feels at the time.
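To see how much difference the choice of doubling period makes, here is a quick comparison over the ten-year span of the original prediction. It is plain compound-doubling arithmetic; the function name is made up for the example and no real density data is used.

```python
def growth_factor(years, doubling_months):
    """Growth multiple after `years` if density doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


# The three doubling periods mentioned in the comment above.
for months in (12, 18, 24):
    factor = growth_factor(10, months)
    print(f"doubling every {months} months: about {factor:,.0f}x over ten years")
```

Over ten years that is roughly 1,024x for yearly doubling, about 100x at 18 months and 32x at 24 months, so which "version" of the law a writer picks changes the answer by well over an order of magnitude.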
What will happen to OS-es ?
Operating systems were designed to make use of delays caused by I/O latencies. If these latencies disappear, then we end up with too much overhead in the OSes that is no longer needed; no more complicated scheduler, no more asynchronous processing and what not.
Not going away anytime soon.
They're just shrinking. Even with a move to solid-state I/O such as flash there will still be a considerable (in terms of the CPU) delay when accessing the I/O. This would simply call for a few tweaks and revisions in O/S time management.