Cloud storage & legacy storage supplier vertical disintegration
This topic was created by Chris Mellor.
This topic is for comments on the notion that public cloud storage growth will cause legacy storage product sales to collapse, with existing storage suppliers becoming cloud storage service operators, if they can, or cloud storage service component suppliers. They will have to vertically disintegrate.
two things at work here..... nothing to do with cloud.....
I think you are glueing together "cheap/commodity" and cloud.
they are not the same thing.
you can build a (ceph, e.g.) storage cloud in-house or out-house.
so cheap has nothing to do with cloud onsite or offsite.
if it is terabytes that go in there, you can't have it in the offsite cloud, because you cannot change cloud vendor anymore once you reach the 10's of terabytes. which effectively locks you in until network speed goes up by a factor of 100 at the least. please note that there is not a single cloud provider that enables you to offload your stuff from their "service".
offloading it to offsite cloud will present you with costs far beyond what you will spend onsite.
only problem is that one needs to convince storage administrators that you can build a reliable storage service based on cheap commodity technology and proper storage software like ceph and others. which are free as in speech and beer.
I'm not convinced
I can see where you're coming from, but surely the tubes are still an issue? If you are generating petabytes of data you really need to store it close to where it is being created and used. So cloud storage is fine if you're also generating your data in the cloud, but not so good if you're generating it on your premises and you need a 2 foot wide fibre channel all the way from you to the cloud to transfer the stuff.
Okay, very big organisations can afford very big tubes, but they will have to be very very big to support petabytes. And the same argument works at all levels - for a home or small business user with bog standard ADSL and 480kb upload, cloud backup really isn't much use if you need to upload a few gig a day. Ditto for businesses.
I don't think the market for large on-site storage is going to completely disappear for a while.
Re: I'm not convinced
What happens when all the applications and processing are in the cloud? Pipes are an issue now, but the whole compute model has an opportunity for change too. Then it's not pipe size but availability that becomes the problem.
latency. unless that customer is actually hosting applications in the cloud the latency to the cloud is still extremely high and bandwidth costs are still much higher than hosting your own storage.
And if they are hosting applications in the cloud most likely they are in for a world of hurt (relatively speaking). Of course some won't realize it because they don't know any better.
one of my data centers is 7 hops away from amazon's east coast firewalls for S3, 19ms on a gigabit link (Tier 1 ISP). Maybe if I'm lucky I can get 4MB/sec on a single stream connection to S3. I've talked to a couple of different cloud storage appliance vendors, the ones that tier (more or less) to cloud storage, and so far none of the ones I have spoken to do any sort of fancy splitting of data files into smaller objects to transfer in parallel to improve throughput. But even if they did, that only addresses throughput -- bandwidth is still quite expensive. 10GbE/8Gbps FC to your local storage is cheap .. 10GbE to the internet .. not so much.
The volume of data growth continues to vastly outstrip the decreasing cost of bandwidth over time.
it can make sense for cold data, not hot though, probably not even warm (other than acting as a backup of some kind). It's just too far away, takes too long to synchronize any more than trivial amounts of data.
don't get me started on the absolute shit that is amazon's EBS. Hello 1990s. It works fine as long as you're not doing any reads or writes to it.
storing music, photos, video etc makes a lot more sense for that kind of stuff (S3 object storage)
Amazon cloud (among others) is sort of like a roach motel as far as lock in goes for storage. Sure you can get your data out -- but it may take you months to do it (vs a local array, which is obviously an order of magnitude(s) faster). I know of one company for example that has hundreds of TB in S3 and they want to move out, they just don't know how (with the least application impact). When they started they thought they were smart, but now they are paying the price and hurting bad (not only because of S3 but because they are "forced" to use other amazon services in order to perform work on the data because they can't effectively get it out). Each day that goes by the data set gets larger too.
But if the data is cold then it probably doesn't matter, won't get accessed much, so no big deal if it is slow. S3 makes good sense for that kind of stuff. And unlike most other amazon services they actually bill you based on what you use (rather than what you provision).
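To put rough numbers on the "months to get it out" point, here is a back-of-the-envelope sketch in Python. It assumes the 4MB/sec single-stream figure means decimal megabytes, treats 100 TB and 300 TB as illustrative stand-ins for "hundreds of TB" (my picks, not figures from the company mentioned), and ignores protocol overhead and the data set growing in the meantime.

```python
# Back-of-the-envelope transfer times; all units are assumed to be decimal.
TB = 10**12                 # bytes in a terabyte
MB = 10**6                  # bytes in a megabyte
SECONDS_PER_DAY = 86_400

SINGLE_STREAM = 4 * MB      # the ~4MB/sec single-stream rate quoted above
GIGABIT_LINE = 10**9 / 8    # theoretical ceiling of a gigabit link, bytes/sec


def transfer_days(size_bytes, rate_bytes_per_sec):
    """Days needed to move size_bytes at a sustained rate, with no overhead."""
    return size_bytes / rate_bytes_per_sec / SECONDS_PER_DAY


# 100 and 300 TB are hypothetical sizes chosen only for illustration.
for size_tb in (100, 300):
    slow = transfer_days(size_tb * TB, SINGLE_STREAM)
    fast = transfer_days(size_tb * TB, GIGABIT_LINE)
    print(f"{size_tb} TB: {slow:.0f} days single stream, "
          f"{fast:.1f} days at full gigabit line rate")
```

On those assumptions 100 TB takes roughly 289 days over a single stream, and still more than 9 days even if parallel transfers could saturate the whole gigabit link, which is why the exit only gets harder as the data set grows.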
My data is my data. It's what I use to make money. I hoard it, like the life giver that it is. The very thought of trusting a third party with it offends my inner businessman to the core.
On top of that, why on earth would I pay somebody MORE MONEY to take care of my data than it costs me to do it for myself? Negative cash-flow isn't in my business-plan.
Waiting for the perfect storm
to hit these 'Cloud' vendors. Then we shall sort the men from the boys.
However if you value your data and want to keep Uncle Sam's greasy mitts off it then don't even think of putting your stuff in a cloud that is either US-owned or has a presence in the US.
Anon, otherwise there might be some Black choppers visiting me.
You'd better hope not Chris.
All those "commodity components" were developed by the legacy storage vendors you think are going to die.
Good luck with getting non-IT manufacturing companies (like Amazon, Google et al.) to spend R&D money on developing the next generation of storage technologies.
Similarly, good luck getting small startups to spend money developing it when they can never sell it because everything is stored on commodity components in the "Cloud".
If your predictions come true then innovation in storage will also die.
As a matter of interest, where did you get the idea that "Every byte stored in Amazon's cloud, or Azure, or the Googleplex, or Rackspace, is a byte not stored in a private VMAX, VNX, Isilon, FAS-whatever, VSP or HUS, StoreServ, StoreVirtual, StoreWhatever, V7000, XIV, or DS-whatever."?
Bleh, more Cloud UberHype, this reads more like a paid advertisement than an objective piece.
Please start reading the Terms of Service for these cloud services, instead of just pimping them, and you'll find that there are some substantial problems actually worth writing about.
I'm taking a long view here and assuming these problems will be ironed out.
It's not just cloud
Love the article Chris, whether you are right or not, at least it creates conjecture and debate. I believe the cloud will have a big part to play in the rapidly evolving storage market, but the solutions have their challenges currently. I don't buy cost and security as inhibitors. This stuff is cheap (although their pricing models are pretty confusing currently) and their security is well architected, they don't have a business if it isn't! Their challenges are latency and issues around data management (lock in was an issue someone described earlier).
However, in the short term I don't believe cloud will be the only challenge to legacy storage arrays. For a number of reasons - flash, hybrid, object and software defined storage based solutions will eat into their market as companies endeavour to improve the speed, scale or cost of storing and managing their data.
I think you're half right
For reasons I can't fully disclose I believe the underlying costs of using commodity "built to fail" components using the triple mirroring techniques typically found in the offerings used by hyperscale cloud providers could be improved upon significantly, while still providing good margins for providers of more efficient technology.
There are also some interesting economic / consumption models that indicate that on-premise technology is and will sustainably be cheaper than using cloud providers, and the scale at which that happens is much smaller than most people would assume (I've seen some figures suggesting that point is currently at an annual storage capacity spend of as little as $250K). This depends of course on internal IT adopting similar state of the art automation techniques that the hyperscale cloud vendors use (a reasonably big "it depends" I might add), but it points to a likely situation where a mixed model for IT is a somewhat permanent feature of the datacenter landscape, in much the same way that there is a mixture of permanent/contract/outsourced personnel, though the exact mix changes depending on current circumstance.
If the 250K capacity spend figure is accurate, then there are smaller scale "private storage" requirements where "non BigCloud" storage vendors will still have value. Indeed, I suspect there will be a number of "small cloud" vendors out there, especially as SaaS vendors grow sufficiently large to justify building their own infrastructure like Zynga did, and they will probably decide to innovate in areas outside of infrastructure, and will probably rely on technology developed by the existing storage vendors (or at least some of them).
Sure some applications that are completely homogenous like email will end up almost completely with the "BigCloud" hyperscale vendors, but after you cherry pick those ones out of the datacenter, there are literally thousands of custom applications left that IT managers still need to support, and it's going to take a long time for those to get re-written/ported to a cloud platform. As a case in point there are plenty of applications out there that are not running on virtualised servers today, and probably never will be.
Eventually they will all get re-written, and when they do it is likely that those applications will initially be serviced on BigCloud IaaS platforms, but eventually many of these guys will probably migrate to "Small Cloud" IaaS vendors who can provide more finely tuned SLAs that allow those SaaS vendors to differentiate themselves from the copycats who are busy trying to disrupt them using exactly the same underlying infrastructure.
Lastly, demand for storage capacity is not limited by our ability to generate data, but rather by the costs and convenience of storing it. Build a sufficiently cheap and convenient way of storing data and it will get filled, AWS proved that nicely even though the amount they charge (based on 11c/GiB/Month = $4K+/TiB over 3 years) is currently higher than equivalent costs of entry level low SLA storage from the likes of EMC, NetApp etc. The genius of AWS was allowing people to buy it by the GiB per month on demand, making it good enough, and making it blindingly easy to consume.
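For anyone who wants to check the $4K+/TiB arithmetic, here is a minimal sketch in Python. It assumes a flat 11c per GiB per month for the full 36 months with no tiered pricing, request charges or transfer fees, all of which are simplifications of mine rather than details of any actual price list.

```python
# Rough cost of keeping one TiB stored at a flat per-GiB monthly price.
GIB_PER_TIB = 1024


def storage_cost_per_tib(price_per_gib_month, months):
    """Total cost in dollars of storing 1 TiB for the given number of months."""
    return price_per_gib_month * GIB_PER_TIB * months


# 0.11 is the 11c/GiB/month figure quoted above; 36 months is the 3 years.
three_year_cost = storage_cost_per_tib(0.11, 36)
print(f"${three_year_cost:,.2f} per TiB over 3 years")
```

That works out to about $4,055 per TiB over three years, in line with the "$4K+" figure quoted above.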
Cloud isn't nearly so much about total price as it is about convenience, lower risk consumption models, and outsourcing the management of something people would rather not manage if they could get away with not having to. If you solve the provisioning/management problems and alter your business models a little (surprisingly little), even current technology from the major storage vendors can be as compelling, if not more so, than the cloud vendors.
One thing is certain though, the next 5 - 10 years will look NOTHING like the last ... but we've all known that for quite some time now ... and everyone in the storage industry will be living in "interesting times". Don't write the existing vendors off just yet, the party isn't over by a long shot, and the fat lady hasn't even decided which aria she's going to sing.
"As a matter of interest, where did you get the idea that "Every byte stored in Amazon's cloud, or Azure, or the Googleplex, or Rackspace, is a byte not stored in a private VMAX, VNX, Isilon, FAS-whatever, VSP or HUS, StoreServ, StoreVirtual, StoreWhatever, V7000, XIV, or DS-whatever."?
Because it seemed self-evident. Am I talking rubbish?
Not rubbish as such, I'm sure there are some companies who will put every copy of every byte they have into the hands of someone else (god help them). I'm equally certain that many companies won't, and that the same byte will exist locally and on someone else's cloud.
Storage vendors win all round, selling commodity parts to service providers, and arrays/disks/tapes to customers.
Meanwhile in the real world....
I work for a large storage vendor, so this is an interesting topic to me. Some of the other comments echo my thoughts, but having been in IT for a very long time (I remember the birth of the internet) and more recently specialising in storage, I am not sure that world + dog will willingly host their data in the clouds you describe.
Thoughts I have are:
- That performance of access to the data in the clouds you mention will never meet the demands of large database engines. Unless of course, we all get as much bandwidth as we want - to anything - cheaply or free. Utopia I suspect. Can't see it whilst telecoms companies can make money from it.
- Businesses serious about their data will not look to put it in someone else's domain. Much is proprietary and lodging your crown jewels elsewhere is a tough act to justify. Factor in the auditing requirements and it looks tough responding to a Sarbanes-Oxley based investigation, only to be able to offer "Err, Amazon has the data - honest! - we just can't get it out of them very quickly".
- I think that business or private clouds will certainly prevail, although they could be run by partners/SIs. The point is that it keeps the data 'in house'.
- Innovation. The capabilities these Cloud offerings use came from the storage vendors who collectively throw a lot of money looking for better ways of handling the huge volumes of data, securely and at the best price point. I have personally seen the "use the cheapest simple kit" approach in action - it is a nightmare to manage, which increases risk and ultimately cost. Keeping your data in kit you own/lease is a much safer bet - even looking out so very far ahead.
Re: Meanwhile in the real world....
"The capabilities these Cloud offerings use came from the storage vendors who collectively throw a lot of money looking for better ways of handling the huge volumes of data..."
Umm... no. The technologies in use at Google and Amazon certainly did not come from the storage vendors. Read the papers on Map-Reduce, and Spanner. Paxos is pretty fundamental to these services, and no storage vendor did anything with this.
It's the classic Innovator's Dilemma again. A new thing comes along with poorer margins and does a poorer job for the current customers, but is appealing to new customers. Eventually the new thing gets better and margins improve, and it kills the old thing.
Re: Meanwhile in the real world....
"That performance of access to the data in the clouds you mention will never meet the demands of large database engines."
You are assuming the cloud doesn't have big database engines. This is the "convergence" mentioned in the article. Big Table and Spanner manage data sets bigger than anyone else, other than possibly Amazon.
Database vendors are under just as much threat from cloud vendors, as the vendors of dumb non-converged storage like NetApp and EMC.
Agree with other comments - don't think it will happen as depicted. The storage environment will change significantly as we know but not just because of cloud. Anyway as we all know "cloud" is an ill defined concept and really just a marketing tool. If we stick to considering public clouds (Amazonian hordes et al) and hybrid clouds we are really talking storage outsourcing essentially and like outsourcing in general it will wax and wane but never dominate the business environment in general. So while it is currently on the rise it will come to an inflexion point when storage will start to be in-sourced again for reasons such as have already been mentioned by previous commentators.
It is notable that previous iterations of storage outsourcing (e.g. Service Bureau in the 80s) rose up during a recession and waned again when the economy turned. Whilst storage outsourcing will probably stick this time round due to increased globalization and the web, it's worth noting that the majority of public cloud storage currently seems to be taken by IaaS, PaaS etc. vendors, e.g. Dropbox, not end users.
Private cloud, which really just means "better managed infrastructure", is where the array makers will go (and are already going), e.g. new versions of VMAX which allow self servicing of storage.
Not complete rubbish
" ... Because it seemed self-evident. Am I talking rubbish? ..."
With all due respect Mr Mellor, the statement below seems to strongly imply that none of the existing storage vendors provide any of the storage provisioned by large scale cloud providers. I may have misread the implication, but if it was deliberate, then while you may not be talking complete rubbish, you may be publishing terminological inexactitudes ... details are hard to give because you can't talk about a customer's use of your equipment without their permission .. nonetheless .... :-)
"Every byte stored in Amazon's cloud, or Azure, or the Googleplex, or Rackspace, is a byte not stored in a private VMAX, VNX, Isilon, FAS-whatever, VSP or HUS, StoreServ, StoreVirtual, StoreWhatever, V7000, XIV, or DS-whatever."?