The product is going to fly...
Will it fly the way Whiptail did when it blew a $400M hole in the ground after Cisco acquired it?
26 posts • joined 24 Apr 2013
Well, it is certainly fun to rummage through the server dust bin and cobble together a storage cluster based on open source software, but who would seriously consider using it as a production storage tier in their organization?
You can get a very low TCA if you build your own white box storage servers or turn to ODMs like QCT and Supermicro. Google started out building all of its servers as cheaply as possible by doing it themselves. There is really no reason to start with junk servers unless you need to prove the concept before you get the funding you need to do it right.
DIY open source storage software like Ceph is not a walk in the park if you don't have a computer science department nearby. Ceph has a complex, non-P2P architecture with no built-in capability to do charge-back, QoS or reporting...stuff that people are interested in having. Upgrading Ceph has also been difficult. Maybe Red Hat has made some progress along these lines with the release of Red Hat Ceph Storage 2 this past June.
I do agree that determining the TCA for a capacity storage project is relatively easy and coming up with a fully burdened TCO is more difficult, but not impossible. Personally, I think a single storage administrator should be able to manage 10PB of object-based storage, assuming the cluster is not made of junk servers needing constant attention.
Well, it was exciting to see the video of Mr. James Hughes from Seagate present and demo a Seagate Kinetic HDD at the RICON West Conference (Basho Riak) in San Francisco in October, 2013. Every OBS software vendor had a comment about Kinetic...a few were interested and willing to investigate it, while others were not convinced that Kinetic worked any better than what they were already doing with their OBS clusters.
So, here we are almost three years later and no production quality deployments of Kinetic at scale. Maybe OpenIO is on to something, but it looks more like an engineering project right now. Maybe it will be "insanely great" and maybe it won't.
In the meantime, every OBS software vendor wants to deliver their OBS software to solve current customer storage and data management problems at scale. No customer is going to wait another couple of years to see if Kinetic works as originally conceived by Mr. Hughes at Seagate. In fact, most customers using OBS software don't really care much about the hardware that does the storage. What they care about more is the cost of the storage and the ecosystem of solutions they can choose from to solve their data storage and management problems. From this perspective it is all about S3 and using OBS software on commodity hardware, and deploying it with the fewest headaches possible. A tricked-out Kinetic HDD without the OBS application software is useless. It is all about the OBS software.
Well, here is the rub from the article. "This S3 Connector uses the S3 Server, a Scality-originated open source S3-compatible API server, available on Github." First, Scality does not use an S3-compliant API as its native RESTful API in Ring. It uses an S3 Connector. Second, how many of the 51 AWS S3 API operations does Scality actually support in its S3 Connector? In other words, if you pick a number of S3 apps at random from the hundreds of AWS S3 solutions available, can you point them at a Scality Ring cluster and have them all work without modification? AWS S3 is the de facto standard for object storage, but S3 compliance comes in degrees. AFAIK, only Cloudian has a native S3-compliant API that comes the closest to being fully compatible with the AWS S3 API.
Well, every "trick" is being deployed to increase HDD capacity...helium, SMR and soon HAMR, and everything is designed to run right at the edge of failure in order to keep the price as low as possible. The largest capacity HDDs will likely find their best application in object storage environments where their failures can be better managed, but not in desktop or traditional server RAID storage environments where HDD failures at this size would likely be more catastrophic and/or time-consuming to correct. And then there is the rapid increase in SSD capacity. With 15TB SSDs currently becoming available, SSDs have won the capacity race. HDDs still have a cost advantage, but it won't last for much longer. HDD manufacturers will likely stop building HDDs sometime between 2020 and 2025. So spin them while you can.
Well, when Mr. James Hughes from Seagate publicly demonstrated a Kinetic drive at Basho's conference in San Francisco in October 2013, everyone was suitably impressed with what Mr. Hughes and his team had achieved. Kinetic eliminated the server with its disk controller and POSIX layers. It relied on the Kinetic API and its libraries to use the Kinetic drive as a key/value object store combined with the on-board drive firmware, and dual 1Gb Ethernet interfaces that used the standard SATA/SAS connectors. FYI, the typical HDD has between 1M and 2M lines of code on it.
Mr. Hughes was on a personal mission to get rid of POSIX and all of the other "busy work" disk drives get involved with when storing data. Kinetic sounded like a breakthrough in data storage and Seagate was making available small-scale test/dev hardware kits so the app/dev crowd could get started. It turned out that the software quality coming from Seagate was not yet production ready.
Most every object-based storage software vendor made an initial evaluation of Kinetic. Some liked it and planned to develop for it, and some didn't think it offered a significant advantage over what they were already doing. Scality and SwiftStack were on-board with Kinetic. Caringo and Cloudian were not. In 2015 the Linux Foundation started its Kinetic Open Storage Platform with Cleversafe, Scality, Red Hat and NetApp numbered among its members. But as Mr. Evans noted, there doesn't appear to be much momentum behind the project almost three years later. The jury is still out on Kinetic, but its prospects may be diminishing as the years go by.
Well, if it hadn't been for the success of SSDs, Google's interest in producing a different kind of HDD might have some merit. HDD capacity has gone from 1TB to 10TB in nine years. The price for a 1TB HDD has fallen from $0.32/GB to about $0.05/GB. SSD capacity has gone from 1TB to 15TB in three years. The price for a 1TB SSD has fallen from $0.60/GB to $0.30/GB. HDDs still have a price per GB advantage, but SSDs have won when it comes to capacity. As production of HDDs declines, the price per GB will not fall much lower. As production of SSDs increases, the price per GB will continue to decline. It seems very likely that by 2025 HDD production will end. SSDs or their successor will have taken over in both capacity and price per GB. HDDs will fight on with SMR for specialized storage and HAMR, if it ever proves commercially workable, but it is a losing battle for HDDs. Not bad though when you consider that the first rotating magnetic disk storage device was commercially sold by IBM in 1956. Sixty years was a good run.
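To put a rough number on that price argument, here is a minimal back-of-the-envelope sketch in Python. It uses only the $/GB figures quoted above and leans on two assumptions that are mine, not established facts: that SSD prices keep falling at the same compound rate (halving every three years), and that HDD prices stay flat at about $0.05/GB. The function name is made up for illustration.

```python
import math

def years_until_parity(ssd_price, hdd_price, ssd_annual_factor):
    """Years until ssd_price * ssd_annual_factor**t falls to a flat hdd_price."""
    return math.log(hdd_price / ssd_price) / math.log(ssd_annual_factor)

# SSD $/GB went from 0.60 to 0.30 in three years (figures from the post above).
# Assumption: the same compound annual rate of decline continues.
ssd_factor = (0.30 / 0.60) ** (1 / 3)

# Assumption: HDD $/GB stalls at about 0.05 as production declines.
years = years_until_parity(ssd_price=0.30, hdd_price=0.05, ssd_annual_factor=ssd_factor)

print(f"SSD price falls by about {(1 - ssd_factor) * 100:.0f}% per year")
print(f"Price parity in roughly {years:.1f} years")
```

Under those assumptions parity arrives in a bit under eight years. If HDD prices keep falling at their own historical rate instead, parity moves out by decades, so the whole estimate hinges on HDD pricing stalling.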
Well, the author is correct that Caringo, Ceph and Ctera were not started yesterday. Caringo's founders developed what became EMC Centera before creating their own storage software in 2005 called CAStor, now re-invented as Swarm. With 10+ years in business and probably more customers than any other object storage software vendor, Caringo has survived without being bought and is looking to tune-up its approach to the object storage market with improved usability, native search and better file level integration with Microsoft Windows Servers and NetApp filers. All good and necessary things to do.
Ceph was part of Dr. Sage Weil's PhD research and dissertation back in 2007. Ceph is open source software, but its commercial sponsor, InkTank, was purchased two years ago by Red Hat for $175M. It combines file, block and object storage, which sets it apart from a pure object storage environment. Its commercial future now resides with Red Hat, which also purchased Gluster (GlusterFS), which is a clustered file system. Ceph is also popular among OpenStack enthusiasts.
Ctera was founded in 2009 and offers on-premises appliances for backup and file sync-and-share through a backend Ctera portal connection to an object store or other types of storage. Ctera has been commercially successful both in the US and Europe in the SMB and enterprise market as well as the service provider space.
The author doesn't break a lot of new ground here. The market for PB+ scale object storage customers is about 20K worldwide according to Scality's Mr. Jerome Lecat. Scality recently received a $10M investment from HP, which could be seen as a prelude to its acquisition by HP in order to counter IBM's purchase of Cleversafe for $1.3B last year.
The company that Mr. Signoretti did not mention, although he is familiar with it, is Cloudian, which does address the "scale down" as well as the "scale out" aspect of the object storage market with their HyperStore software and appliances. Cloudian's crown jewel is its full compliance with the AWS S3 API, which means that any third party solution or appliance that works with S3 will work with Cloudian. Cloudian can also tier data to AWS S3, Glacier or another S3 compliant object store. Cloudian plans to release a "Panzura-like" global file management virtual appliance later this month that is integrated with HyperStore.
Object storage vendors are aware that being "cheap and deep" is not a formula for commercial success. Customers need solutions that run the gamut from legacy file access methods, to RESTful API access, which generally means S3 compatibility, to tighter integration with big data analytics and search. There are also differences among the object storage software vendors in terms of their architecture, management and deployment. Customers should be willing to undertake a POC and perform due diligence before selecting a vendor best suited to meeting their requirements.
Well, prior to acquiring Cleversafe, IBM did not have an OBS story to tell much less sell to customers. GPFS is not an object store, and IBM's messing around with Swift was not getting them anywhere. So IBM pulled out its checkbook and wrote a big one for Cleversafe. 10x funding is not an unreasonable payday for the people and investors at Cleversafe. You may recall that back in the day IBM paid $3B in cash for Lotus, which was one-third of their available cash. Another thing to consider about the Cleversafe acquisition is the 200+ patents obtained by Cleversafe, and the fact that the CIA is one of Cleversafe's customers and investors (through a CIA associated company). IBM has a long history as a trusted IT provider to federal agencies, so I'm sure the CIA is happy to see that IBM has their back when it comes to OBS.
Well, Mr. Ash from Cloudian is not bragging, he is just explaining the features already available in Cloudian's HyperStore software defined storage software. Every object-based storage software vendor supports at least the basic AWS S3 functions, but S3 is not their native API.
Cloudian made three "bets" with their HyperStore object-based storage software architecture. 1) Cassandra, which Cloudian extends for use as its metadata storage service, has numerous real-world use cases and is well tested and supported. Apple is reported to have 70K Cassandra servers. 2) Native support for AWS S3. AWS has the largest "ecosystem" of applications and solutions written to use S3. Cloudian's compliance with all of the S3 API functions means any S3 application or solution will work with Cloudian HyperStore. 3) Hybrid cloud storage would become a requirement for enterprise customers creating their own local storage clouds. Cloudian can tier data from HyperStore clusters to AWS S3 or Glacier, which by the way, actually have different sets of API functions.
Cloudian has been approached by other object-based storage software vendors interested in licensing Cloudian's native AWS S3 compliant service. Cloudian has chosen not to do it, because it is a key to their success in the capacity storage market.
In terms of actual customers...Cleversafe has about 150, Scality has between 50 and 100 and SwiftStack has just over 50. So for Cloudian to "bust a move" and achieve as many new customers as all three of these combined, it will need to on-board about 250 new customers. This is a significant challenge and it will be interesting to see if Cloudian can do it in conjunction with their reseller partner channel.
Well, if you listen to Mr. James Hughes whose team developed Kinetic, you will see that it eliminates storage servers and the need for a POSIX-compatible file system. The Kinetic key/value API plus Ethernet on the Kinetic HDDs is the innovation. The response from the object-based storage software vendors has been mixed. Mr. Joe Arnold from SwiftStack sees great promise in the Kinetic framework. Caringo has looked at it and said...meh, SWARM is better. Cloudian was of the opinion that it would need to implement a "split-brain" design in their software stack to use it. Scality was interested in seeing how it might be made to work with RING. Cleversafe (IBM) joined the Linux Foundation's Kinetic Open Storage Project, but not sure what they are actually doing with it. Not sure what Amplidata (WD) thought about it, but WD also belongs to the Kinetic Open Storage Project. HGST (WD) was also experimenting with their own Ethernet HDD that ran Debian, but this is not what Seagate is doing with Kinetic. Given that there have barely been two years of third-party development effort related to Kinetic, it seems premature to declare Kinetic a success or failure as an object-based storage architecture. Time will tell, even though Mr. Arnold is bullish on Kinetic.
Well, the lack of facts at hand makes it anyone's guess as to exactly what happened. That said, storage networks have a lot of moving parts and a failure in the networking part could easily disable access to the storage part. If the outage was due to a planned upgrade or maintenance, then there should have been a roll-back procedure in order to recover. While you cannot rule out human error, you expect that the people involved in operating and managing the storage network are adequately trained and experienced. The vendors involved along with the university will likely issue a "post mortem" when the facts surrounding the outage are understood. Then the guilty can be charged.
Well, the commercial entities "sponsoring" various open source software projects do get bought. Red Hat bought InkTank, which was the commercial sponsor established for Ceph, and Red Hat bought Qumranet, which was the commercial sponsor for KVM. Novell was once the commercial sponsor or owner of SUSE. Yes, it is all open source and you are free to download and use it, fork it, etc. That said, commercial sponsors make money by charging for "enterprise" features and support not provided in the "subscription" release of the project. The project community generally handles the support function through forums.
Well, a nice report from Mr. Nicolas on the dynamics of the object-based storage software vendor market. The incumbent IT vendors have pulled out their checkbooks to make acquisitions in order to fill the object-based storage "holes" in their storage product lines. The remaining venture-funded object-based storage software vendors will either get bought or go public, if they can. Interesting that Mr. Nicolas did not mention that Scality is aiming for an IPO in 2017. Amplidata and Cleversafe both use only erasure coding to protect data, and they have a litigious history with each filing suit against the other. Cleversafe has on the order of 350 patents, which could have made them more attractive to IBM in addition to having the CIA as a customer. Dell has a deal with Scality, which will likely end. Dell also had a three-year deal with Caringo that ended. HP has deals with both Cleversafe and Scality, although I suspect the deal with Cleversafe will be kicked to the side. Red Hat likes to collect open source software projects, so its acquisitions of Gluster and Ceph (InkTank) are not surprising. Gluster is not really object-based storage, and Ceph doesn't really excel as object-based storage software, but it can also be used for file and block storage. Swift does have scalability issues, although the 3.0 release of SwiftStack looks like it is making some progress on the feature and functionality side. Basho's Riak and Riak CS are geared more toward the developer market as opposed to the enterprise data center storage market. Riak CS is somewhat unusual in that it stores both the object data and the metadata in Riak. Caringo is one of the older object-based storage software vendors that seems to score a steady stream of customers, most recently BT. Cloudian has focused on delivering packaged software and appliances that can be deployed by service providers or enterprise customers in public or private storage clusters.
Cloudian also uses a native S3 API, which is fully compatible with the AWS S3 API and can tier data to either AWS S3 or Glacier. The object-based storage software vendor market is undergoing change, but it is hard to see any of them becoming "losers" in the market.
Cleversafe fills an obvious hole in IBM's storage portfolio. It had no OBS software, and GPFS is not an object store. IBM was also not getting anywhere fast with Swift. Cleversafe has 300+ patents on aspects of OBS. The CIA is suspected of being a Cleversafe customer. IBM has long experience in being a loyal provider of IT services and systems to the US federal government, so the CIA can REST easy.
Scality has OEM deals with Dell and HP. If Dell acquires EMC it will inherit the obsolete Centera OBS product, which is being phased out, and the aging Atmos OBS product. Scality may get pushed aside at Dell in all the churn. It is unclear how well Scality is doing with its HP deal one year later. Scality had to make the installation of RING 5 much easier in order to avoid having to do professional service engagements to install RING for HP's channel partners.
Mr. Lecat gets a few things wrong in his comments about Cloudian. Cloudian's native API is S3...no translations are needed from S3 to some other API, which makes Cloudian fast and efficient. The benefit of Cloudian using S3 as its native API is that any AWS S3 solution will work with Cloudian. So S3 your data center with Cloudian and enjoy private to public hybrid storage that supports tiering to AWS S3 or Glacier. Don't think Scality can do that.
Cloudian does not lock customers into using its branded Supermicro servers and QCT JBOD storage. Cloudian offers HyperStore Software for installation on storage servers from Dell, Lenovo, QCT and Supermicro. Cloudian also offers HyperStore Appliances with software pre-installed for quick setup and rapid deployment...something that Scality RING has never been known for.
As for Cloudian not being good at supporting legacy NFS and CIFS access methods to its OBS, expect to see an announcement regarding this real soon now.
Mr. Lecat is obviously hoping that Scality can keep its "top dog" billing in the OBS market until it can make its IPO in 2017. A lot can happen between now and then...shift happens.
Well, IBM may also have been interested in the 350 patents granted to Cleversafe. Seems like every other month Cleversafe was being awarded a patent. The United States CIA is reputed to be one of Cleversafe's big customers, and an "investment arm" of the CIA participated in one of Cleversafe's funding rounds. Cleversafe's funding rounds also have a very odd character to them in terms of the amounts and timing. Cleversafe and Amplidata share a litigious past with each company filing suits and counter suits against the other. Amplidata was purchased this year by the HGST division of Western Digital. Cleversafe, like Amplidata, only uses erasure coding to protect data. Most other object-based storage software vendors like Caringo, Cloudian, Scality and SwiftStack support both replication and erasure coding. The object-storage market has been slow and steady, which may account for the relatively small amount of M&A activity over the past several years.
Well, Cleversafe relies entirely on erasure coding to store objects and usually only addresses customers needing petabyte scale from the get-go. Scality does both replication and erasure coding, so it would appeal to customers who would not necessarily choose Cleversafe. The part that doesn't compute is that HP's Proliant server gear is top shelf and not like the "industry standard" storage servers that object storage providers usually promote to keep the capital costs low(er). It would be interesting to know the cost of a fully-loaded HP Proliant SL4540 G8 or SL4545 G7 and the annual operating and support cost for either Cleversafe dsNet or Scality RING.
Well, Nirvanix failed for 3 plausible reasons according to Simon Robinson from 451 Research. To paraphrase his analysis...1) The Nirvanix business model was too capital intensive for what the company was charging and the company eventually burned through its cash. 2) The Nirvanix "Cloud File System" software did not scale-out as well as they were expecting. This eventually became problematic because Nirvanix needed to scale-out to keep growing in order to run with the big dogs. 3) Nirvanix did not evolve beyond its initial storage service offering. This limited the extent to which customers could grow in their use of cloud infrastructure with Nirvanix.
The cautionary tale is customers need to develop an "exit strategy" or have a contingency plan ready when a Nirvanix-like implosion happens. Cloud service providers like Nirvanix are not without blame when they are "working without a safety net" in their business. Cloud service providers should have a capital reserve fund or insurance to wind down their business if it must be shuttered. The "customer-be-damned" attitude doesn't work in the cloud and it will invite government regulation if this type of behavior is repeated. I applaud the efforts being made by Aorta Cloud/Capital to keep Nirvanix operating so customers can have the opportunity to make decisions about what to do regarding their data stored at Nirvanix.
Well, Glacier-compatible cold storage is also being developed by SageCloud in Boston, MA. The SageCloud founders Jeff Flowers and David Friend are from Carbonite. They are basing their work on the Facebook Open Compute Project. SageCloud completed a $10M funding round this summer bringing their total funding to $13M. The company recently signed an agreement with Avnet's Rorke Global Solutions to assemble the SageCloud hardware. SageCloud will make first customer shipments in January 2014. I received an explanation of their cold storage technology under NDA. I think what they are doing for the cold storage/archival market will be well-received in terms of cost, energy-efficiency and performance.
What actually constituted a "data center" in the 1970s? Most of what passed for "data centers" back then were IBM mainframe (System 360) time share services. Interactive computing was being offered by DEC, but you generally bought or leased DEC mini-computers and kept them on your own premises. Everything having to do with the actual computation and storage of data was generally installed in "glass rooms" that were temperature and humidity controlled but I don't think they fit the modern definition of a data center. Data storage on rotating magnetic disks or removable "disk packs" was only available for small amounts of data because it was limited in capacity and very expensive. Lots of data was stored on magnetic tapes, which were mounted and read/written when the data was needed. Human beings had to mount and dismount tapes from the tape drives. It all seems quaint by today's standards and it was probably glitchy and unreliable at times too.
Well, I agree that Intel is a much more likely suitor for Amplidata than Quantum. I also agree that Amplidata's cash burn rate could be driving an acquisition strategy by Amplidata's management. As for Cleversafe suing Amplidata, I think there have been suits and counter suits from both of them over the past few years. Amplidata has worked with Intel and Quanta using their AmpliStor software as part of "Intel's Cloud Builders Guide to Cloud Design and Deployment on Intel Platforms". It would be interesting to see if Intel open sourced AmpliStor as part of their acquisition of Amplidata, as it would provide another open source object storage software in addition to Ceph and Riak CS.
Editorial comment to my previous post...CAStor makes it possible to avoid erasure codes for objects below a certain size...
It appears Amplidata has now "tweaked" their software to make it perform better in a predominantly small object storage environment where erasure codes are not a good "fit" for small objects. The ingest and retrieval of millions of small objects using erasure codes didn't work...well enough. Caringo's CAStor makes it possible not to avoid erasure codes for objects below a certain size and use replication instead. It is the combination of replication and erasure codes that makes CAStor a better "fit" for Massive Media.
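The idea of replicating objects below a certain size and erasure coding everything else can be sketched in a few lines of Python. This is only an illustration of the general technique, not CAStor's or AmpliStor's actual policy or API. The 1MB threshold, the three replicas and the 10+6 fragment layout are hypothetical numbers I picked for the example.

```python
# Hypothetical policy parameters, chosen only for illustration.
SMALL_OBJECT_THRESHOLD = 1_000_000  # bytes; objects below this are replicated
REPLICAS = 3                        # full copies kept of each small object
DATA_FRAGMENTS = 10                 # k data fragments per erasure-coded object
PARITY_FRAGMENTS = 6                # m parity fragments per erasure-coded object

def choose_protection(size_bytes, threshold=SMALL_OBJECT_THRESHOLD):
    """Pick replication for small objects and erasure coding for the rest."""
    return "replication" if size_bytes < threshold else "erasure_coding"

def storage_overhead(scheme):
    """Raw bytes stored per byte of user data for the given scheme."""
    if scheme == "replication":
        return float(REPLICAS)
    # Erasure coding stores k + m fragments for every k fragments of data.
    return (DATA_FRAGMENTS + PARITY_FRAGMENTS) / DATA_FRAGMENTS

for size in (4_096, 250_000, 5_000_000):
    scheme = choose_protection(size)
    print(size, scheme, storage_overhead(scheme))
```

The trade-off is that small objects cost more raw capacity per byte under replication, but they avoid being split into many tiny fragments, which is where erasure coding tends to struggle on ingest and retrieval.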
A worthy post, but all governments hate the Internet for the simple reason that information can flow freely over it. In states without a tradition of individual liberty, it is easier to control aspects of Internet use. The "Great Firewall" in China prevents obtaining results on searches that contain certain words and phrases. In Cuba there is practically no Internet availability for the average citizen, although recent developments indicate the Cuban government may be relenting a bit. In the countries of the "Arab Spring" uprisings, governments under siege by their citizens were able to cut-off Internet and cell phone service for a time.
In the United States, Internet service is widely available and commerce is heavily dependent on it, so it is not acceptable to use such crude methods as blocking searches and interrupting or suspending Internet service. The NSA, which is part of the U.S. military establishment, has resorted to widespread data collection (Big Data) and analysis of both foreign and domestic Internet traffic. The data collection is indiscriminate and universal. NSA has a one-million sq. ft. facility built in a mountain in Utah just for the purpose of storing and analyzing data scooped up off the Internet or out of the air.
The mere fact that such huge volumes of data are being collected and permanently stored is sufficient evidence that the foundations of a modern police state are being established. It remains to be seen whether Americans will be able to push back against the government-backed military establishment.
I suspect the article was written to generate more heat than light on the subject, but storage vendors do have an industry association in SNIA and SNIA is driving the CDMI standard into the storage market at a pretty good pace when it comes to standards. That said, AWS S3 is the de facto standard API for object storage and CDMI will be able to work in conjunction with it. I spent the better part of last year evaluating a handful of vendors who provide object storage software to enterprise customers and partners to build private and public storage clusters. Each of the vendors is venture backed and their founders all have significant history in dealing with data storage requirements that were not solvable with the traditional file and block storage technology that we've had around for decades. The incumbent storage vendors have taken out their checkbooks and bought the storage technology they think will allow them to participate at-scale in the market. Whether they can be price competitive remains to be seen given their past history of bundling their storage software with their proprietary hardware. Because the object storage market is relatively new, you can expect to see some participants get acquired and some incumbents change course. The recent Dell announcement about ending their OEM deal with Caringo has more to do with the internal players at Dell than it does with the quality of CAStor. All of this is par for the course and it may be quite a few more years before the object storage market and players coalesce. Remember that there were once over 200 vendors engaged in the manufacture of hard disk drives. After 30+ years we have 4 of them left. The same thing is going to happen in the SSD market too. There is a market for object storage and it is being met by offerings from relatively new companies as well as the incumbents. It is not a matter of a technology in search of a problem.
The Dell partnership with Caringo was a good move back in the day. The problem with the DX storage server line was you could not put enough disk drives in them if you were actually building out an object storage cluster. Right now you can put 72, 3TB 3.5-inch SATA disk drives in a 4u SGI MIS. That's 216TB per server and 2PB per 40u cabinet with room for a top of the rack 10GbE switch. SGI currently has a partnership with Scality, which is probably somewhat similar to what Caringo had with Dell. BackBlaze has opened up their design for a 4u storage server that holds 45, 4TB 3.5-inch disk drives and Supermicro has a 4u storage server that will hold 36, 3TB 3.5-inch disk drives. So you can take your pick of 4u storage servers that will hold 216TB, 180TB or 108TB each. If Dell wants to be a player in this market offering "industry standard" x86 hardware, then this is the ballpark for object storage servers today. When HAMR disk drives arrive starting in another year or so, it will be a whole new ballgame as capacities will start at 6TB per 3.5-inch drive and probably peak out at 20TB to 30TB per drive over the next 10 years.
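For anyone who wants to check the capacity arithmetic, here is a quick Python sketch using the drive counts and drive sizes quoted above. It assumes decimal units (1PB = 1000TB), raw capacity before any replication or erasure coding overhead, and ten 4u servers per 40u cabinet. The dictionary labels are just shorthand for this example.

```python
# (drive count, TB per drive) for each 4u chassis mentioned in the post.
servers = {
    "SGI MIS": (72, 3),
    "Backblaze design": (45, 4),
    "Supermicro": (36, 3),
}

def raw_tb(drive_count, tb_per_drive):
    """Raw capacity of one chassis in TB, before any data protection overhead."""
    return drive_count * tb_per_drive

capacities = {name: raw_tb(*spec) for name, spec in servers.items()}

# Assumption: ten 4u servers fill 40u of rack space.
SERVERS_PER_CABINET = 40 // 4
cabinet_pb = capacities["SGI MIS"] * SERVERS_PER_CABINET / 1000

for name, tb in capacities.items():
    print(f"{name}: {tb}TB raw per 4u server")
print(f"SGI MIS cabinet: {cabinet_pb:.2f}PB raw")
```

Usable capacity will be lower once replication or erasure coding overhead is taken out, so treat these as upper bounds.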