The silver lining is that those who were laid off might be better able to apply their talents elsewhere, and might ultimately be paid more for it. There was a time when Veritas did innovative things. That time is long gone. For the last $forever they've seemed intent on *strangling* innovation. When they've wrung the last dollar out of products that were innovative twenty years ago, they'll just fold the whole thing up and they won't be missed.
So Facebook takes this suggestion and hires a bunch of editors, who at some point inevitably turn out to all be Trumpians who brand anything they don't like as fake news. Or Bernians who do likewise from a different direction. Then the same people continue to lambaste them for doing The Wrong Thing because it was *never* about doing what's right or protecting democracy or anything like that. A lot of it is competitors in the information business doing what competitors do, along with a big dose of Tall Poppy Syndrome.
Dealing with fake news doesn't mean giving any one group editorial control - not Facebook itself, not El Reg, sure as hell not any government. It means allowing multiple rating or collaborative-filtering services to flourish, and giving users a *choice* of which ones to trust, much as we do for spam filters and ad blockers. It's a market, not a planned information economy. Facebook's role is to help users find anti-fake-news filters *they* trust, and to honor the results as it displays each individual user's feed.
Re: I've never understood
There are basically two reasons. One is that competition is a strong motivator. For a lot of people, including me, leaderboards are an incentive to go out more, or to push faster/further than we might have otherwise. Another is helping to cheer each other on. I have three friends on MapMyRun, and I know that the encouragement I get from them is helpful when I'm not doing so well. I certainly hope it works the other way too.
That said, there are good and bad ways to share this data. For example, on MMR those three friends are the only ones who get to see exactly where I've gone, or whether I've gone at all on runs that don't earn me a place on a leaderboard. All anyone else sees is first name, last initial, time on that segment, and date. I *could* open up full sharing, but it's not a default. No heatmaps or anything like that, though I've kind of wished for that as a way to help people find routes worth trying. Overall, I'm pretty comfortable with MMR's approach. If I used Strava, I think I'd be a bit less comfortable.
As a Gluster developer, I don't mind the omission this time (though other times I've found it unjustifiable and irksome). This article seems to be about new or at least significantly changed companies. Neither Gluster nor Ceph fits that template. There's no new funding, new products, new partnerships, or C-suite drama to generate headlines. We're kind of boring TBH.
Google's all about network neutrality when it comes to routing, but I guess not when it comes to names. "You must share but I must own" is a pretty crappy attitude.
Mage, you're either woefully misinformed yourself, or deliberately misinforming others. Google has brought in or copied everything? Besides the obvious exceptions of the PageRank concept and map/reduce, there are plenty more.
Facebook hasn't innovated technically? More exceptions.
Never heard of Spanner or Borg/Kubernetes or Cassandra or HHVM or Open Compute? That's just ignorance. Look, I'm sensitive to the issues of privacy and market dominance and so on, but the specific claim here is that these two companies are bad for *innovation* and that's clearly false. Name a company you like better. Let's see if they contribute as much to innovation. Highly unlikely.
There are plenty of reasons to criticize Google and Facebook, but lack of innovation is not one of them. A large part of the reason is this thing we call open source, to which both contribute a great deal. The author would do well to read about open source, specifically how it prevents the kind of enclosure and stagnation he's worried about. The fact that one site can copy another's superficial features is a *good* thing, because the alternative is exactly the kind of intellectual-property regime that leads to the worst kinds of monopoly. Would the world be better if Amazon (which should be the true target of his screed) had prevailed on the one-click patent, or Apple on all the "look and feel" stuff? Hardly.
It's refreshing to see a little skepticism about such vague claims by storage upstarts. Without an actual apples-to-apples comparison - not their NVMe gear vs. someone else's spinning disks as usually happens - it's impossible to say whether they have anything or not. Maybe they really have come up with some great innovation. If so, I'd wonder why they're not bragging about the patents they've already filed. Absent that, this looks like a pitch for investors/acquirers rather than customers. If they're still standing and still independent a year from now, maybe we can have a discussion about the technology then.
Isn't this true of other groups as well? We all fight over the issues with which we're most intimately and constantly familiar. Emacs vs. vi is our version of angels dancing on the head of a pin. Another metaphor is the infamous bikeshed. Nobody wants to argue about the design of a nuclear power plant, because that's complicated and hard and requires a lot of knowledge, but everyone has an opinion on what color the bikeshed at that plant should be. Perhaps the general tendencies of programmers - highly focused, introverted, a bit brittle - make this somewhat worse than elsewhere, but mostly it just seems like human nature.
Third Mover Advantage
Coho ran into an all-too-common problem for storage startups: storage customers are hard to evangelize. They have pressing, immediate problems. They want solutions to those problems, and ideally solutions that fit into their current paradigm. Getting them to look forward to the *next* set of problems is really really hard. Coho had looked ahead, seen a problem on the horizon, developed some interesting technology to address it ... and then found themselves too far ahead of the customers to get any revenue out of all that. As with other technologies (*ahem* CDP), it won't be the originators but some late-arriving copycats, hitting the right market window, who benefit. With luck, some of the people who actually had the vision and made the efforts will get to ride the gravy train a few years from now, but in my experience that happens all too rarely.
Is it just me, or does the idea of running an internet-accessible memcached server already seem insane?
Own your availability, own your security
I've seen way too many cases where the preinstalled firewall crap at a cloud provider interfered with the operation of the distributed systems I was installing. Often the tools and documentation available to resolve the issue were miserable too. I did not appreciate it. I'm perfectly capable of locking down my own system, without making it unusable, all by myself. IMO it's perfectly reasonable for a provider to avoid the complexity and cost and aggravation associated with trying to do what any competent Linux administrator can and should do themselves.
At a previous job we had a similar - but not identical - problem with a machine in Boulder. In our case there was one more step. Because of the thinner air, we got less cooling. The warmer temps made the PSU less efficient, causing brownouts which manifested as transient errors on our internal communications links. The fix turned out to be a slight adjustment to the ratio between temperature and fan speed. I was the guy on-site, but kudos to the hardware folks back on the east coast for figuring it out.
Are they also asking for an investigation into White House staffers using Confide? Of course not, because this isn't about infosec or policy. It's purely a matter of attacking the other team and defending your own.
So all of those accusations against Hillary, or the claim that there were millions of illegal immigrants voting, should also be ignored until proven, right? Ditto with your accusation of lying. But you're missing one important thing: some information is dangerous to disclose. The evidence has been given to those whose need to know exceeded the risk of that disclosure, which does not include you. It takes a tremendous ego for someone to believe they are the sole arbiter of truth, and that they personally must be convinced of a statement's truth before others are allowed to consider it. Nobody's being thrown in jail based on rumor. It's OK for people to claim and believe what a preponderance of evidence - both public and vetted but not disclosed by our elected representatives - suggests.
As far as I can tell, this is just EC2 with features removed to enable a simpler pricing model. The fact that many of these features become available again through VPC peering suggests that it's a separate (someone else's?) data center. But the price isn't really going to destroy Digital Ocean etc. Looking at the 2GB level, which is the lowest they all have in common and is what really constitutes a starter system:
* Digital Ocean - $20/month for two cores and 40GB SSD
* Linode - $20/month for one core and 24GB SSD
* Vultr - $20/month for two cores and 45GB SSD
* Lightsail - $20/month for one core and 40GB SSD
Lightsail is below median for cores, at median for storage, all for exactly the same price. Without benchmarks - especially storage benchmarks, which IME have shown a 2-3x difference between providers or even instances within one provider - it's hard to know which is really the better deal. The real take-away here seems to be that Amazon was feeling pressure at the low end.
The overcommit at issue on a storage server is probably not VM overcommit (or oversubscription) but process-memory overcommit. If you allow memory overcommit, what you're saying is that the system can allocate more virtual pages to processes than it can actually back up with physical memory plus swap. It's kind of like fractional-reserve banking, and we've all seen what happens when that goes too far. Everything works great until there's a "run on the bank" and every process actually tries to touch the pages allocated to it. Since it's not actually possible to satisfy all of those requests, the kernel picks a victim, kills it, and reaps its pages to pay other debts. It's just as evil as it sounds. It works to a degree and/or in some cases, but IMO it's an irresponsible default made worse by the fact that the Linux implementation has always tended to make the absolute worst choices of which process to scavenge.
In a virtual environment, things get even more interesting. You can allow memory overcommit either within VMs or on the host, or both, and that's all orthogonal to how you size your VMs. Where most people get in trouble is that they oversubscribe/overcommit at multiple levels. Each ratio might seem fine in isolation, but the sum adds up to disaster. The OOM killer within a VM might take down a process, the OOM killer within the host might take down a VM, you can get page storms either within a VM or on the host, etc. It's much safer to overcommit in only one or two places, and then only modestly, but those aren't the defaults.
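The reserve-now, back-later mechanics are easy to see from userspace. Here's an illustrative C sketch (a toy for demonstration, not anyone's production code); on a box with the default overcommit policy the big reservation succeeds regardless of how much memory is actually free, and physical pages only get committed when they're touched:

```c
/* Toy demonstration of Linux memory overcommit: malloc() hands out virtual
 * address space with no guarantee of physical backing. Pages are only
 * faulted in on first touch -- which is when the "run on the bank" happens
 * and the OOM killer starts picking victims. */
#include <stdlib.h>
#include <string.h>

/* Returns 0 if a large reservation succeeds. The malloc() itself commits no
 * physical memory; only the memset() at the end actually backs a page. */
int overcommit_demo(void)
{
    size_t len = (size_t)1 << 30;   /* ask for 1 GiB of address space */
    char *p = malloc(len);          /* usually succeeds even if RAM+swap is short */
    if (p == NULL)
        return -1;                  /* vm.overcommit_memory=2 can refuse up front */
    memset(p, 0, 4096);             /* touching a page is what really backs it */
    free(p);
    return 0;
}
```

With overcommit disabled (vm.overcommit_memory=2), that malloc() fails immediately instead, and the failure lands on the process doing the hogging rather than on whichever victim the OOM killer fancies.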
Way to play the false-dichotomy and appeal-to-authority cards, Nate. I've been a Linux user just as long as you claim, and a UNIX user for a decade before that. There are other options besides a crash or hang. I even mentioned one already: don't overcommit. If there's no swap (really paging space BTW but I don't expect you to know the difference since you don't even seem to realize that allowing overcommit increases the page/swap pressure you so abhor) then memory allocation fails. The "victim" is statistically likely to be the same process that's hogging memory, to a far greater degree of accuracy than any OOM-killer heuristic Linux has ever implemented. If you want to avoid paging, limit your applications' memory usage and don't run them where the sum exceeds memory by more than a tiny amount (to absorb some of the random fluctuations, not steady-state usage). If you fail to follow that rule, adding overcommit will just push the problem around but not solve it.
There are cases where overcommit makes sense. At my last job we had users who'd run various scientific applications that would allocate huge sparse arrays. Since these arrays were guaranteed to be very thinly populated, overcommit was safe and useful. However, for general-purpose workloads overcommit makes a lot less sense. For the semi-embedded use case of a storage server, which is most relevant to this discussion, it makes absolutely no sense at all. Unconstrained memory use is the bane of predictable performance. Turning performance jitter into something that's easier to recognize and address is actually pretty desirable in that environment, and that's what disabling overcommit will do.
I feel bad for everyone involved. For the customer, the reasons are obvious. For Maxta, this is all too reminiscent of experiences I had working at small companies, and especially in storage. One of the main culprits seems to have been bad controller firmware. Even companies that control the hardware sometimes have trouble with that one. When you ship software to run on hardware the customer controls, the situation becomes impossible. The second issue sounds like the good old Linux "OOM Killer" which was an incredibly stupid idea from the day it was conceived. At both of my last two startups, we ended up having to disable memory overcommit because of the havoc that would result when the OOM Killer started running around like a deranged madman shooting random processes in the head. To be sure, Maxta probably could have done a better job controlling/minimizing resource use, but I know that's a difficult beast to fight so I'll cut them some slack. Put both of these problems in a context of confused business relationships and expectations, and it's no surprise that a disaster ensued. The lesson I take away from this is that vendors need to keep the list of Things To Avoid complete and up to date, while customers need to be clear and open about what they're doing to make sure they don't fall afoul of that list. Amateurs and secret-keepers have no place in production storage deployments.
NT, or XP?
I think the proper analogy is XP, not NT. NT was a new architecture, separate from the legacy 3.x/95/98 codebase. XP was the reunification of these divergent streams. Similarly, Android represented a bit of an architectural departure with its unique JVM-based userspace. Andromeda will represent the reunification of that with the more traditional architecture of ChromeOS (so traditional that I'm running full Ubuntu in another window on this Chromebook right now).
Still trying to get a handle on what Andromeda will mean for us Chromebook users, BTW. *That* would be an interesting story to delve into.
"The Realm Platform works with Java, Objective-C, and Swift"
So of course the image shows PHP. Yeah, I know, it doesn't matter and only a geek-pedant would notice or comment on it. Still, perhaps not the best design choice.
Re: I never quite got containers...
Containers are pretty useful, but the idea that they should all be stateless has always been STUPID. Any non-trivial application has state that has to be stored somewhere. Making it "somebody else's problem" only creates a new problem of how to coordinate between the containers and whatever kind of persistent storage you're using. If one provisioning system with one view is responsible for both, subject to the constraint that the actual disks etc. physically exist in one place, then it actually does simplify quite a bit of code.
It's misleading to say Red Hat Gluster Storage will be available this summer, or to imply that it's just now competing with Portworx et al. RHGS has been available for years, since before some of those others issued their first press release - let alone wrote their first line of code. It's just the new version that's coming.
So you finally admit that there's such a thing as open-source storage, but only to shoot it down with more mentions of proprietary competitors. :sigh:
Re: Regulation is sensible the article is not
Why do you insist on comparing vaping only to cigarette smoking? That's pure cherry-picking. Nobody has disputed that the all-vaping world is better than the all-cigarette world, but neither is the world we actually live in. Vaping needs to be considered *on its own merits* and not just in comparison to something we all know is bad. Doing X and vaping carries some risks that doing X alone does not, for all X. Those risks, which are and are likely to remain better known/understood or controlled by vendors than by consumers, are a legitimate subject of legal/regulatory interest. If you think these particular regulations are too draconian, the constructive response would be to suggest alternatives. Trying to dismiss all possible regulation makes you seem like an ideologue, and trying to suggest that vaping is a net public-health positive makes you look delusional.
I'm not going to disagree with you, there. Centralized trust doesn't work any better than centralized anything else. The only thing I'll say is that the browser makers have made the whole thing even less secure than the design allows by shipping certs for all these shady companies - many of which are clearly just arms of equally shady governments in various forsaken parts of the world. A chain of trust can still be strong if the links are all strong. It's a problem that this becomes hard to guarantee as the chains get longer, but it's also a problem that the browser vendors *knowingly* include weak links in the bags they provide.
Thanks for clarifying that.
The one nugget of truth in the article is that the list of CAs built in to browsers etc. is ridiculous. I had occasion to look recently. I'll bet at least half of those organizations are corrupt or compromised enough that I wouldn't even trust them to hold my hat - let alone information I actually value. Anybody who wants a signing cert for MITM can surely get one. That really does cast doubt on whether HTTPS is really doing us all that much good, but it's important to understand exactly where the weak link in that chain is.
Looks like Dunning/Kruger to me
As with many things, the first level is easy but then things get much harder. Can I build a simple database? Sure I can. Can I build a fully SQL-compliant database with a sophisticated query planner and good benchmark numbers? Not without some help. Can I build an interpreter for a simple language? No problem. Can I build a 99.9% gcc-compatible compiler that spits out correct high-performing code for dozens of CPU architectures? Um, no. Similarly, building a very simple storage system is within reach for a lot of people and is a great learning exercise. Then you add replication/failover, try to make it perform decently, test against a realistic variety of hardware and failure conditions, make the whole thing maintainable by someone besides yourself . . . this is still a simple system, no laundry list of features to match (let alone differentiate from) competitors, but it's a lot harder than a "one time slowly along the happy path" hobby project.
I'm not saying that the storage vendors deserve every dollar they charge. I'm pretty involved with changing those economics, because the EMCs and the NetApps of the world have been gouging too much for too long. What I'm saying is that "build it yourself" is a bit of an illusion except at the very smallest of scales and most modest of expectations. "Build it with others" is a better answer. Everyone contributes, everyone gets to benefit. If you really want to help speed those dinosaurs toward their extinction, there are any number of open-source projects that are already engaged in doing just that and could benefit from your help.
Am I crazy for thinking that "Motion Picture Ass" had nothing to do with saving headline space?
Re: Nothing new @pPPPP re MP3 players.
Once you have files, objects are easy. The difficulty lies in going the other way.
Not so fast ;)
Enrico, the problem with the idea of high-performance object storage is that the S3-style APIs are not well suited to it. Whole-object GET and PUT are insufficient. Most have added reading from the middle of an object; writing likewise has been claimed/promised for a long time, but is still not something developers can count on being able to do. The stateless HTTP protocol is also inherently less efficient than what you get with file descriptors and a better pipelining model. Frankly, a lot of the object-store implementations aren't up for a performance game either. The most charitable way to put it is that the developers were prioritizing other features such as storage efficiency. I'll be a bit less charitable and say the whole reason most of them got into object stores was because they're easy, so they wrote their code with inefficient algorithms and languages/frameworks. That lets them get to market earlier, but the downside is darn-near-unfixable performance issues. The main exception is Ceph's RADOS, which has an API more like NASD/T10 than S3 and which was designed from day one to support upper-layer protocols that demand higher performance.
Throwing flash at an object store won't let it catch up with block or file storage that's also flash based. It might be higher performance than it is now, but it will still be slower than contemporaries. It's going to be really hard for anyone in that mire to get beyond the tertiary role.
"If I have more resources than required"
Might as well stop there. That never happens for long. Where there's capability to spare, new workloads will be added until that's no longer the case. It happens with CPU, it happens with memory, and it happens with storage. Always has and always will. The real question is how to maximize the value of the IOPS you're providing when you're providing as many as you can. That means letting higher-value IOPS (e.g. for higher-value apps or tenants) take priority over lower-value IOPS, and that's QoS.
Besides the fact that what they're doing is no different than what GlusterFS (which I work on) and Ceph have done for years, they start off with two lies.
(1) Their FAQ claims that GlusterFS uses a centralized server, which is not true.
(2) They claim to be open-source, but when you follow the link a big fat "coming soon" is all you'll get.
Outfits like this come along every damn month. And they disappear every month too, when they find out that gaining and retaining users is harder than getting a few mentions in the trade press. There's no reason so far to suspect this one will rise above that vile crowd.
The application containers themselves might be stateless, but they almost always need access to shared persistent data somewhere - web pages, customer records, calculation inputs and outputs. That can be a whole separate island of specialized hardware or bare-metal servers, but why not use the storage already within the container infrastructure? That gives your storage servers the same benefits as your application containers, and allows seamless sharing/balancing of resources between them.
BTW, Gluster (on which I work) has been able to do this since approximately forever, and we have many enterprise customers using this approach. Some of them have even presented publicly about their experience. Nice to see Portworx following our lead.
Storage has always been a hard place to make a living
Especially for startups. It's one of the first places that enterprises look to cut costs, and one of the last places they're willing to experiment. And it has become a crowded space. The folks at Coho are great, but I could say the same about a dozen other startups of the same vintage. They can't all succeed. In a way, this is a side effect of lowering the barrier to entry. Now that scale-out software on top of commodity hardware (even if it has a fancy faceplate) is more competitive with specialized hardware, it seems like everybody and their brother has a storage startup with a new take on where the "real" storage problems are and how to solve them. Some of those ideas are truly new, and truly great. Some aren't. The problem is that it's hard to tell which is which, so when the lifeboat's too crowded and companies start getting thrown overboard it's not always the ones who should have been. Sadly, technical merit and business value don't usually count as much as cozy relationships with investors, analysts, journalists, and (just once in a while) "whale" customers.
Data ONFIRE? Heh. Good one.
Why do these articles only ever seem to compare against *proprietary* solutions? Another basis of comparison for semi-open-source RozoFS would be truly-open-source Gluster (on which I work) or truly-open-source Ceph, both of which already have erasure coding too. Based on experience with that, I'd say *it doesn't matter* which erasure-coding algorithm involves more addition or multiplication because those calculations are only a minor factor in overall performance. The amount of data that must be transferred, either during normal I/O or during repair, matters far more. The coordination overhead matters even more than that. If you have two clients trying to write overlapping blocks, and they don't coordinate properly, then half of the servers get erasure-coded pieces of one write and half get erasure-coded pieces of the other. This isn't even "last writer wins"; anyone who tries to read that data subsequently gets *garbage* back. The #1 determinant of performance in such systems is how they avoid this issue for every kind of operation (including both data and metadata with all of the atomicity/durability guarantees that must be met to keep users from screaming).
If the Rozo folks want to brag about their erasure-coding efficiency, let's see some actual performance data. While we're at it, let's talk about the scale at which things have really been tested. Anybody can claim hundreds of nodes and multiple exabytes but AFAIK no project in this space has ever successfully run at that scale on the first try. They *always* run into new failure modes and performance anomalies that never appeared at smaller scale and that often require substantial new subsystems to address. Then they find out that customers at this scale are going to want tons of other features as well. Some of these are still only on Rozo's roadmap, after having been shipped years ago by competitors. Others, especially related to multi-tenancy, are still missing entirely.
I think what Rozo is doing is very cool, and I wish them all the success in the world, but let's not lose sight of the fact that there's a *long* row to hoe before even the best ideas turn into a competitive storage solution. They sound a lot like the Ceph folks did *five years ago*, but Ceph (with far more resources at hand) is just now making the transition from bleeding-edge to enterprise-ready. It's not because they lack talent, I can assure you of that. It's just that these problems are *hard*, and solving them takes a lot longer than Evenou and Courtoy seem to think. I'd love to hear from the RozoFS developers about when *they* think RozoFS will be competitive with what's already out there.
Let's not overgeneralize. *This time* he didn't name names. On the other hand, he still made some pretty strong insinuations about "whoever" wrote the code, and "whoever" isn't hard to discover. That's well beyond just criticizing the code.
On the other other hand, I've been on too many projects that *didn't* lay down the law this firmly. Developers are a sneaky lot, and they tend to have their own agendas. They'll keep sneaking in code that they know is crap, if it lets them mark more of their personal tasks complete. If nobody is watching, or nobody responds strongly enough to put the fear of God into them, the result is a codebase that slowly rots into irrelevance. I do think Linus and (even more so) certain other Linux kernel developers behave in some pretty toxic ways sometimes, but as we try to improve that situation we still need to remember that bringing the hammer down once in a while is strictly necessary to maintain any kind of quality. It's all in how it's done, not whether it should be done at all.
A welcome development
This particular case involves siblings (as of last week), but I suspect we'll be seeing a lot more of this kind of thing even among non-siblings - yes, even among current rivals - in the next few years. Among other things, it means folks like Isilon will be forced to compete on the basis of software quality instead of relying on custom-tuned hardware to give them an edge in performance comparisons. Bring it.
(Disclaimer: I'm a Gluster developer)
"Amazon S3 is designed for 99.999999999% durability" (i.e. every put has 11 9s durability)
That's really about availability. It says nothing at all about when data is guaranteed to hit stable storage. You do know what "durability" means in data storage, don't you?
"few year old Beta level Ceph benchmarks are not a good measure,"
Ah yes, that's no true Scotsman all right. You asked for citations, I provided them, now you demand different ones. At least those actually compared Ceph to Gluster, on the same hardware. The document you cite only compares Ceph to itself. Why would you assume Gluster has been standing still, and wouldn't also perform better? That's convenient, I suppose, but hardly realistic. Making comparisons across disparate versions and disparate hardware tells us absolutely nothing.
"as the Gluster architect you are not clean from bias"
And I disclosed that association right at the beginning, because I believe in being honest with people. You're still moving the goalposts, citing "evidence" that's unrelated to the actual topic at hand, ducking the issue of how NFS overhead *plus* impedance-mismatch overhead can be less than NFS overhead alone. You haven't even begun to address the problems inherent in trying to provide true file system semantics on top of a system that has only GET and PUT, different metadata and permissions models, etc. This isn't personal, but misleading claims often lead to wasting a lot of people's time if they're not challenged. If you think object-store based file systems are such a great idea then you need to grapple with the issues and provide some facts instead of just slinging mud.
Re: Not so fast
"Object even S3 provides Atomicity & Durability as base attributes"
Simply untrue. You were talking about making the file store sync *on every write*. Object stores provide no guarantees on every write, because they don't even have a concept of every write. That's the flip side of any API based on PUT instead of OPEN+WRITE. At the very worst, an apples to apples comparison would require only an fsync *per file*, and even that would be requiring more of the file store than the object store. Can you actually cite the API description or SLA for any S3-like object store that makes *any claims at all* about immediate durability at the end of a PUT? Amazon's certainly don't, and that's the API that most others in this category implement.
"Would be happy if you can point me to a benchmark to back your thesis which can shows Gluster significantly knocks out Ceph"
"not fare to pick on a cloud archiving product like S3 to make perf claims."
Except that such "archiving products" are the subject of the article we're discussing. What's unfair is comparing a file system to an object store alone, on a clearly object-favoring workload, when the subject is file systems *layered on top of* object stores. All of those protocol-level pathologies you mention for NFS will still exist for an NFS server layered on top of an object store, *plus* all of the inefficiencies resulting from the impedance mismatch between the two APIs. If the client does an OPEN + STAT + many small WRITEs, the server has to do an OPEN + STAT + many small WRITEs. The question is not how a file system implemented on top of an object store performs when it has freedom to collapse those, because it doesn't. The question is how it performs when it executes each of those individual operations according to applicable standards and user expectations, which set definite requirements for things like durability.
The only "religion" here is faith in the assumptions that support your startup's business model. It's not my fault if those assumptions run contrary to fact. I'm just pointing out that they do.
Re: Not so fast
"if you disable the client cache or sync() on every IO to be on par with object atomicity/durability (required for micro-services)"
S3-style object stores make *no* guarantee about consistency or durability. There's a word for the kind of tuning you speak of, hamstringing one side to meet a requirement from which the other is exempt. It's called cheating. It's a way of *massively* skewing the results to favor one side, and it's why methodological disclosure is so important. Please compare apples to apples, then get back to us.
Re: Not so fast
The issue of implementing file semantics on top of weak (S3-style) object semantics is not just an implementation choice. It introduces an architectural need for an extra level of coordination, which any implementation will have to address. There are richer object APIs that offer better performance (Ceph's RADOS is one), but that's very much not what most object-store advocates (like Enrico and now Trevor) are peddling.
As for Ceph being faster than Gluster, I'll take that with a *big* grain of salt. I've seen many such comparisons, and even made a few myself. Anyone can cherry-pick a configuration or workload that favors one over the other. That's why disclosing such things is important. I have literally *never* seen such a comparison that made such disclosures and didn't contain blatant methodological flaws, and which favored Ceph. Not even from the Ceph folks themselves. Maybe if someone who worked for one of the RDMA-hardware companies (and who should have disclosed that fact before making claims) had done special tuning, and was comparing RADOS to Gluster+loopback, they could come up with such a result, but it wouldn't mean anything. Without details, I'm inclined to call BS on that one.
Lastly, yes, one get can be more efficient than lookup, open (note the order), read, etc. That's great for file-at-a-time access patterns. A few people care about those. On the other hand, that difference pales in comparison to the difference between writing a single byte in the middle of a multi-gigabyte file vs. having to do a get/modify/put on the whole thing. Chunk up the files into multiple objects and you're back at multiple requests for the whole-file case, plus a metadata-maintenance problem that starts to look like the one file systems already solve.
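A back-of-the-envelope sketch of that amplification (hypothetical numbers, assuming a whole-object get/modify/put is the only way to change a single byte):

```python
def object_rmw_bytes(object_size, write_size):
    """Bytes moved for a read-modify-write on a whole-object store:
    GET the full object, modify, PUT the full object back.
    write_size is irrelevant -- the whole object moves regardless."""
    return object_size + object_size

def file_inplace_bytes(write_size):
    """Bytes moved for an in-place file write: just the write itself
    (block-level overhead ignored here)."""
    return write_size

GB = 1 << 30
amplification = object_rmw_bytes(4 * GB, 1) / file_inplace_bytes(1)
print(f"~{amplification:.0e}x amplification for a 1-byte write to a 4 GB object")
```

Billions-to-one amplification for the naive case, and chunking only trades that for the metadata problem described above.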
Layering files on top of semantically-poor objects always leads to problems. Solving those problems destroys any potential performance or scalability advantages you might have started with. That's why most such systems have gateway SPOFs and bottlenecks. In fact they look a lot like distributed file systems fifteen years ago, before we figured out how to solve exactly those problems in a reasonably elegant and efficient way. Those who do not know the lessons of history, etc.
Not so fast
The elephant in the room is performance. Object storage pushers try very hard to avoid even measuring it (as shown by the near-total lack of benchmarks). If you layer a file system on top, it gets even worse. Part of that's due to the overhead of pushing your bits through an HTTP-based protocol, losing and having to recreate half of your state at every request. Even more comes from the extra work you have to do to implement stronger file system durability/consistency semantics on top of weaker object store semantics. Of course, you can always cheat by not actually meeting all standards or expectations applicable to file systems, and most object store pushers do, but it still puts them in a poor position relative to systems that implement those semantics and protocols natively. Object stores aren't going to displace NAS until they can at least get into the same ballpark on performance, and I'm not sure that will *ever* happen.
Disclaimer: I'm a Gluster developer. We took the saner approach of implementing files natively and objects on top of that.
Re: a couple easy predictions
"Cisco will make an acquisition to get into data storage."
I'd be very surprised if they only made one. They'll probably mess up a couple before they get one to drive any real revenue.
Interesting piece, Chris. If you don't mind, I'll try to add on a bit based on my perspective as a developer in this area.
Traditional big-box on-premise storage vendors also face another pair of closely related threats: open source and roll-your-own. The relationship between something like Isilon and something like Gluster (which I work on) is obvious, so I won't dwell on it. The relationship between something like Isilon and something like AWS is also obvious: more AWS usage means fewer Isilon sales. The relationship between Gluster and AWS, or any of several similar things on either side, is more nuanced. Sometimes people abandon their own open-source scale-out storage in favor of AWS services. Sometimes they deploy that same software within EC2. It's both a threat *and* an opportunity.
That brings us to roll-your-own. If you were to look under the covers at Amazon's storage offerings, I'm sure they'd look an awful lot like what's out there in open source. Ditto for Google. Ditto for Facebook. And Twitter, and LinkedIn, and so on. The fact is that the techniques for doing a lot of this are now pretty well known. Many of those techniques were developed and refined at the aforementioned companies, each of which has rolled their own not once but several times to address various needs and tradeoffs. I've seen a public presentation from GoDaddy - not generally regarded as a company in the vanguard of storage research - about their own home-grown object store. I know of many more that I can't talk about. Perhaps the biggest threat to both traditional storage vendors and someone like me (or my employer) is not any one new product or project but the general idea that scale-out storage software can be assembled rather than developed. That doesn't mean there'll be no place for people who know this stuff and can assemble those parts into a smoothly functioning stack, but we'll be providing less of a product and more of a service. As in so many other areas, increasing levels of automation might put customization in the hands of more than the elite.
"Good morning, madam. What kind of storage system would you like me to build for you today?"
"We're committed to working with 'independent' third parties who will accept (explicit or covert) remuneration to run whichever benchmarks we want however we want them to ensure that our products prevail in 'objective' tests."
I've been in the storage game a while. I have (to my shame) worked at companies where I got to see just how 'independent' most test labs and analysts are. Good will and integrity didn't pay for those Porsches I saw in the parking lot, folks. This is just a new player, not a new game. I can't help but wonder whether some of the anger is because this new player is overdoing it so much that they've brought unwelcome attention to everyone else hiding under that same rock.
Yes, marketing terminology is dumb ... and that's all "server SAN" is. "Lash together storage from multiple servers and present it to the cluster" is a concept that has existed for quite a while. Why claim victory for "server SAN" instead of the broader category, except as a marketing move?
Server SANs aren't going to be the answer unless/until they deal with the issue of server-resident storage being lost on server failure. As soon as you start replicating to avoid that, you're in the same territory as the existing scale-out and "hyper-converged" vendors which can implement the exact same data flow to/from the exact same devices. (Disclaimer: I work on GlusterFS, which is in this category.) If all you need is the speed of local storage without availability, you don't need a server SAN; you just need plain old local storage managed however you see fit. If all you need is availability without the speed, you're back to traditional SAN or NAS. The whole point is that sometimes people need both, and server SANs are hardly alone at that intersection. In fact they're the new arrivals struggling to piece together a real story.
TBH, I think "server SAN" is just a marketing term for something that was already possible (and often done) technically. Maybe that marketing allows the virt team to take ownership instead of working with people who actually understand storage, but I guarantee that will end in tears when data gets lost. Server SANs are the "peace in our time" of the storage-infrastructure wars.
Re: unlawful testing
Destructive testing can be forbidden by a lease/loan contract, as can merely opening the case. Reverse engineering can be forbidden by a purchase contract as well, as can resale to a specific third party (or at all). If a front company acquired a unit, they might well have violated their contract with Pure by allowing EMC to put that unit through the wringer. I'm not saying that's what happened, but "unlawful testing" isn't as absurd as it sounds.