Yes, marketing terminology is dumb ... and that's all "server SAN" is. "Lash together storage from multiple servers and present it to the cluster" is a concept that has existed for quite a while. Why claim victory for "server SAN" instead of the broader category, except as a marketing move?
15 posts • joined 9 Oct 2009
Server SANs aren't going to be the answer unless/until they deal with the issue of server-resident storage being lost on server failure. As soon as you start replicating to avoid that, you're in the same territory as the existing scale-out and "hyper-converged" vendors which can implement the exact same data flow to/from the exact same devices. (Disclaimer: I work on GlusterFS, which is in this category.) If all you need is the speed of local storage without availability, you don't need a server SAN; you just need plain old local storage managed however you see fit. If all you need is availability without the speed, you're back to traditional SAN or NAS. The whole point is that sometimes people need both, and server SANs are hardly alone at that intersection. In fact they're the new arrivals struggling to piece together a real story.
TBH, I think "server SAN" is just a marketing term for something that was already possible (and often done) technically. Maybe that marketing allows the virt team to take ownership instead of working with people who actually understand storage, but I guarantee that will end in tears when data gets lost. Server SANs are the "peace in our time" of the storage-infrastructure wars.
Re: unlawful testing
Destructive testing can be forbidden by a lease/loan contract, as can merely opening the case. Reverse engineering can be forbidden by a purchase contract as well, as can resale to a specific third party (or at all). If a front company acquired a unit, they might well have violated their contract with Pure by allowing EMC to put that unit through the wringer. I'm not saying that's what happened, but "unlawful testing" isn't as absurd as it sounds.
If you think deduplication is a no-brainer, you've just never tried to implement it. Like Fat Data itself, it's a tool that can be used well (reducing storage cost) or poorly (killing system performance), and users deserve to be educated about the difference.
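To make the tradeoff concrete, here is a minimal sketch of block-level deduplication, assuming fixed 4 KiB blocks and SHA-256 as the fingerprint. The function names and the in-memory index are hypothetical, chosen only for illustration; a real implementation also has to deal with index size, persistence and garbage collection, which is where most of the difficulty lives.

```python
import hashlib


def dedup_store(blocks):
    """Store each distinct block once, keyed by its SHA-256 digest."""
    store = {}   # digest -> block contents (each unique block kept once)
    recipe = []  # ordered digests needed to rebuild the original stream
    for block in blocks:
        # Every write pays for a hash computation plus an index lookup.
        # That per-write overhead is the performance side of the tradeoff.
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original data from the recipe of digests."""
    return b"".join(store[d] for d in recipe)


# Four logical blocks, three of them identical.
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]
store, recipe = dedup_store(blocks)
logical_bytes = sum(len(b) for b in blocks)
physical_bytes = sum(len(b) for b in store.values())
```

In this toy case the physical footprint is half the logical one, but only because the index fits comfortably in memory. Once it doesn't, each lookup can turn into extra I/O on the write path.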
Re: Yes, package up other people's efforts made for free
I'm not going to get into the general philosophy or practice of open-source business models, but I will point out that "other people's efforts" doesn't really apply here. All of the projects that have any chance of meeting Chris's description are funded by someone. Red Hat is spending millions on GlusterFS development. Ceph has Inktank, Lustre has Intel (plus DDN and Xyratex), OrangeFS has Omnibond, etc. Nexenta might be the exception, as most of what they're selling is code developed at Sun, but at least it seems like Oracle isn't pursuing that market themselves. No authors are getting ripped off here.
Do you know who most of those projects *are* taking advantage of? The US taxpayer, whose money has been used to provide development resources and/or publicity for all of Ceph/Lustre/OrangeFS. Without that unwitting and unwilling support, none of those projects would be where they are now. I for one am glad that they are, even though I compete with them, and I consider it a good use of government research dollars, but someone with more of a small-government attitude than I have might find cause for complaint there. The transition from public sector to private is tricky, and one could well argue that too much government money has gone into Lustre pockets particularly.
Re: Gluster is not bad but...
There are many better places to discuss that, John - ideally a bug report, but also the mailing list, IRC, etc. The point *here* is that, even if some people misuse it or even if it actually is technically deficient in some way, GlusterFS has proven useful enough to enough people that it belongs in this conversation. It's not as if other storage products don't have bugs and missing features too, or people who might say those preclude serious consideration. How does single-digit IOPS sound to you? Or corrupting data? I've hit both of those in other projects, without even trying, but I know those other projects can fix their bugs just as we can fix ours. The question is not which project *deserves* to become the Red Hat of open-source storage based on its current state (which we can discuss elsewhere), but which *is likely to* as it progresses over the next few years, and in that context it seems remiss not to mention Red Hat themselves.
I used to joke about this with AB Periasamy, founder of Gluster. He was using the line about Gluster (the company) becoming the "Red Hat of storage". I disagreed, saying that Red Hat should be the Red Hat of storage. Turns out we were both kind of right. ;)
But seriously, folks, it is kind of weird that you got through this article without even mentioning GlusterFS a.k.a. Red Hat Storage. Whatever you might think of our ability to "cause the major vendors a headache", that's clearly the intent and there are a lot of resources behind it. You even mention the company, but not the product. If I were only a tiny bit more cynical, I might think it was a deliberate snub posted for the sole purpose of giving a rival more exposure.
Disclaimer: in case it's not clear from the context, I'm a GlusterFS developer.
Don't replace the king, replace monarchy
Completely agree, Matt. Fragmentation might have its drawbacks, but diversity - the other side of the same coin - is absolutely essential during the disruptive phase. Just yesterday, I saw yet another post about how a particular technology area (in this case storage) lacked a dominant open-source technology. I've bemoaned the lack of any such alternative myself many times, but I disagree with the author about the desirability of having a *dominant* open-source alternative. I think there should be *many* open-source alternatives, none dominating the others. They should be sharing knowledge and pushing each other to improve, giving users a choice among complex tradeoffs, not declaring themselves the new "de facto standard" before the revolution has even begun in earnest. We don't need another Apache or gcc stagnating in their market/mindshare dominance until someone comes along to push them out of their comfort zones. Being open source is not sufficient to gain the benefits of meaningful competition. One must be open in more ways than that.
P.S. wowfood, I've just started switching my own sites from nginx to Hiawatha, also mostly because of security. While I don't have any specific tips to offer (except perhaps one about rewrite rules that I'll blog about soon) you might be pleased to know that it's going quite well so far.
What you say is not true, googoobaby. There is a stripe translator, not enabled by default but only a CLI command away, that will stripe across multiple bricks (which can be on multiple servers).
Cost Benefit Analysis
First, I agree with AC#3 who pointed out that a Symantec marketing person might be a less than reliable source of information or insight on this issue.
Second, I think it's just as unreasonable to assume that encryption is too expensive as it is to assume that it's free. People should weigh both the costs *and* the benefits of using more vs. less secure storage, and measure those against realistic requirements. Most people and businesses should "default to secure" with respect not only to encryption but also to authentication, allowable locations and mandated retention/destruction of data, etc. because the cost/likelihood of compromise is just too high. If data has to traverse someone else's network or sit on someone else's storage, and performance goals etc. can be met with encryption, then encryption should probably be used even if the system would be "more efficient" without it.
Third, I'm hardly a disinterested party myself here. I'm the project leader for CloudFS (http://cloudfs.org/cloudfs-overview/) which addresses exactly these kinds of issues - not only at-rest and in-flight encryption which are both optional, but also other aspects of multi-tenant isolation and management for "unstructured" (file system) data. Of course, I'm not alone. The "senior partner" when it comes to storage security/privacy has to be Tahoe-LAFS (http://tahoe-lafs.org/trac/tahoe-lafs) which provides extremely strong guarantees in those areas at the cost of modest sacrifices in performance and functionality. Other entries in this area range from corporate-appliance players such as Nasuni and Cleversafe down to personal-software players such as SpiderOak and AeroFS. Enabling different tradeoffs between security, performance and usability is an active area of research and commercial competition, and we should all be wary of "this is the one answer" FUD.
Disclaimer: I'm an "associate" at Red Hat, but not speaking for Red Hat, yadda yadda.
EMC has nothing in the scale-out NAS space? Look, I worked on MPFS at EMC and developed a more-than-healthy loathing for the Celerra group. The Celerra might not scale out as much as the Isilon stuff, but it's architecturally not that dissimilar and it scales out plenty far for most folks. To say that EMC has *nothing* is simply inaccurate.
That said, I hope this rumor is not true. I had the privilege of working with Isilon gear and Isilon people some at my last job. I came away impressed, and it would be a serious shame if Isilon fell into the hands of the Celerra thugs. The likely outcome is that they'd pick over the technology for the few nuggets of IP that will solve their current self-inflicted problems, claim that the problems never existed and that they invented the IP themselves, then throw the rest along with all of the people in the trash. It would be an ignominious fate for such fine folks as Isilon has.
The claim of $100K for the entire system is not credible. A petabyte using the very cheapest commodity drives would cost approximately $85K, and that's just the storage. Unless Intel literally gives them 500 Atom processors for free, plus they get great deals on everything from that storage to memory and power supplies, plus they sell the thing for zero profit, $100K isn't achievable. My guess is that the reporter got things wrong, quoting a price for one system and a capability for another. Either that, or they're just snake-oil salesmen. Personally I think Smooth Stone - also linked from TFA - has a much more credible story.
Disclosure: I used to work for SiCortex, which was in a similar space. As far as I know there's no relationship (positive or negative) to either of these other companies, but I figured I'd mention it anyway.
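For anyone who wants to check the $100K objection, here is a back-of-the-envelope sketch using only the figures quoted above: the claimed $100K system price, roughly $85K of drives per petabyte, and 500 Atom processors. The decimal conversion (1 PB = 1,000,000 GB) and all of the names are my own assumptions.

```python
# Figures taken from the comment above; nothing here is vendor data.
CLAIMED_SYSTEM_PRICE = 100_000  # dollars, as reported for the whole system
DRIVE_COST_PER_PB = 85_000      # dollars, cheapest commodity drives (approximate)
NODE_COUNT = 500                # Atom processors in the system
GB_PER_PB = 1_000_000           # assumption: decimal units


def leftover_per_node(system_price, drive_cost, nodes):
    """Dollars left per node for everything that is not a drive."""
    return (system_price - drive_cost) / nodes


# Implied raw drive cost per gigabyte.
cost_per_gb = DRIVE_COST_PER_PB / GB_PER_PB

# Budget per node for CPU, memory, board, power, chassis and margin.
per_node = leftover_per_node(CLAIMED_SYSTEM_PRICE, DRIVE_COST_PER_PB, NODE_COUNT)

print(f"drive cost per GB: ${cost_per_gb:.3f}")
print(f"left over per node: ${per_node:.2f}")
```

Under those assumptions, whatever is left after the drives has to cover the processor, memory, board, power and any profit for each of the 500 nodes.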
Just what we needed
Thanks a lot, SwissDisk guys, for making sure that users will run away from a fundamentally sound and useful idea because of your lousy implementation/operations. You've screwed up not only your own business but other people's. Users, too, will now be so fearful that some of them will cobble together their own ad-hoc solutions instead and probably lose more data total than ever would have happened with cloud storage implemented and operated by competent people. In a just world, after all the damage you've caused, you'd be prohibited from ever offering services like this again.
The real nonsense is...
...the idea that because something is a cluster it doesn't have any intrinsic scalability issues. What bollocks. Lots of clusters have serious scaling issues because their communication protocols are poorly designed, leading either to a bottleneck at one "master" node for some critical facility or to N^2 (or worse) communication complexity among peers. It's not at all uncommon to find clusters that fall apart at a mere 32 nodes. Yes, it's also possible to design a cluster that scales better, but the difficulty is domain-specific. In a storage cluster, consistency/durability requirements are higher than in a compute cluster supporting applications already based on explicit message passing with application-level recovery, and the coupling associated with those requirements makes the problem harder. It's *possible* that XIV has solved these problems well enough to scale higher, but only an idiot would *assume* that they can or have.
As I already pointed out, it's a moot point in this particular case because they don't need to, but in other situations the gulf between theoretical possibility and practical reality can loom quite large.
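To illustrate the two failure modes mentioned above, here is a rough sketch that counts messages per round for a single-master design versus an all-to-all peer design. The model is a deliberate simplification and an assumption on my part (one message per node per round, no batching, no hierarchy), and the function names are hypothetical. It says nothing about how XIV or any particular product actually communicates.

```python
def master_inbound(nodes):
    """Messages the single master must absorb per round (one from each other node)."""
    return nodes - 1


def all_to_all_messages(nodes):
    """Total messages per round when every node sends to every other node."""
    return nodes * (nodes - 1)


def growth_table(sizes):
    """Return (nodes, master load, peer traffic) for each cluster size."""
    return [(n, master_inbound(n), all_to_all_messages(n)) for n in sizes]


for n, master, peers in growth_table([8, 32, 128]):
    print(f"{n:4d} nodes: master handles {master:4d} msgs, peers exchange {peers:6d} msgs")
```

Going from 8 to 32 nodes multiplies the master's load by about four, but the peer-to-peer traffic by more than seventeen. Either can become the wall a cluster hits at a modest node count, depending on which design it chose.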
Pretty simple, really
Having worked with several parallel filesystems in the past, I never even considered that there would be only one XIV. Shared-storage filesystems just aren't very common nowadays, and GPFS was never such. There will be many XIVs, connected to many servers, perhaps in slightly-interesting ways to facilitate failover and such but generally not much different than if the storage were entirely private to each server - the base model for most of the parallel filesystems in common use. Scaling XIV up or out was never necessary to support this announcement.
Now, for anonymous in #2: thanks for the IBM ad copy, but your claim of uniqueness for GPFS wrt knowing about multiple kinds of storage is simply not true. I'm no fan of Lustre generally, but it has long given users the ability to stripe files within a directory tree across a particular subset of OSTs. As of 1.8, they also added OST pools which give users even more control in this regard. PVFS2 and Gluster also offer some control in this area. Ceph is conceptually ahead of the whole pack (including GPFS), though it's still in development so maybe it doesn't count. In a slightly different but related space, EMC's Atmos offers even more policy-based control over placement. It's an area where GPFS does well, and it's a legitimate selling point - not that this is the place for "selling points" - but it's far from *unique*.