there can be no benchmarketing
Nimble employee here...
I sort of have to agree with the AC above calling this marketing tripe - sort of. The NV layer only becomes a bottleneck at extremely low latencies. The change from the PCIe-based NV layer to NVDIMM reduced write latency on Nimble from ~0.67 ms to ~0.33 ms. The previous generation was already great for performance, and a 0.34 ms reduction is not going to have much impact for most applications - it was already screaming fast. If you have an application that can benefit from a write-latency improvement measured in microseconds - well, you're an interesting bloke, or at least have an interesting app.
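To put that in perspective, here is a back-of-envelope sketch of what a ~0.34 ms write-latency drop means for one application transaction. The 0.67/0.33 ms figures come from the post above; the number of synchronous writes per transaction is a made-up assumption for illustration.

```python
# Latency figures from the post; writes-per-transaction is hypothetical.
old_ms, new_ms = 0.67, 0.33
writes_per_txn = 5  # assumed synchronous writes in one app transaction

saved_ms = (old_ms - new_ms) * writes_per_txn
print(f"~{saved_ms:.2f} ms saved per transaction")  # prints ~1.70 ms
```

Under two milliseconds per transaction - which is why only a latency-obsessed app would notice.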
The switch to NVDIMM did free up a slot on the PCIe bus, allowing up to 6 ports per controller - 10 GbE or 16 Gb FC. That was more impactful. But the switch came at the same time the new platform launched, taking Nimble from the Nehalem processor family all the way to Ivy Bridge. Other internal changes enabled by the new chipset contributed to the latency reduction as well.
In answer to the question - "Is it enough?" - well, no, it isn't. But couple the NVDIMM with new microarchitecture and chipset features, more cores and RAM, and Nimble did manage to reduce latency and increase IOPS to over 120,000, where the previous generation topped out at about 70,000. So, yes, the controller is the bottleneck, not the NV layer.
Nimble never bothered doing a special press release about the change to NVDIMM in the second generation - it's just one component. A competitor saw fit to announce that they were planning a change to NVDIMM and tried to make a big splash with it. Nimble employees like myself felt compelled to point out that this was not a new innovation on the competitor's part.
Nimble uses NVDIMM for the NV layer in all of the latest-generation products, which have been shipping for several months now. Write latency was reduced with the change, but even the NVRAM card on the PCIe bus that the previous generation used (very similar to the NTAP architecture mentioned above) delivered sub-millisecond latency.
The switch to NVDIMM frees up a PCIe slot that can be used to add front-end ports, and so far it has been easily as rock-solid as the legacy PCIe flash solution it replaced.
it's full of stars
I condemn you as a forum spammer.
And a boring one at that.
Do data scientists wear white lab coats? That would be cool.
My apologies - I should not invoke a TLA without defining it first. AFS = All Flash Shelf.
OK, yeah, I get it. The idea of calling it Adaptive Cache really applies more to the ability to increase the ratio of the cache layer relative to the storage layer. With the AFS, you can change that ratio dramatically if needed, and because of the InfoSight telemetry data, any recommendation or decision to increase the cache ratio is based on hard numbers, not any sort of guesswork. We had some ability to do that by changing the four SSDs in the head unit to larger-capacity units, but that only scales so far.
I think "Adaptive Flash" isn't a bad way to describe that new scaling dimension, as marketing stuff goes... I have heard worse in this biz.
Oh, and apologies to El Reg if that eye-chart line card was supplied by Nimble - a CS-700 supports SIX shelves + 1 AFS in max configuration, so a 4x CS-700 scale-out config would support a potential of 24 expansion shelves and 4 additional AFS devices.
*Disclaimer - Proud Nimble Employee - but not in Marketing
Remember how long C-mode took. And that worked out fine in the end, right?
Virtual arrays aside, and back on topic. Storage management tools don't need to SUCK. The problem is the insistence of an Enormous Margin Corporation on putting a line item on every single thing. Then again, if a customer's willing to pay $10,000 for an SSD, they'll probably cough up a few grand to support it....
Nimble Storage InfoSight:
Cloud-based telemetry of your entire storage ecosystem. Proactive modeling of the telemetry data so that problems or potential problems are identified weeks or months before they impact production. Performance, capacity, cache and CPU utilization, and latency statistics delivered on a per-volume basis. Even difficult-to-identify problems such as block misalignment can be spotted.
Exportable results and even an executive summary area to help quickly justify when you do need to grow the environment. And more - and no agents needed for any of the telemetry.
All included as what we consider "basic support".
Ironic for an article with Pure and EMC in the title, and Nimble in the very last paragraph...
Not all that ironic. Nimble was founded in part by Varun Mehta - who was employee 11 at NetApp. WAFL was very good stuff when Varun and his team built it - but it was constructed in a different age. It is not perfect, but it is very good. CASL was written for the state of the world today - multi-core CPUs, large-geometry disks, and flash media. WAFL had none of those advantages, so it is the superior file system if you rely on mid-'90s hardware architecture.
Thank you so much for your opinion on the direction Nimble should take. Many large shops are on FC and have no desire to change. FC vs. iSCSI isn't as much a performance discussion as it once was - Nimble delivers <1 ms latency on iSCSI now - but sometimes we do need to address the top three layers (layers 8-10) of the OSI model.
For those only familiar with the classic seven-layer model:
Layer 1: physical layer
Layer 2: data link layer
Layer 3: network layer
Layer 4: transport layer
Layer 5: session layer
Layer 6: presentation layer
Layer 7: application layer
The full model adds the business environment that solutions must exist in:
Layer 8: political layer
Layer 9: financial layer
Layer 10: religious layer
As a company gets into bigger and bigger shops and opportunities, layers 8-10 can become barriers to entry. A classic example is the storage admin who will not allow iSCSI because IP solutions mean engaging the network team, and the storage guy thinks the network guy is a tool (Layer 8). Or the company has an existing FC implementation that it wants to leverage (Layer 9). Or the DBA insists he must have FC for mystical reasons (Layer 10).
I agree, at <1 ms latency, FC is irrelevant for most shops and iSCSI will work just fine. Adding file-level protocols, on the other hand, opens a new can of worms. We have many customers deploying Nimble for file services, and the choice has usually been a native gateway server (physical or virtual) for the protocols needed. In particular, Windows Server 2012 R2 has REAL potential for this usage.
Call me crazy, but not until you've tested it...With well deployed multipathing on a 10 GbE iSCSI network and a Nimble CS-400 device behind it - I've seen performance that tripled the incumbent "Multi-protocol" solution that cost 5x as much as the new solution...and that solution was from one of the major players well known for their fantastic NFS product.
That customer runs on Nimble now.
Nice response... I cannot imagine *WHO* would have a crazy system like that!
But we are hiring excellent SEs...
Nimble Employee here. NOT anonymous.
I will give you the benefit of the doubt. If you are NOT a Troll from some other storage vendor, you are reading off of a competitive placard that NetApp provided to their partners recently. I know that because one of those same partners just could not wait to share it with me. NetApp has played favorites between a few of their excellent resellers, and this excellent partner had several deal registrations denied. They went in with Nimble instead. We won - every time.
If you are not familiar with the term FUD, I will enlighten you. It is an industry acronym for "Fear, Uncertainty, and Doubt," and the stuff you are citing falls squarely in that category. Focus on the SuperMicro chassis, the SATA drives, the two-drive shutdown stuff... la la la. It's sad to see a terrific company like NetApp stoop to EMC-style FUD-slinging - but I guess it is to be expected with the migration of so many excellent people from NTAP, and the hiring of so many ex-EMC salespeople and, moreover, sales management.
The FUD you are spreading is largely outdated and/or irrelevant. Jabs at the hardware layer on Nimble, implying that the device is therefore unreliable, completely ignore the fact that we have documented >99.999% availability across our customer base.
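For anyone who hasn't done the arithmetic, here is what that ">99.999%" figure quoted above works out to - this is just the standard five-nines calculation, not a claim about any specific array:

```python
# Five nines caps unplanned downtime at roughly five minutes per year.
minutes_per_year = 365.25 * 24 * 60
max_downtime_min = minutes_per_year * (1 - 0.99999)
print(f"five nines allows < {max_downtime_min:.1f} minutes of downtime/year")
# prints: five nines allows < 5.3 minutes of downtime/year
```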
The mechanics of CASL may be a bit past you, but with a bit of research you could figure out for yourself why data services do not need to be paused when an SSD fails (note that we have seen exactly two SSD failures in our company history). SSDs are flash media in a hard-drive form factor - we do not treat them as hard drives, because they are not.
Erm - because two simultaneous drive failures are statistically *extremely* improbable? Two drives failing from internal defects within the RAID rebuild window has not *EVER* happened in the field on any Nimble array, and we are talking about *thousands* of years of combined soak time across the arrays out in the field. And since we have the InfoSight™ telemetry data, we can state that authoritatively.
As was stated before - two drives failing within the RAID rebuild window indicates some external force acting on the array - water rising in the data center, a crazed sysadmin with a sledgehammer, etc. In those cases, if two drives HAVE failed within the RAID rebuild window, Nimble's view is that a third failure is imminent.
Rather than run with no parity, Nimble has made the design choice to protect the data on the SAN, which will ensure fast recovery whenever the external force is mitigated.
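A quick independence argument shows why a second *internal* failure during a rebuild is so improbable. Every number here is an illustrative assumption (annual failure rate, rebuild window, drive count) - not Nimble field data:

```python
# Back-of-envelope probability of a second independent drive failure
# landing inside one RAID rebuild window. All figures are assumed.
afr = 0.015          # assumed annual failure rate per drive (1.5%)
rebuild_hours = 12   # assumed rebuild window
drives = 12          # assumed drives per array

p_one = afr * rebuild_hours / 8760          # one given drive fails in the window
p_second = 1 - (1 - p_one) ** (drives - 1)  # any remaining drive also fails

print(f"P(second independent failure during rebuild) ~ {p_second:.1e}")
```

On those assumptions the chance is on the order of one in several thousand per rebuild event - so when two drives *do* die together, an external cause (flood, sledgehammer) is the far likelier explanation.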
-Disclaimer : Proud Nimble Employee
If you look at an EqualLogic controller, it isn't exactly an x86 processor in there. The OS would need to be ported over to a different processor architecture. Why bother? The EQL storage layer is really primitive - the network stack is sophisticated and elegant. Taking it to x86 or x64 would mean rewriting all the good stuff for the new processor - and at the end of all that effort you'd have a mediocre-to-poor storage layer running on an unproven platform.
Again, why bother? Take what's left of EQL down to the basement of Round Rock 1 and put them out of their misery. It was a good product whose time has passed.
I am wondering if someone at Microsoft took a look at some Dell DCS gear that has been in the M$ datacenter for the last 4 years while filling out the patent application.
If you're stuck with one of these things, it's probably a good idea to time those reboots for when the SPs are running < 50% load. Only high school football teams can perform at 110%... storage arrays cannot. And I agree with the AC above - is the problem the zillions of lines of 24-year-old Clariion code, or the 64-bit parts that were strapped onto it so that marketing could call it 'VNX2'?
Focusing on the myriad shortcomings of Dell is missing the point. Michael Dell is a true innovator - but not in technology, rather in financial models. He pioneered the "negative cash conversion cycle" and made it possible to do financial magic - manufacture products in such a way that you could sell below cost and still make money - not an easy trick. Several business schools teach courses on the Dell model.
It so happened that for his time in history, the PC was the best platform to exploit that model, and so he ended up in the PC business. In some other age, it might have been buggy whips. Between the negative cash conversion cycle and the direct model of Sales, he built a very successful company.
However, almost point for point, Dell in later years has abandoned all of the practices that made it a multi-billion-dollar company. First, they attempted to embrace the channel, which is absolutely counter to the culture of the company in the first place (please read Michael's book "Direct from Dell" for more on that). Simultaneously, they stopped being a manufacturer and instead shifted to an ODM model for the core products of servers and PCs.
I have no doubt Michael has a plan. I also have no doubt that he believes his plan would not be workable if he had to show quarter-on-quarter growth to the street. I suspect the plan will not bode well for current resellers, and, if I hadn't already left the company, I would be taking the golden handshake now to get out. I am anticipating a big round of arbitrage where Dell spins off a bunch of the underperforming, questionable acquisitions it made over the last few years - and there will be collateral damage across the company when that show starts. I also anticipate the channel teams will be cut further than they already have been. We're already seeing the Dell direct teams take deals away from the channel daily - sales upper management never really saw the value of channels; they have drunk the "Dell Direct" Kool-Aid too long.
The spread between #1 and #2 in x86 has been razor thin for years, and a few big deals will sway that claim easily. Dell does build gorgeous hardware - and those big deals are built on the DCS hardware that you can't get at the website. Couple that with a willingness to take a large deal direct in a heartbeat in order to recover the incredibly thin or negative margins - plus the special thigh grease (to help the sales rep lower his trousers quicker) and the special socks with ankle handles for a better grip when pricing to "take share" - and I am only surprised it took this long...
Good luck with that services play, though. To err is human, but to really screw things up you should call Dell Professional Services..
One of the top 50 places to have once worked at.
While they were fixing the mid-plane, it would have been awesome to maybe make the damned display so it could tilt upward - for those of us that have installed the enclosures at the bottom of the rack.
Yeah, I know you have to be an Anonymous Coward if you are going to post as a Dell employee, because otherwise you'd have violated one of the many rules over there. That's fair. I was at Dell, worked in both the Server and the Storage practice, and more than likely trained you at some point. Now I work at Nimble, because it is a far better product and a better company to work for - which is why many of the best senior Compellent and EqualLogic sales people now work here.
So, let me go ahead and point out all of the things wrong with your statement - non-anonymously.
EqualLogic is in its second generation - so is Nimble, but EqualLogic's second generation cannot match the performance or functionality of our original beta systems, never mind the 192 TB of capacity (all at 45,000 IOPS) that the current Nimble delivers.
Price/performance - what math are you using to determine that Dell has something more suitable than Nimble? Dell can solve for $ per GB with the cheapo MD solutions, but performance - $ per IOPS - sucks. I realize you cannot do the math for what it would take to get 45,000 IOPS out of EQL, because you cannot get a large fraction of that performance out of the EQL platform at all without going to the very poorly thought-out PS6xxxS chassis - and then your $ per GB is through the roof. The EQL hybrid arrays - trying to solve the problem with hardware - will only do a fraction of the performance of our original CS-2x0 arrays.
And add the data protection functionality - just built into the array, and it just works - and EQL becomes an incomplete solution unless you want to start slinging AppAssure at it, plus a backup target... Complexity and a hodgepodge of acquisition products - which sums up Dell's Enterprise strategy as a whole.
But what do you expect? Dell is not a technology company - it is a marketing company (check your pay stub for verification).
VMware and Microsoft integration? Nimble has all of that, and feedback from my customers (many of them ex-EQL) is that our VMware integration is easier to deploy and implement than EQL's. My opinion - they are about the same.
Your problem, and Dell's, is simply this: what makes Nimble so different and disruptive to old-school incumbents like Dell is the basic file system itself - Cache Accelerated Sequential Layout (CASL). Dell cannot change how they are ingesting and storing data on EQL or Compellent without a complete redesign.
Disclaimer - Nimble Employee, and ex-Dell/EqualLogic employee.
I'm not clear on where you get that Nimble is where EqualLogic was in 2009. EqualLogic in 2012 cannot outperform a single CS-200-series array (never mind the new CS-400s and CS-400x2s) while following their own best practices, even with a million-dollar-plus 16-array pool - the performance is THAT much better on the Nimble.
EqualLogic's snapshot capability is so inefficient it's embarrassing - a single 4 KB changed block will consume 15 MB of non-compressed capacity. By contrast, Nimble's granularity goes down to a single 4 KB block. Nimble's model of using efficient snapshots as a data protection/backup methodology will forever escape them, as their ancient file system and incredibly underpowered processors will not allow it. To fix it would mean abandoning their install base and code and starting over from the basics - and then THEY would be where Nimble was in 2009...
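The space amplification implied by those two granularities is easy to work out - this is just arithmetic on the figures cited above, nothing vendor-specific beyond them:

```python
# Snapshot capacity consumed when a single 4 KB block changes,
# using the granularities cited in the post (15 MB vs. 4 KB).
changed_block = 4 * 1024            # one 4 KB changed block
eql_granularity = 15 * 1024 * 1024  # ~15 MB snapshot page
nimble_granularity = 4 * 1024       # 4 KB snapshot granularity

print(eql_granularity // changed_block)     # 3840x the changed data
print(nimble_granularity // changed_block)  # 1x - only the block itself
```

Nearly four-thousand-fold write amplification in the worst case is why a snapshot-as-backup model doesn't fly on a 15 MB page size.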
Replication is even more inefficient on the EQL platform, compared to the WAN optimized, highly granular Nimble model.
EqualLogic did a few things VERY well - they introduced a user administrable, easy to deploy SAN and blazed a trail for iSCSI protocol. Their network stack is very nice. Pity they are selling storage, not networking.
As I said, EQL did a few things very well, and that led to them once being considered the fastest-growing storage startup in history. That crown has moved to Nimble by a LARGE margin - any way you want to measure it: revenue, number of customers, number of installs, hiring rate... (Hint: The Register's numbers are a bit old - over 1,000 installs and 500 customers as of last week.)
And about that gateway product - Cacher - well, it was REALLY freaking cool, but there is not a very large addressable market for super-duper high-performance NFS.
"Do they think buyers are stupid" - Actually, yes, they do.
The embedded hypervisor in the Dell VESO (aka R805) and now all PowerEdge models means you can deploy servers with no local hard drives at all, saving power, reducing heat, and improving uptime with one easy move.
Sounds sort of revolutionary to me...