You lead the all-flash array market. And you, you, you, you, you and you...

How is everyone doing in the all-flash array-flinging stakes? Well, you can call Pure Storage Mr Hugo because it's still the victor. Gartner's box-fillers have once again jammed seven of the makers into the top right-hand square in its yearly analysis. NetApp, which was one of a group of three chasing Pure around Gartner's …

  1. Androgynous Cow Herd

    X-IO is killing it!

    Apparently it is more difficult to be a niche player than a leader, since only X-IO can manage it.

  2. CheesyTheClown

    What's the value anymore?

    Ok, here's the thing... all flash is generally a really bad idea for multiple different reasons.

    M.2 Flash has a theoretical maximum performance of 3.94GB/sec bandwidth on the bus. Therefore a system with 10 of these drives should be able to theoretically transfer an aggregate bandwidth of 39.4GB a second in the right circumstances.

    A single lane of networking or fibre channel is approximately 25Gb/sec, roughly 3.1GB/sec, which is less than 1/10th of the aggregate bus bandwidth of those ten drives. So in a circumstance where a controller can provide 10 or more lanes of bus bandwidth for data transfers, this would be great, but these numbers are so incredibly high that this is not even an option.
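
    The arithmetic in the two paragraphs above can be checked with a short sketch. It assumes 8 bits per byte and ignores encoding and protocol overhead, and it reads the "1/10th" comparison as being against the ten-drive aggregate. The function names are illustrative only.

```python
import math

# Figures quoted in the comment above.
DRIVE_GBYTES_PER_SEC = 3.94  # theoretical bus bandwidth of one M.2 drive, GB/s
DRIVE_COUNT = 10
LINK_GBITS_PER_SEC = 25      # one network / Fibre Channel lane, Gb/s


def aggregate_gbytes_per_sec(per_drive, drives):
    """Combined theoretical bandwidth of all drives, in GB/s."""
    return per_drive * drives


def link_gbytes_per_sec(gbits):
    """Convert a link rate from Gb/s to GB/s (assumption: 8 bits per byte, no overhead)."""
    return gbits / 8


def links_to_saturate(total_gbytes, link_gbits):
    """Smallest whole number of links that can carry total_gbytes."""
    return math.ceil(total_gbytes / link_gbytes_per_sec(link_gbits))


if __name__ == "__main__":
    total = aggregate_gbytes_per_sec(DRIVE_GBYTES_PER_SEC, DRIVE_COUNT)
    link = link_gbytes_per_sec(LINK_GBITS_PER_SEC)
    print(f"aggregate drive bandwidth: {total:.1f} GB/s")
    print(f"one 25Gb/s link:           {link:.3f} GB/s")
    print(f"link / aggregate:          {link / total:.1%}")
    print(f"links needed to saturate:  {links_to_saturate(total, LINK_GBITS_PER_SEC)}")
```

    Under these assumptions a single link carries about 8% of the ten-drive aggregate, and 13 such links would be needed to match it.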

    So, we know for a fact that the bus capacity of even the highest performance storage systems can barely make a dent in a very low end all flash environment.

    Let's get to semiconductors.

    Let's consider 10 M.2 drives with 4 32Gb Fibre Channel adapters. This would mean that a minimum of 72 PCIe 3.0 lanes would be required to allow full saturation of all busses.
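
    One plausible way to arrive at the 72-lane figure is sketched below. The comment does not say how the lanes are split, so the x4 link per M.2 drive and the x8 slot per Fibre Channel adapter are assumptions (both are common for PCIe 3.0 hardware). The per-lane rate of roughly 0.985GB/sec is the usual PCIe 3.0 figure and matches the 3.94GB/sec quoted earlier for a x4 drive.

```python
PCIE3_GBYTES_PER_LANE = 0.985  # approximate usable PCIe 3.0 bandwidth per lane, GB/s

# Assumed link widths; the original comment does not state them.
LANES_PER_M2_DRIVE = 4
LANES_PER_FC_HBA = 8


def lanes_required(drives, hbas,
                   lanes_per_drive=LANES_PER_M2_DRIVE,
                   lanes_per_hba=LANES_PER_FC_HBA):
    """Total PCIe lanes needed to give every device its full link width."""
    return drives * lanes_per_drive + hbas * lanes_per_hba


def drive_bandwidth_gbytes(lanes=LANES_PER_M2_DRIVE):
    """Theoretical bandwidth of one drive at the given link width, GB/s."""
    return lanes * PCIE3_GBYTES_PER_LANE


if __name__ == "__main__":
    print("lanes for 10 drives + 4 HBAs:", lanes_required(drives=10, hbas=4))
    print(f"x4 drive bandwidth: {drive_bandwidth_gbytes():.2f} GB/s")
```

    With those widths, the drives account for 40 lanes and the adapters for 32.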

    This is great, but the next problem is that in this configuration, there's no means of block translation between systems. That means that things like virtual LUNs would not be possible.

    It is theoretically possible to implement in FPGA (DO NOT USE ASIC HERE) a traffic controller capable of handling protocols and full capacity translation using a CPU style MMU for translation of regions of storage instead of regions of memory, but the complexity would have to be extremely limited and because of the translation table coherency, it would be extremely volatile.

    Now... the next issue is that assuming some absolute miracle worker out there manages to develop a provisioning, translation and allocation system for coarse-grained storage, this would more or less mean that things like thin provisioned LUNs would be borderline impossible in this configuration. In fact, based on modern technology, it could maybe be possible with custom FPGAs designed specifically for an individual design, but the volumes would be far too low to ever see return on investment for the ASIC vendor.

    Well, now we're back to dumb storage arrays. That means no compression, thin provisioning or deduplication, and without at least another 40 lanes of PCIe 3.0 serialized over fibre for long runs, there's pretty much no chance of guaranteed replication.

    Remember this is only a 10 device M.2 system with only 4 fibre channel HBAs.

    All Flash vs. spinning disk hybrid has never been a sane argument. Any storage system needs to properly manage storage. The protocols and the software involved need to be rock solid and well designed. FibreChannel and iSCSI have so much legacy that they're utterly useless for modern storage as they don't handle real world storage problems on the right sides of the cable anymore. Even with things like VMware's SCSI extensions for VAAI, there is far too much on the cable and thanks to fixed-size blocks, it should never exist. If nothing else, they lack any support for compression. Forget other things like client side deduplication so that hashes for dedup could be calculated not just for dedup, but for an additional non-secure means of authentication.

    Now let's discuss cost a little.

    Mathematics and physics and pure logic says that data redundancy requires a minimum of 3 active copies of a single piece of data at all times. This is not negotiable. This is an absolute bare minimum. That would mean to have the minimum requirement for redundant data, a company should have a minimum of 3 full storage arrays and possibly a 4th for circumstances with long term maintenance.

    To build an all flash array with a minimal configuration, this would cost so much money that no company on earth should ever piss that much away. It just doesn't make sense.

    The same holds true of fibre channel fabrics. There need to be at least 3 in order to make commitments to uptime. This is not my rule. This is elementary school level math.

    Fibre channel may support this, but the software and systems don't. It can be done on iSCSI, but certainly not on NVMe as a fabric for example. The cost would also be impossible to justify.

    This is no longer 2010 when virtualization was nifty and fun and worth a try. This is 2018 when a single server can theoretically need to recover from failure of 500 or more virtual machines at a single time.

    All Flash is not an option anymore. It's absolutely necessary to consider eliminating dumb storage. This means block based storage. We have a limited number of storage requirements which is reflected by every cloud vendor.

    1) File storage.

    This can be solved using S3 and many other methods, but S3 on a broadly distributed file system makes perfect sense. If you need NFS for now... have fun but avoid it. The important factor to consider here is that classical random file I/O is no longer a requirement.

    2) Table/SQL storage

    This is a legacy technology which is on its VERY SLOW way out. We'll still see a lot of systems actively developed towards this technology for some time, but it's no longer a preferred means of storage for systems as it lacks flexibility and its back end storage is extremely hard to manage.

    3) Unstructured storage

    This is often called NoSQL. This means that all systems have queryable storage which works kinda like records in a database but far smarter. So the data stored is saved as a file, but the contents can be queried. Looking at a system like Mongo or Couchbase shows what this is. Redis is good for this too but generally has volatility issues.

    4) Logging

    Unstructured storage can often be used for this, but the query front end will be more focussed on record ages with regards to querying and storage tiering.

    Unless a storage solution offers all 4 of these, it's not really a storage solution; it's just a bunch of drives and cables with severely limited bandwidth being constantly fought over.

    Map/Reduce technology is absolutely a minimum requirement for all modern storage and this requires full layer-7 capabilities in the storage subsystems. This way, as nodes are added, performance increases and in many cases overhead decreases.

    As such, it makes no sense to implement a data center today on a SAN technology. It really makes absolutely no sense at all to deploy for example a containers based architecture on such a technology.

    If you want to better understand this, start googling Kubernetes and work your way through containerd and cgroups. You'll find that this block storage should always be local only. This means that if you were to deploy for example MongoDB, SQL servers, etc... as containers, they should always have permanent data stores that require no network or fabric access. All requests will be managed locally and the system will scale as needed. Booting nodes via SAN may seem logical as well, but the overhead is extremely high and in reality, PXE or preferably HTTPS booting via UEFI is a much better solution.

    Oh... and enterprise SSD is just a bad investment. It doesn't actually offer any benefits when your storage system is properly designed. RAID is really really really a bad idea. This is not how you secure storage anymore. It's really just wasted disk and wasted performance.

    But there are a lot of companies out there who waste a lot of money on virtual machines. This is for legacy reasons. I suppose this will keep happening for a while. But if your IT department is even moderately competent, they should not be installing all flash arrays, they should instead be optimizing the storage solutions they already have to operate with the datasets they're actually running. I think you'll find that with the exception of some very special and very large data sets (like a capture from a run of the large hadron collider) more often than not, most existing virtualized storage systems would work just as well with a few SSD drives added as cache for their existing spinning disks.

    1. Anonymous Coward

      Re: What's the value anymore?

      People generally don't buy flash because of its bandwidth capabilities. They're more interested in latency.

      In fact, very little of what you say actually makes sense; it is incoherent at best. You remind me of Spud from Trainspotting in the job interview scene.

      As for "Mathematics and physics and pure logic says that data redundancy requires a minimum of 3 active copies of a single piece of data at all times. This is not negotiable. This is an absolute bare minimum."

      Yes, that old trick. State "fact" and claim it's "not negotiable", even though you've provided no evidence to back up your fact. The number of copies is a small part of it. What's more important is the availability of each component, how long it takes to recover from a failure and the impact of the failure and its recovery.

      I do agree with you that n+2 is a good idea. I just dispute that it's as simple as that.
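
      The point about component availability and recovery time can be illustrated with the standard textbook model, sketched below. It assumes copies fail independently, which real arrays sharing a site, fabric or firmware do not fully satisfy, and the MTBF/MTTR figures are made up purely for illustration.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one component: uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def any_copy_available(per_copy_availability, copies):
    """Probability that at least one of `copies` independent copies is up."""
    return 1 - (1 - per_copy_availability) ** copies


if __name__ == "__main__":
    # Hypothetical figures: same failure rate, fast vs slow recovery.
    for mttr in (4, 72):
        a = availability(mtbf_hours=10_000, mttr_hours=mttr)
        for copies in (2, 3):
            combined = any_copy_available(a, copies)
            print(f"MTTR {mttr:>2}h, {copies} copies: {combined:.10f}")
```

      In this model a shorter recovery time raises the per-copy availability, so two copies with fast repair can come out ahead of three copies with slow repair, which is why the copy count alone does not settle the question.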

      "Map/Reduce technology is absolutely a minimum requirement for all modern storage"

      Why? Most workloads won't benefit and it will add cost and complexity (and hence risk).

      I know I'm being critical here. You do raise some interesting points, but while your post is lengthy it seems to be somewhat narrow and superficial.

      1. Alex McDonald (NetApp)

        Re: What's the value anymore?

        Old saying: bandwidth is an engineering problem, only God can fix latency. Agreed, it’s all about the latency.

        With flash we’re getting a factor of up to 10^3 improvement over the low milliseconds we’ve had for the last few decades. That’s a big difference.
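
        A rough sketch of what a 10^3 latency improvement means for throughput, using Little's law (throughput = outstanding requests / latency). The 5 ms and 5 µs figures are hypothetical stand-ins for "low milliseconds" and a thousandfold improvement, not measurements.

```python
def iops(latency_seconds, outstanding_ios=1):
    """I/O operations per second by Little's law: concurrency / latency."""
    return outstanding_ios / latency_seconds


if __name__ == "__main__":
    disk_latency = 0.005                 # hypothetical: 5 ms per I/O
    flash_latency = disk_latency / 1000  # hypothetical: 10^3 better, i.e. 5 microseconds

    for name, latency in (("disk", disk_latency), ("flash", flash_latency)):
        for depth in (1, 32):
            print(f"{name:>5}, {depth:>2} outstanding: {iops(latency, depth):,.0f} IOPS")
```

        At one outstanding request the hypothetical disk tops out at about 200 IOPS and the flash device at about 200,000, which is why latency, not bandwidth, dominates for small random I/O.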

    2. Androgynous Cow Herd

      Re: What's the value anymore?

      Your post is too long. Please condense it down to one sentence.

      Thanks in advance.

      1. Giovani Tapini Silver badge

        Re: What's the value anymore?

        @Androgynous Cow Herd

        What sort of PHB are you wanting all that in a single line answer?

    3. Mr.Nobody

      Re: What's the value anymore?

      All flash arrays, as stated in another response, are really successful and worthwhile because of the low latency. This is especially important for what I am sure you would deem legacy systems.

      One can now switch from a hybrid flash/disk based system for their production environment and move to all flash. Any and all latency issues that previously existed essentially disappear. After the switch in our environments, we had people pleasantly surprised at how much faster things were in general, almost to the point that things ran as fast as they thought they always should.

      We were able to make this remarkable change overnight without changing a single line of code on any of our applications, for 50% less than we paid for the Hybrid system, and it takes up 1/10 of the rack space.

      As to requiring three copies: that sounds appropriate if you are dealing with non-RAID technologies, but a production storage system with built-in redundancy (RAID with double parity and spares, failover/spare controllers and ports) plus a DR system in another location is more than adequate.

      What you seem to be advocating is to build applications with a new infrastructure design, but most companies do not have the time or resources to rewrite everything from scratch and stick it at AWS, so finding other ways to increase performance and efficiency on the old stack is of paramount importance, and flash arrays are as impactful as hardware virtualization was a decade ago.

    4. Anonymous Coward

      Re: What's the value anymore?

      I can only assume Cheesy the Clown's post was a very lengthy effort at parody/satire.

      I admit there's one segment of the TAM that has a use case that lines up with his fan fiction, and it's comprised of about 5 companies.

      Also, not everyone there is using traditional RAID anymore, some are using erasure coding, but for simplicity they simply call it "RAID" or "Triple-Parity" because everyone understands that concept quickly without having to read a white paper on it. But it's a proprietary data protection implementation. Yes, some still have legacy RAID as an OPTION, and we can all agree that sucks.

  3. Anonymous Coward

    Gartner Magic Quadrant?

    This is interesting. There are quite a few AFA vendors out there with some pretty good kit that nobody knows about. I never see them mentioned by Gartner at all. Several in Israel I am using and one from Fremont CA called AccelStor Solutions who make what must be the fastest RAID free platform that delivers an astounding 732K WRITE IOPS with Data Protection technology called FlexiRemap that won a best of show award at the last FMS in Santa Clara that didn't burn down.

    All from SATA SSD at a price lower than spinning disk!! It's a good time for us users of Flash Storage systems right now with these newcomers NOT charging ridiculous fees for capacity and for the services we know and love like clones, snaps, replication and DataDedupe almost as good as Pure's dedupe.

    Their new NeoTopaz offering in the NVMe-oF class of All Flash array lets you build virtual all flash arrays to your exact spec and needs!! Kaminario is the only other vendor doing this. AccelStor Solutions also have true active-active shared-nothing storage architecture that syncs with the other virtual flash arrays using 100GbE as the transport for RDMA called RoCE. The virtual array performance on RoCE is over 1 Million IOPS for random write ops!

    These new vFlexiArrays also use the RAID free FlexiRemap technology in their FlexiSuite software that comes with each instance.

    Being a career EMC VMAX and HDS VSP bigot with a strong admiration for Nimble and Pure products as well I am truly enjoying the range of awesome options out there from these newcomers.

    For Openstack and Parallel file system support like BeeGFS these newcomers are one of the few with full Cinder support as well. The Israeli products are a mirror in terms of quality and performance with awesome pricing that is truly astounding with the features you get.

    Still can't understand why many of these Flash vendors on the Gartner Magic (sic) Quadrant are using RAID on Flash. This is not a great technology fit. RAID = Spinning disk, we need more vendors with Data Protection designed from the ground up to cater for NAND Flash goodies.

    Some storage vendors are waiting for RAID controllers for NVMe before they dive into NVMe, which is a truly astounding waste of time. That's actually so sad I am compelled to cry in frustration over it.

    Look at how poor Dell EMC Unity and Pure M series do on performance vs a NeoSapphire SATA SSD platform and swiftly conclude that good old fashioned RAID is robbing those Flash systems of some serious performance potential, even with current PCI-e 3.0 bandwidth constraints taken into account.

    Good job Purity is so slick and awesome, I think they would be having some serious issues if they had come to market the same time as these newcomers.....

    Say, is RAID 3D a parasite on top of RAID 6x containers or whut????

    Things in Nand Flash technology to worry about by Flash the roadrunner.....

    1. Anonymous Coward

      Re: Gartner Magic Quadrant?

      RAID Free Flash? About time someone started talking about the obvious issues with transplanting old technology onto new tech!

      Now we are talking!!

      Actually, this FlexiRemap stuff is pretty interesting... I must consult with the two ex ePlus storage gods who are working at AccelStor these days to find out more. They were both EMC VMAX and Pure bigots last time I heard either of them talking about storage.

      The vFlexiArray thing also sounds kinda neat.

      I have an idea RoCE is going to be a pretty big deal for storage over the next five years.

      I wonder what Nutanix are doing with RoCE?

  4. Oneman2Many

    Also need to think about IOPS, which is a much bigger factor than bandwidth, and of course density, with 15TB 2.5" drives already available and 30TB drives announced. Added to that, flash memory allows dedup, compression and potentially encryption in situations where spinning disk simply can't, so there are plenty of use cases for flash storage. Not saying it's the right solution for every situation, but there are plenty of situations where it now is the right solution.

    1. Anonymous Coward

      SED Options

      Yeah, AccelStor Solutions and these Israeli storage companies also offer SED SSD options, with no additional key management system required. I measured the performance impact and it's less than 1% over the standard SSD on all of them; in fact it's hard to say if there even is a performance impact.

      Got some of the new 30TB drives to play with but the IOPS on these in an array seems a lot less than the 3.84TB and 7.68TB drives. Makes sense, it's bigger?

  5. Korev Silver badge
    Childcatcher

    Trinti

    I see that Trinti's "ability to execute" is >0; odd as they're having a few cashflow problems (until DDN dived in at least)

    1. Anonymous Coward

      Re: Trinti

      Disturbing, no matter how you spell it out

  6. WYSIWYG650

    Can someone please explain the logic?

    How is Pure above and to the right of NetApp? Who sells more is a great example and evidence of ability to execute. Completeness of vision is the next category and let's talk about that for a moment. Name one feature or capability that Pure can do that NetApp cannot? I'll predict the feedback of ease of use and maybe reporting benefits but I would argue those strengths of Pure do not even come close to the long list of things that NetApp can do that Pure can only dream of. Like real features that took many years of investment...cloud capability, data mobility, multi-protocol support, and more. Pure does a fantastic job of marketing and partnering and the architectures are fast but silo'd and limited in scale out and software maturity. It's good for NetApp and the market to have a strong competitor like Pure. EMC and HPE are weak at best and do not compete anywhere near as much as they did in the recent past.

    1. FlashTheRoadrunner

      Re: Can someone please explain the logic?

      Yeah, I don't get it either. In fact, I am starting to question and conclude that the performance on a Gartner Magic Quadrant has nothing to do with the technical reality salted down to the raw bits and bytes as the NetApp stuff, apart from running RAID, is pretty much a very mature platform set when compared to Pure and their performance blows Pure M series out the water....(A series or the EF stuff)

      FlashBlade, however, is a pretty breathtaking effort that is astounding to me at any rate. Ironic that NetApp direction and strength is into block storage and Pure's strength and forte is FlashBlade NAS.....If only NetApp would dump RAID we might have something significant across the board here.

      Have a good hard look at FlexiRemap, Data Protection designed for Flash from the get go. It's not rocket science, it's just common sense.

      It's like looking for sails and masts with all the rigging on an aircraft carrier because that's how you understand that ships are supposed to work...RAID is yesterday's Hero and that's a FACT!

      1. WYSIWYG650

        Re: Can someone please explain the logic?

        Don't forget SolidFire and NetApp HCI both use Element OS which is, wait for it.. Self-healing Helix RAID-less data protection. boom, drops the mic

    2. Equals42

      Re: Can someone please explain the logic?

      NetApp has a lot of features that Pure doesn’t, and I think it has the better platforms. To answer your question though, Pure has their active/active cluster which lets you write to both sites fairly simply. It’s much simpler than ONTAP MetroCluster and doesn’t rely on whole system failovers and such.

      The Pure AI stuff is fluff and repackaging basic capabilities with a new PowerPoint deck. They still don’t have a real NVMe over Fabric solution for sale yet but “it’s coming” in the future. Not sure how Gartner rates things, but whatever.

      EMC still doing the forklift upgrade thing? Want NVMe or deduplication with flash on a VMAX? Here’s a truck full of new kit for you. Have fun migrating legacy systems onto that for the next 12 months! Where’s the PO?

  7. froberts2

    Gartner's MQ is based around what the vendors tell Gartner, which means that the company that focuses the most on marketing, wins. And in this space, that's Pure Storage.

    If Gartner were to actually test any of the products, rather than simply believing what the vendors say, the results would be very different.

    1. Anonymous Coward

      Yes indeed. I see that in my lab with my Load Dynamix testing rig every single time I purvey a new offering.

      It's usually pretty astounding what comes out in real-world testing vs the claims these guys all make, especially the well established vendors.

      It seems to be a fact that slicker marketing engines win deals despite average real world tested metrics.

      Good Marketing manages perception, and perception becomes reality if it goes well.

      Sad but true unfortunately....

      Just look at Violin Memory and Microsoft as stark contrasts of the point under discussion...


Biting the hand that feeds IT © 1998–2019