Proprietary: Pure sticks to flash module design, becomes a direct flasher

Pure Storage has moved away from SSDs in giving its FlashArray an NVMe makeover, halving its latency and doubling its bandwidth. The new FlashArray//X takes inspiration from the unstructured-data FlashBlade’s proprietary flash drives and re-implements them in NVMe flash form to build an updated FlashArray//m. A FlashBlade …

  1. baspax

    Impressive. Nice work.

  2. Anonymous Coward

    Gee, that sounds JUST like what DSSD was doing...

    1. Anonymous Coward

      How so? I thought DSSD was doing top-of-rack direct-connect acceleration for specific servers. Also recall that it was brutally expensive, and very exotic, but not very feature-rich.

      This seems to be more tailored to enterprise storage by connecting to more servers via typical network infrastructure (iSCSI and Fibre Channel). Also looks like it runs the same software as the other Pure boxes, so it provides the software services (compression, dedup, encryption, snapshots, cloning, replication, etc...).

      I was not aware that DSSD had matured to this point.

      1. RollTide14

        I think the price point with DSSD was very high, but I think the point the other person was trying to make is that proprietary hardware doesn't have a great track record (DSSD, Violin).

        1. Anonymous Coward

          RollTide14, you may well have a point, and thanks for causing me to see the statement from another perspective. The position of proprietary or custom hardware having difficulty is true of some and not as true of others (Hitachi makes their own FMDs, as does IBM for some of their TMS-descended systems). It seems the sticking point on a lot of these historically has been that the hardware and the protocols necessary to leverage it were proprietary. With this system, it seems that Pure has built a drive, but is using the industry standard NVMe protocol to address the drive. In a way, Samsung, WD, Toshiba, and Intel produce proprietary drives with open protocol connectivity. Certainly the secret sauce of each of these manufacturers is in their firmware, true?

          I'd be interested to see whether the economies of scale that the drive manufacturers can leverage ultimately militate against Pure building their own drives, and whether or not their system will permit the flexibility to adopt lower-cost mass-produced drives in the future.

          1. RollTide14

            It's going to be very interesting on a couple of different fronts. Can they manufacture their own drives at a price point where they can be cost-competitive in the market? Does development now slow down with proprietary HW (a la the 3PAR ASIC)? Finally, the DSSDs and Violins of the world that used proprietary HW did NOT offer any data services (dedupe, compression) because those inherently slow the array down. Did they fail because of the HW or because of the lack of services?

        2. Anonymous Coward

          That was indeed the point. DSSD similarly had extremely high-performance, albeit extremely high-priced, proprietary flash modules in a proprietary enclosure with a proprietary NVMe-like direct PCIe connection. Admittedly, there were differences; DSSD was designed as a rack-scale solution and never intended to be a general-purpose file/print array with dedupe, compression, replication, leather seats, and cupholders. When sold, implemented, and used as designed, it did one thing and did it well - store and retrieve data really, really fast.

  3. ssog

    Improvement over //M, but is it really revolutionary?

    50% lower latency, 2x bandwidth, 4x performance density.

    Let's look at it in reverse order. The 4x performance density claim rests on the fact that they can exhaust the performance of the controllers using 10 modules vs 44 in the case of //M. The reality is that peak performance is still limited by the controllers; there is no change to that. The number of drives/modules will vary from customer to customer based on capacity needs anyway. Going from 44 to 10 modules may be an improvement for //M customers, but not necessarily against other flash vendors, which already do a lot better than 44 flash modules today.
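
    (A back-of-envelope check of that figure: a minimal sketch assuming the 4x is simply the module-count ratio cited in this comment, not a number from Pure's spec sheet.)

        # Rough sanity check of the "4x performance density" claim, assuming it is
        # just the ratio of modules needed to saturate the controllers (figures from
        # this comment, not from Pure's datasheet).
        modules_to_saturate_m = 44   # FlashArray//m, per the comment
        modules_to_saturate_x = 10   # FlashArray//X, per the comment
        density_gain = modules_to_saturate_m / modules_to_saturate_x
        print(f"Implied performance-density gain: {density_gain:.1f}x")  # ~4.4x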

    2x bandwidth at 65% reads on a 32K block size. No real numbers shared except for relative comparison. This is great for a specific workload. It is also important for a scale-up array like Pure, but this is not necessarily a challenge for others, especially vendors that have a scale-out array where the peak performance of a node doesn't matter. At the end of the day, what matters is the $/IOP.

    50% lower latency. Again, no real numbers shared except for relative comparison. If I am able to get 0.8ms latency today on an all-flash system and I can get 0.4ms on //X, is that low enough to justify paying the premium? Does my application actually need the extra 0.4ms of headroom?

    Higher Usable with custom modules -

    Pure's //M has one of the worst usable-capacity ratios per drive. With custom modules they get to ~57%, which is still lower than the typical >65% efficiency that most other all-flash systems can achieve.

    Let's say Pure has incrementally better performance than anyone else out there (until others get to NVMe - not very long!). What do they bring to the table except performance, dedup/compression and proactive support through Pure1? The last I checked, that was table stakes.

    1. Anonymous Coward

      Re: Improvement over //M, but is it really revolutionary?

      I believe the stats are compared to the //m array, which gives you a better point of reference.

  4. froberts2

    The numbers don't add up...

    183TB raw. Add in data protection, over-provisioning (which Pure has always done at the array level, but now needs to more than ever as they lose the added benefit of SSD over-provisioning), and other overheads, and they lose probably close to half of that (~45% in the //m; //X will be at least the same, probably more).

    Even if it's 100TB usable, that means they are counting on a 10:1 data reduction to get to 1PB - almost double what they have previously claimed, with no logic backing it.
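
    (For reference, the arithmetic behind that objection, taking the figures exactly as stated in this comment; the replies below dispute the raw-capacity input, arguing it should be 20 modules rather than 10:)

        # Reproduces the arithmetic in the comment above, using its own figures.
        # The replies below argue the raw input should be 366 TB (20 modules),
        # not 183 TB (10 modules).
        raw_tb = 183           # 10 x 18.3 TB modules, as stated in the comment
        overhead = 0.45        # ~45% lost to RAID, over-provisioning, etc. (the comment's estimate)
        usable_tb = raw_tb * (1 - overhead)    # ~100 TB usable
        implied_reduction = 1000 / usable_tb   # data reduction needed to reach 1 PB effective
        print(f"Usable: {usable_tb:.0f} TB, implied reduction: {implied_reduction:.1f}:1")  # ~10:1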

    1. Anonymous Coward

      Per the article, the overhead is lower and the over-provisioned space is much lower. That said, 1PB is expected with the 18.3TB modules x 20. Factor in the overhead, then multiply by 5-ish:1, and you'll find 1PB.

      1. flyguy959

        This is the thinking they used. See the blog post today on the Pure site about DirectFlash Enabling Software And Flash To Speak Directly and you'll see an example with the Purity/RAID-3D overhead worked in on a 9.1TB DirectFlash Module. Usable from that is 5.23TB. So for an 18.3TB DFM you get over 10TB usable. Their global average dedupe rate, which they publish live on the site under the Products-Purity section, is ~5:1. There are 20 DFMs. 10 x 20 x 5 = 1000.
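
        (Spelling that arithmetic out, using only the figures quoted in this comment and the replies below:)

            # Worked version of the math above, using the per-module figures the
            # comment cites from Pure's blog and the 20-module chassis noted below.
            per_module_raw_tb = 9.1
            per_module_usable_tb = 5.23                   # after Purity/RAID-3D overhead (blog example)
            usable_fraction = per_module_usable_tb / per_module_raw_tb   # ~0.57

            dfm_raw_tb = 18.3
            dfm_usable_tb = dfm_raw_tb * usable_fraction  # a bit over 10 TB usable per DFM
            modules = 20                                  # a full chassis
            data_reduction = 5                            # Pure's published global average, ~5:1

            effective_tb = dfm_usable_tb * modules * data_reduction
            print(f"~{effective_tb:.0f} TB effective")    # ~1,050 TB, i.e. roughly 1 PB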

        1. ManMountain1

          Still doesn't add up. Each 3U can hold 10 modules = 183TB raw. But the 1PB effective, by your maths, would need 20 modules. Either The Register or Pure is being a bit misleading. The 1PB effective, based on 5:1, would be in 6U, not 3U.

          In fact, reading back, they are definitely claiming 1PB effective in 3U (i.e. 10 modules), so they ARE claiming 10:1, which is very 'optimistic'.

          "1PB effective capacity in 3U, 183 TB raw!" ... Hmmmm.

          1. MortenMadsen

            FUD or did you not do your homework?

            Yes, it does add up (if your data can get you a 5:1 reduction).

            Each chassis (3U) holds 20 flash modules or 20 DirectFlash modules. Each capacity pack is 10 modules, but the chassis can hold 2 capacity packs = 20 modules.

            Look at their specs:

            http://www.purestorage.com/content/dam/purestorage/pdf/datasheets/ps_ds6p_flasharray_01.pdf

            PS: I am pretty sure of the above as I have 3 FlashArray//M systems :)

            1. ManMountain1

              Re: FUD or did you not do your homework?

              The article definitely says 3U = 183TB raw = 1PB effective. How a 5:1 data reduction can turn 183TB raw into 1PB effective is very hard to comprehend.

              1. chulak

                Re: FUD or did you not do your homework?

                Pure employee here

                ManMountain1, it looks like the confusion is coming from the following line in the article:

                1PB effective capacity in 3U, 183 TB raw

                This is not the case; it looks like the author multiplied 10 x 18.3 instead of 20 x 18.3. If you look at Pure's graphic and also the second-to-last paragraph in the article, you can see that the 1PB effective capacity is based on 20 x 18.3TB drives at a 5:1 data reduction ratio. The math adds up, just make sure you are using the right inputs ;)

                1. spinning risk

                  Re: FUD or did you not do your homework?

                  5:1 data reduction from Pure is a stretch, especially when they only guarantee capacity for 6 months on their array. Also, 1PB effective with 2 controllers? Good luck! If they could scale out they might have a chance to keep latency low, but they can't... Pobrecito.

  5. Anonymous Coward

    Where, oh where... is Pure's evangelist... with amazing hair

    I need some commentary on this announcement from Vaughn.

    1. Anonymous Coward

      Hold on!!?!

      Let me see what you are claiming

      1. New controller + backplane (yaaawn).

      2. DirectFlash NVMe module (why)

      3. DirectFlash Software (SW to abstract out NVMe)

      http://blog.purestorage.com/a-new-flash-recipe-for-the-cloud-era/

      1) Pure got lucky because HPE bought Nimble and because Tegile and Tintri are private. The rest are copycats.

      http://blog.purestorage.com/flasharrayx-making-nvme-mainstream/

      2) You are claiming to make NVMe mainstream, but it's standards-based storage using a PCIe connection to a custom storage processor. NVMe is already mainstream; it's available in laptops, and Intel has made NVMe a commodity. Not sure what your innovation is in the world of storage. Fusion-io and Virident are no longer relevant because of NVMe (in case you need an NVMe primer, since you seem to think you are unique in claiming QDs of 65,000 -- http://www.computerweekly.com/feature/Storage-101-Queue-depth-NVMe-and-the-array-controller). Heck, in your previous blog you talk about Facebook releasing Lightning. Admit this is a bunch of fluff and move on.
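
      (For context on the QD remark, a minimal sketch of the queue limits involved; the AHCI and NVMe numbers are the published spec maximums, while the SAS figure is a typical implementation value rather than a hard spec ceiling:)

          # Maximum outstanding-command arithmetic for common host interfaces.
          # AHCI (SATA): 1 queue x 32 commands. NVMe: up to 65,535 I/O queues,
          # each up to 65,536 commands deep (spec maximums, rarely reached in practice).
          interfaces = {
              "AHCI/SATA": (1, 32),
              "SAS (typical HBA)": (1, 254),        # implementation-dependent, not a spec limit
              "NVMe (spec max)": (65_535, 65_536),
          }
          for name, (queues, depth) in interfaces.items():
              print(f"{name}: {queues} queue(s) x {depth} deep = {queues * depth:,} outstanding commands")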

      http://blog.purestorage.com/directflash-enabling-software-and-flash-to-speak-directly/

      3) Flash can directly talk to software (running on storage controllers). This is not new. Don't claim this as unique innovation. Please see Violin Memory's VIMM (here is a link in case you Pure folks want to read it -- http://cdn.violin-memory.com/wp-content/uploads/resources/Violin-Whitepaper-Flash-Fabric-Architecture.pdf). It's cool that it is for NVMe and has features such as adaptive flash for predictable operations on flash. I am not sure if others have done this, but just admit a lot of your DirectFlash story is marketing and move on.

      I agree that you have revolutionized the front end by abstracting out the back end. Violin goofed up on the higher levels of data management because they didn't have the insight, but they did DirectFlash before Pure. Anyone from Violin Memory's former management reading this: why you didn't buy DataCore for SANsymphony is beyond me.

      1. Anonymous Coward

        Who else does flash storage QoS at the flash die level? Isn't this done at higher levels because it's akin to worrying about machine code in this day and age?

      2. Anonymous Coward

        Too funny you say there is no innovation. Did you read: https://www.theregister.co.uk/2017/03/13/pure_drops_to_fifth_overtaken_by_ibm/?

        Pure coming in from nowhere and taking 5th place in flash is an endorsement from customers who trust their products.

        1. ManMountain1

          Page doesn't exist! I think you mean

          https://www.theregister.co.uk/2017/03/13/pure_drops_to_fifth_overtaken_by_ibm/

          Pure coming from nowhere? AFAs are relatively new, and arguably Pure was one of the first to market! And the article says the current 10% marks the 3rd period in a row in which its share has dropped. That suggests that in recent times, since the other vendors have got their acts together, Pure aren't doing so well?

          1. Anonymous Coward

            Makes sense. Everyone in this space is playing on IT's nostalgia for technologies. If all things remain the same, people will choose a known path over an unknown one, even if the future tells them to shift.

            Devil you know vs. devil you supposedly don't.

            Fibre Channel is the 'coal-burning' or fossil-fuel-based power plant of the storage world. iSCSI and NAS might be considered nuclear power -- efficient, but misunderstood because some vendors purposely screw up implementations to protect Fibre Channel, much as coal and fossil fuels are protected.

            1. Anonymous Coward

              Apples and oranges. Block storage by definition has far less overhead than iSCSI and NAS (which lay their protocols on top of block); neither is particularly efficient compared to native SCSI-block-on-FC, especially when it's well-tuned and multipathed. They're more convenient but always slower. There's also something to be said for the stability and maturity of all three protocols vs. the wild new world... it's nice to be at the bleeding edge, but I'm not sure I want my flight tracking, medical records, or paycheck to be stored on the latest-and-greatest.

              1. Anonymous Coward

                Disagree! If we adopted that same methodology, Google, Facebook and Twitter programmers would exclusively use Assembly, or at least C, and no one would use 4G languages. Ongoing education is important for IT personnel on what the alternatives are (and not alternative facts). Keep things stable, but not stuck in a time loop.

                1. Anonymous Coward

                  In the best of all worlds, those programmers would use Assembly and/or C for tasks where they're appropriate and 4G languages for tasks where they are appropriate. Same thing here - use NFS / iSCSI where you need attributes peculiar to those protocols (for example, cross-platform compatibility, IP transmission) and block where you need attributes peculiar to block (for example, raw speed).

                  Point being that dismissing FC and SCSI as "fossil fuels" while touting iSCSI and NAS as "nuclear power" makes as much sense as dismissing C and Assembly while relying solely on 4G languages (some of which were written in C, and all of which eventually devolve into assembly code).

                  The idea is to use the best tool for the job - and sometimes, that means heading down the stack, sometimes up.

                  1. Anonymous Coward

                    Good analogy, but saying NFS and iSCSI can't do performance is short-sighted. Yes, there is CPU overhead associated with NFS, and SW initiators take CPU cycles, but that is minuscule compared to what is now available with 10+ cores per socket. If you are not a fan of using IP for storage, then that's a personal choice.

                    All that aside, FCP and iSCSI are both ways to carry SCSI over FC or Ethernet. What cracks me up is FCoE, replacing layers 0 and 1 with Ethernet. It makes sense if you are trying to standardize on an Ethernet fabric and don't want to spend on a new SAN, but not for net-new deployments.

                    1. Anonymous Coward

                      Not just CPU cycles... a significant chunk of network bandwidth too. I don't have a problem with IP - it's another transport - but there's too much bloat in the upper-layer protocols vs. raw block to excite me about either one in terms of performance.

                      FCoE? Uh, yeah... OK for legacy and transition cases, but I'd agree that it's rather silly for anything else.

  6. finnic

    Just wondering

    Chris, (assume I'm that blind man touching one part of the elephant) I see a proprietary flash drive base card that has raw flash, NVMe, a circuit board with other components, a heat sink, and no flash translation software/firmware on the board.

    Isn't/wasn't this one of the logical next steps to get rid of the old hard drive rules? Wasn't someone going to whip up a board like this? Couldn't something like this be created as a generic NVMe flash drive?

    I understand Pure's board is proprietary, but wouldn't/couldn't a generic version become a flash standard available for everyone to use, gaining high-volume economies plus speed and parallelism? Doesn't it make sense to scrap the existing flash drive designs and switch to something that allows each company's software to talk directly to the flash via NVMe?

    1. sivant

      Re: Just wondering

      There is such a thing - it's called "Open-Channel SSD", and it's the attempt to standardize this.

  7. sivant

    On the space efficiency issue

    After someone corrected the mistake in the original article (why didn't Pure Storage bother to get it corrected in the original text?), it turns out that Pure translates 366TB of raw flash into 1PB of effective capacity. Assuming the data reduction they use for that calculation is 5:1, the physical capacity they produce from the 366TB of raw flash is 200TB, which means a space efficiency of ~55%. Indeed not the best (though not the worst either).
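
    (The same calculation, written out with the figures used above:)

        # Space-efficiency arithmetic from the comment above.
        raw_tb = 366              # 20 x 18.3 TB DirectFlash modules
        effective_tb = 1000       # Pure's 1 PB effective-capacity claim
        assumed_reduction = 5     # 5:1 data reduction assumed for that claim
        physical_tb = effective_tb / assumed_reduction      # 200 TB of usable physical capacity
        efficiency = physical_tb / raw_tb
        print(f"Usable/raw efficiency: {efficiency:.0%}")   # ~55%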

    One thing should be noted: they use raw NAND flash for that calculation. Flash array vendors that use SSDs (as opposed to flash modules) use the SSD capacity as their "raw capacity", ignoring the over-provisioning of NAND flash inside the SSD. But someone has to pay for that extra flash storage!

    If all AFA vendors included the SSD over-provisioning in their space-efficiency calculations, Pure's would look better. FWIW, their claim that DirectFlash saves flash over-provisioning is technically sound.
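
    (A hypothetical illustration of that accounting difference; the 15% hidden over-provisioning figure and the 2.5TB usable figure are assumptions for the sake of example, not any vendor's published numbers:)

        # Hypothetical example: how counting (or ignoring) SSD-internal
        # over-provisioning changes the apparent usable/raw efficiency.
        ssd_labelled_tb = 3.84        # capacity the SSD vendor exposes (illustrative)
        assumed_op = 0.15             # assumed hidden NAND over-provisioning (illustrative)
        ssd_nand_tb = ssd_labelled_tb * (1 + assumed_op)

        usable_tb = 2.5               # whatever the array leaves usable per drive (illustrative)
        print(f"vs labelled SSD capacity: {usable_tb / ssd_labelled_tb:.0%}")  # looks better
        print(f"vs actual NAND inside:    {usable_tb / ssd_nand_tb:.0%}")      # the basis Pure's figure is judged on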

    The bottom line that counts is the price (or TCO) per TB, and the effective/raw ratio is only part of that deal.
