Storage with the speed of memory? XPoint, XPoint, that's our plan

Since the virtual dawn of computing, storage – where data puts its feet up when it's at home – has been massively slower than memory, where data puts on its trainers and goes for a quick run. That access speed gap has been getting narrower and narrower with each storage technology advance: paper tape, magnetic tape, …

  1. Mikel

    The future is vertical

    No more Moore in X and Y? Use Z.

    The converged SoC with processor, RAM, storage, networking, I/O and everything else integrated into one low-power chip is fast approaching. And I couldn't be happier about that. It will be disruptive, though.

    1. LaeMing
      Happy

      Re: The future is vertical

      If all the fast stuff is integrated on an SoC, with just a low pin count for connecting to slow peripherals (USB, network PHYs, etc.), I might even be able to go back to making my own computer mainboards!

  2. Mage Silver badge
    Facepalm

    Paradoxically, storage media that doesn’t move is faster ...

    How is it paradoxical?

    Moving discs, drums, and tape can be faster for pure sequential writes, but for RANDOM seeks, random access, multi-threaded multi-user access of small files, etc., they're inherently slower than any storage without moving parts.

    1. Daniel von Asmuth
      Facepalm

      Re: Paradoxically, storage media that doesn’t move is faster ...

      As Zeno pointed out, motion is paradoxical, so most computers run at zero meters per second.

      For sequential access, magnetic tape can be faster than disc, but for random access it's the other way around. The author forgot optical discs, which have lousy seek times.

      1. Michael Wojcik Silver badge

        Re: Paradoxically, storage media that doesn’t move is faster ...

        As Zeno pointed out, motion is paradoxical

        No, it's not. The movement paradox is only a "paradox" until you understand that an infinite series can converge. Not that it's not a useful thought experiment, and good for those Introduction to Philosophy classes; but as a paradox it's a bit lacking.

        Even without that, the movement-paradox argument fails physically when you get to the Planck length, though obviously Zeno had no way of knowing that.

  3. Bronek Kozicki

    few points

    For one, the 200ns DRAM latency is an obsolete figure; it is currently regarded to be between 70ns and 130ns (or sometimes less than that).

    Current Linux versions already support NVDIMM. Actual hardware is currently implemented with supercapacitors and onboard NAND storage, as explained here.

    Related presentations, with fewer details, are here and here.

    Finally, for some speculation on what XPoint (or Cross-Point) actually is, let me point to this post of mine.

    1. Chris Mellor 1

      Re: few points

      Great comment. The presentation decks are informative, the CrossPoint origin note is intriguing, and the DRAM access latency number is being checked.

      Thank you Bronek,

      Chris.

      1. Anonymous Coward

        Re: few points

        You might also want to seek out some current SLC NAND numbers, like a Micron 420h - 42 us read, 7 us write - which compare much more favorably with the SLC Optane XPoint. So unless XPoint is cheaper to make than SLC NAND, it will have a harder time finding a way in. Already SLC is becoming marginalized because MLC is "fast enough" for most uses, so SLC is relegated to very special uses like real time or as a write cache in an MLC drive.

    2. Alan Brown Silver badge

      Re: few points

      "For one, 200ns DRAM latency is obsolete figure, this is currently regarded to be between 70ns - 130ns (or sometimes less than that)"

      A full random-access DRAM read is quite slow (hence "wait states").

      You're correct that worst-case latencies are down around 70-130ns, but for the kind of work I'm dealing with, RAM is a far greater bottleneck than storage. The "storage is slow? Add more RAM!" approach results in some systems having 2TB onboard - but these systems spend most of their CPU cycles twiddling their thumbs waiting on DRAM responses.

      Yes, SSDs and XPoint are a necessary step-change in storage speeds, but memory needs a similar kick in the pants.

      From my point of view, there's not much point in faster CPUs, SCM or faster SSDs until this bottleneck is broken. The majority of compute load is spent on in-memory work, not on storage. SCM is nice but it's not going to result in computing being 10 times faster.

  4. Novex

    SCM for Consumers?

    In my understanding, XPoint seems pretty much redundant in SSD form when connected via SATA, and is under-utilized when on PCIe. With the best use case being a DIMM slot on the memory bus, I wonder if consumer-intended, enthusiast-class motherboards might get XPoint DIMM slots on them... (all I've heard re SCM is that it'll be on server boards, not consumer boards).
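
    For rough scale (back-of-the-envelope figures, not from the comment or the article), the interface ceilings make the point: SATA III tops out around 0.6GB/s, PCIe 3.0 x4 around 4GB/s, and a single DDR4-2400 channel around 19GB/s, so a sufficiently fast medium really is wasted behind SATA:

        # Rough interface ceilings in MB/s (approximate, for illustration only)
        sata3     = 6e9 * (8 / 10) / 8 / 1e6           # 6Gbit/s link, 8b/10b encoding    -> ~600
        pcie3_x4  = 8e9 * 4 * (128 / 130) / 8 / 1e6    # 4 lanes, 128b/130b encoding      -> ~3,938
        ddr4_2400 = 2400e6 * 8 / 1e6                   # 64-bit channel, 8 bytes/transfer -> ~19,200
        print(sata3, pcie3_x4, ddr4_2400)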

  5. Duncan Macdonald

    NVDIMM

    DRAM DIMMs with onboard NAND backup are already available - and they have the speed of DRAM. (The NAND is only used to take a copy of the DRAM data if the main power fails.)

    For XPoint to succeed it must have much higher density than NVDIMM, which beats it for speed.

    1. Michael H.F. Wilkinson Silver badge

      Re: NVDIMM

      If 3D XPoint is a lot cheaper than the DRAM + NAND backup solution, it does become interesting, even if its speed isn't quite as high. Not sure it is cheaper, however.

  6. edward wright

    Orders of magnitude

    ... are powers of 10, so 10ms is nearly 5 orders of magnitude (a factor of 50,000) slower than 200ns, not 2.
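
    For anyone checking the arithmetic, a quick sketch:

        # How many orders of magnitude separate 10ms and 200ns?
        import math

        ratio = 10e-3 / 200e-9               # 50,000
        print(ratio, math.log10(ratio))      # 50000.0, ~4.7 orders of magnitude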

  7. Anonymous Coward

    That table again...

    Where did that table come from? It's not accurate at all. Did it get plucked out of someone's arse?

    Disk seek, 10ms? Is it some disk from the '80s? SEEK times are 2-4ms in general on (rotational) drives.

    DAS access, 100ms!! Is the DAS in another country? Same with the SAN: 200ms for SAN access? Is it located on the other side of the world to the server?

    1. Chris Mellor 1

      Re: That table again...

      Hi Anonymous C .... The table is a much-modified version of a source table someone gave me. I would dearly love more accurate numbers.

      Re disk seek: a Toshiba L200 (2.5-inch, 5,400rpm) has a 5.56ms seek time, so a 10ms seek time as an indicative number for disks doesn't seem that far out. My understanding is that 15,000rpm drives could have 4-5ms seeks, desktop 3.5-inch drives have 9-11ms seek times, and sluggish mobile 2.5-inchers could be as slow as 12ms. Do you have better numbers?

      What would you say would be a better number for DAS access, assuming, say, a 10K rpm SATA drive?

      Ditto SAN access?

      Cheers .... Chris

      1. Pascal

        Re: That table again...

        Enterprise drives will have a 2-4 ms seek time depending on size and rotational speed. You basically won't find anything that's not at least 2 to 3 times faster than 10 ms on an enterprise-class drive (that's sold for speed - those slow "backup" drives are another story).

        But where I agree that the numbers are waaaay off is DAS / SAN. You're looking at those same 2-4 ms drives, with an I/O subsystem that will add *microseconds* worth of latency.

        You can easily have a DAS subsystem that's filled with spinning rust (say, 15k rpm 2.5-inchers) that will have an average access time below 5 ms.

        1. GrumpyOF

          Re: That table again...

          Seek time has nothing to do with rotational speed, as it is a measurement of the time taken to move the read/write heads from one track to another; average seek time is approximately the time taken to move over one third of the tracks on a drive.

          Access time, on the other hand, is a much more realistic and sensible metric, in that it should encompass the mechanical and software-related delays in the IO process and is applicable to ALL types of storage. E.g. finding a tape in a tape library, waiting for an available drive, loading the tape and searching to the correct location for the data on the tape is a LOOOng access time, whereas solid-state devices are really quick.

          Rotational speed is much more about data transfer rates: once the heads are over the required track, how quickly it can transfer data to or from the track.

          1. Pascal

            Re: That table again...

            > Seek time has nothing to do with rotational speed, as it is a measurement of the time taken to move the read/write heads from one track to another; average seek time is approximately the time taken to move over one third of the tracks on a drive.

            Seek times are generally listed to include stroke time / settle time as well as "waiting for the beginning of the track to reach the head once the head is in place". Basically "how long before you can actually read the next thing 1/3 of the drive away". So rotational speed does have an effect on average seek time - although clearly not a huge one.
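
            For reference, the average rotational latency works out to half a revolution, which is where the usual per-rpm figures come from (a quick sketch, not from the thread):

                # Average rotational latency = time for half a revolution
                def avg_rotational_latency_ms(rpm):
                    return 60.0 / rpm / 2 * 1000

                for rpm in (5400, 7200, 10000, 15000):
                    print(rpm, round(avg_rotational_latency_ms(rpm), 2), "ms")
                # 5400 -> 5.56ms, 7200 -> 4.17ms, 10000 -> 3.0ms, 15000 -> 2.0ms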

      2. Anonymous Coward

        Re: That table again...

        "I would dearly love more accurate numbers."

        Seek times for drives: these will be performance drives, as this table is comparing a new performance storage device to current devices.

        An enterprise performance SAS drive, including controller overhead, 10k rpm, as no one will really buy 15k anymore:

        Read seek: average 4.2ms, single track 0.4ms, full stroke 7.68ms

        Write seek: average 4.6ms, single track 0.7ms, full stroke 8.15ms

        Rotational latency: 3ms

        Or, if an SSD is considered a drive: ~0.1ms

        DAS:

        Take those same drives: not taking into account the controller (which will no doubt have cache on board for reads and writes), the DAS adds effectively nothing; with cache, it reduces latency.

        SAN:

        Same as DAS: not taking into account write caching, spinning disks add nothing; SSD is slightly different and can introduce a little more latency. Switching latency for FC is in nanoseconds; for example, a Brocade 300 will have a maximum latency of 700ns.

        Example of performance for a SAN: a 3PAR 7200 using 8Gb FC with a flash tier will give around 0.2ms read/write latency. Write latency will be lower than read latency as writes go to cache first; the same goes for 10k drives, where write latency will be in the region of 0.1-0.2ms, with reads around 5-8ms, and for NL drives 12-15ms.

        Latency will change of course depending upon load, increasing as more requests come in at the same time.
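
        Summing those figures for a single uncached random read shows how little the fabric itself adds (a rough sketch using the numbers above):

            # Back-of-the-envelope uncached random read on the 10k SAS drive above, over FC
            avg_read_seek_ms = 4.2
            rotational_ms    = 3.0
            fc_switch_ms     = 700e-9 * 1000     # Brocade 300 worst case, ~700ns
            print(avg_read_seek_ms + rotational_ms + fc_switch_ms)   # ~7.2ms, before any cache hit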

    2. Alan Brown Silver badge

      Re: That table again...

      > Disk seek, 10ms? Is it some disk from the '80s? SEEK times are 2-4ms in general on (rotational) drives.

      Spinning disk seek times are (typically) 1-2ms for sequential access and 10-14ms for random access, even with 7200rpm disks. No one buys 10-15krpm drives anymore. They're so expensive that SSDs make more sense, and in any case the random access latency was only reduced by 1-2ms at best (arm swing rates being the majority factor, not rotational speed).

      > DAS access, 100ms!! Is the DAS in another country?

      Xyratex FC2404 storage arrays are about this fast at full access speed - primarily due to an incredibly slow RAID controller (Fibre-connected).

      > Same with the SAN: 200ms for SAN access? Is it located on the other side of the world to the server?

      That depends on the SAN. The Red Hat cluster storage system (GFS2) I just ditched had about 120ms access latency per file, thanks to all the nodes having to agree on access to any given block on the FC storage. The iXsystems ZFS NAS which has replaced it is _much_ faster.

  8. jmbnyc

    Chris,

    Well, I guess some of the comments on your last XPoint rant article (... Intel's XPoint emperor has no clothes, only soiled diapers) impacted you enough to write something a bit more objective and useful for your audience. I do high performance programming for a high living, and thus XPoint (when it hits the commercial market) will be gladly welcomed even if the initial perf is only 10X better than current top-of-the-line NVMe PCIe SSDs. When SCM XPoint DIMMs arrive, we programmers can then begin to figure out how having access to reasonably fast persistent memory changes how systems are designed. Perhaps we can begin to get away from the current trend of complex quorum schemes.

  9. Steve Chalmers

    Real story is changing relationship between application code and storage

    Chris,

    Thank you for a very good analysis piece, which you had to put together from not enough public data. This kind of analysis has to use rough numbers, as you did. The comments of those who want everything accurate to two significant digits are valid, but Xpoint (and its competitors) are still evolving.

    jmbnyc in his comment raises an important issue which I think has been lost in the focus on Intel's excellent marketing of Xpoint. Byte addressable persistent main memory in computers enables a fundamental change in how applications (either directly or through databases, file systems, etc) cause a particular data item to become durable.

    It is applications which use some new and different approach to how they read and write persistent data that will deliver the payoff. The way the persistent bits are stored is just an enabler.

    @FStevenChalmers

    (speaking for self, not employer)
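
    For a flavour of that change, here is a minimal sketch of application code on byte-addressable persistent memory, assuming a hypothetical DAX-mounted filesystem at /mnt/pmem backed by an NVDIMM or XPoint DIMM; real persistent-memory code would normally use a library such as PMDK with explicit cache-line flushes, which plain Python cannot express:

        import mmap
        import os
        import struct

        # Hypothetical file on a DAX-mounted, persistent-memory-backed filesystem
        fd = os.open("/mnt/pmem/records.bin", os.O_RDWR | os.O_CREAT, 0o600)
        os.ftruncate(fd, 4096)
        buf = mmap.mmap(fd, 4096)

        # Update a record with ordinary stores rather than write()/fsync() block I/O
        struct.pack_into("<QQ", buf, 0, 42, 7)

        # Durability point: flush the dirtied range (msync under the hood)
        buf.flush()
        buf.close()
        os.close(fd)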

  10. DrBandwidth

    Hopelessly inaccurate numbers.....

    Minor: As noted above, unloaded DRAM latencies in typical two-socket server systems are in the range of ~85 ns (local) to ~120 ns (remote). These have been increasing slightly over time as the number of cores increases and the core frequencies decrease. Latency under load can be much higher, but in such cases the throughput is typically more important than the latency.

    Major: It is a bit difficult to read the chart showing "Price per gigabyte" vs "bandwidth", but if I am interpreting the axes correctly, then the values are off by more than an order of magnitude. For DRAM, the chart shows "price per gigabyte" in the range of $30 to $400, centered somewhere between $100 and $200. This is ridiculous. DRAM wholesale chip costs are in the range of under $4/GiB (4Gib DDR4) to about $5 (8 Gib DDR4), leading to *retail* prices for registered ECC DIMMs in the range of $6/GB (for 16GiB and 32GiB DIMMs). It looks like the SSD pricing is off by almost as much...

    Major: What is the y-axis of that chart supposed to mean? "Bandwidth" per what? Price per gigabyte is a reasonably well-defined concept, but "bandwidth" by itself can be interpreted so many ways that it is almost meaningless. The chart shows DRAM "bandwidth" in the 1000 MB/s to 10,000 MB/s range. The low end of this range roughly matches the bandwidth of a single DDR4/2133 DRAM chip with a x4 output configuration, while the high end of the range is only a bit more than 1/2 of the bandwidth available from a DDR4/2133 DIMM.
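
    The DDR4-2133 arithmetic behind that comparison, as a quick check:

        # Peak DDR4-2133 bandwidth: 2133 MT/s multiplied by the interface width
        mt_per_s = 2133e6
        x4_chip_MBps = mt_per_s * 4 / 8 / 1e6    # single x4 chip -> ~1,066 MB/s
        dimm_MBps    = mt_per_s * 64 / 8 / 1e6   # 64-bit DIMM    -> ~17,064 MB/s
        print(x4_chip_MBps, dimm_MBps)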

    1. Chris Mellor 1

      Re: Hopelessly inaccurate numbers.....

      In reply to Dr Bandwidth,

      New table uploaded for your minor point. The major points relate to a chart created by Jim Handy of Objective Analysis (http://objective-analysis.com) and meant, in my understanding, as an indicative general table showing the relative values of various memory and storage media in the 2-dimensional space defined by the two axes. I'd suggest you take up the detail questions you have with Jim.Handy (at) Objective-Analysis.com.

      Cheers ... Chris.

  11. batfastad

    Your SAN...

    ... sucks if you're seeing 200ms latency!

  12. treynolds

    Storage at the speed of memory?

    A company called DataCore appears to have done just that. Their last SPC-1 run was, shall I say, more than slightly disruptive to the storage paradigm. What they did in 4U took six racks or more for the other guys. And if that wasn't enough, they achieved 100-microsecond latency at 100% load. I can only imagine what is coming next.
