I think you could just use locality-sensitive hashing to select weight vectors for your deep neural net. Maybe the network uses a few million weights at a time out of, say, a trillion. I don't think the hardware to put, say, 1024 16 GB flash chips in parallel is so difficult. That would pull between 0.5 and 5 kW. The technical question is how to avoid clustering in lower dimensions (so that the data is spread out evenly over the memory cells) while allowing clustering in higher dimensions. That's pretty easy to deal with. I guess the XPoint chips would give you a 10x speed boost if you were to go down that road.
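A minimal sketch of the idea (my reading of it, not necessarily SeanC4S's actual scheme): random-hyperplane LSH buckets the weight vectors offline, and at lookup time only the bucket matching the input is fetched from the big, slow memory. All sizes here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 128       # input dimensionality
N = 20_000    # weight vectors held in big, slow memory (stand-in for flash)
K = 8         # random hyperplanes -> 2**8 buckets

# Random-hyperplane (SimHash) LSH: vectors that hash to the same bucket
# tend to have high cosine similarity.
planes = rng.standard_normal((K, D))
weights = rng.standard_normal((N, D))

def bucket_of(v):
    """Hash a vector to a bucket id from the signs of its K projections."""
    bits = (planes @ v) > 0
    return int(bits @ (1 << np.arange(K)))

# Offline: hash every weight vector once and index them by bucket.
codes = ((weights @ planes.T) > 0) @ (1 << np.arange(K))
buckets = {}
for i, c in enumerate(codes):
    buckets.setdefault(int(c), []).append(i)

# Online: fetch only the weights in the input's bucket - a small
# fraction of N rather than the whole table.
x = rng.standard_normal(D)
active = buckets.get(bucket_of(x), [])
print(f"active weights: {len(active)} of {N}")
```

With uniformly spread buckets, each lookup touches roughly N / 2**K of the weights, which is the "spread the data evenly over the memory cells" property the comment mentions.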
Intel's XPoint marketing is such frenetic, hype-filled BS that it is setting up the world to be utterly underwhelmed by the reality. We have had a mini deluge of XPoint memory chip news recently, with Frank Hady – Intel Fellow and Chief 3D XPoint Storage Architect – giving a pitch at the 7th Annual Non-Volatile Memories …
Monday 18th April 2016 22:03 GMT mundiax
Hi SeanC4S, I'm thrilled to finally see someone talking about the software consequences of this tech. Here's my question: what kind of specs do you think it would take for a convolutional net to run in real time instead of in batches? If 3DXP converges server architecture to a real storage-class memory (mind you, this may take 3-5 years or more), how many terabytes per core would it take to actually allow all of this to happen in memory? Furthermore, if storage-class memory becomes a reality, is there any need for batch operations (à la standard Hadoop) in distributed jobs at all, or can everything become streaming (à la Spark, Storm, etc.) since we could just put all the files into shared data structures and keep them there forever?
Monday 12th September 2016 23:26 GMT Anonymous Coward
Benefit of XPoint for Neural Networks
What does SeanC4S mean by "allow clustering in higher dimensions" and why would you want to do that?
During neural network training, the improved endurance of XPoint vs NAND Flash is useful. Once the neural network is trained, it seems to me that the weights could be streamed in sequentially from NAND Flash. I don't understand how there would be a "10x speed boost" from XPoint.
Saturday 16th April 2016 06:25 GMT Anonymous Coward
Is the author just trolling?
I have serious problems with this article, which honestly just comes across as a petulant, whining hatchet job against Intel and X-Point. What's the matter, Chris Mellor, did someone from Intel p*ss in your cornflakes? Or is it just that they haven't given you an Optane drive to review yet?
This is one of the nastiest, most bitter articles I've seen on the Register. I try to always assume the best of people, but in this case I do wonder if there's an agenda behind it.
Your main complaint seems to be "although I haven't personally tested it yet, I don't think version 1.0 of a radically-different new product will have the performance that the underlying technology is capable of". So? That's news?
Look, there are mature, intelligent ways to point that out. Phrases like, oh I don't know, "Don't get too excited about X-Point yet, because *we believe* the initial drives will not have the performance Intel claims" would be a perfectly reasonable way to express your opinion. And your readers could argue for or against that viewpoint, like grown-ups.
Unpleasant references to "soiled diapers" and "shiny brown and creamy BS" just lower the tone and make you sound like a 12-year-old obsessed with bodily functions.
I've read your past articles. You can do better. It ill behoves you to write stuff like this.
Saturday 16th April 2016 07:42 GMT DougS
Re: Is the author just trolling?
Your unpleasant references to "pissing in cornflakes" and your claims that the author is petulant, whining, nasty, bitter and 12 years old don't show you as any more mature than him.
While he could have done a better job of making his case (it read like a Charlie Demerjian article over at semiaccurate) his point is quite correct. The hype around "1000x" better has led a lot of people to assume that XPoint will become a new tier beyond flash, but the actual difference appears to be much smaller.
The numbers in the article look pretty much identical to SLC NAND performance, so unless it can beat SLC in cost it may not have any impact on the market. Sure, that's gen 1, and it can be further improved, but NAND isn't standing still either. Unless future generations of XPoint prove suitable for 3D it won't be able to compete with NAND in either density or price.
Saturday 16th April 2016 08:32 GMT bri
Re: Is the author just trolling? @DougS
Well, in my book "10x faster" is a different term from "identical". Especially in 1st gen tech.
Although I do share the article's broad view that these are somewhat misleading overstatements, I too find the tone, and indeed the contrast between the facts presented and the article's conclusions, jarring.
Saturday 16th April 2016 08:55 GMT Roo
Saturday 16th April 2016 07:50 GMT Novex
I started to get worried when I read the Tom's Hardware article about the presentation - which mentioned that the Optane memory was in an SSD being tested over Thunderbolt 3. I thought, "Why do a presentation in such a constrained setup? Surely a proper motherboard-based test would show off its capabilities better?" Then the penny dropped that they might be trying to hide something.
We'll have to wait for real product to know for sure, but my hopes for this new memory have been significantly lowered.
Saturday 16th April 2016 08:13 GMT Steve Medway
Not convinced by the article, not convinced by Intel either. 2nd Gen X-Point in a DIMM slot format would be around x1000 faster than current storage.
Does the author not realise that none of the current storage buses has the bandwidth or low enough latency for X-Point to perform to its promise? Attach it directly to the memory bus and things could be golden.
That's marketing BS for you. Intel's engineers no doubt hate the marketers; they're probably quoting numbers for a product still under NDA.
Sunday 1st May 2016 14:24 GMT Alan Brown
"2nd Gen X-Point in a DIMM slot format would be around x1000 faster than current storage."
Would it be 1000x faster than existing NAND-in-DIMM-slot formats?
Comparing apples to mango juice is par for the course in IT. Sometimes it takes an article like this to point out the real differences.
Flash got much _much_ slower as it got smaller (an inevitable byproduct of smaller cell sizes), but 3D flash allows cells to grow again while still achieving the required packing densities. Up to now there's been no hard push for lower latencies in NAND (50k IOPS is more than enough for most purposes), but XPoint means manufacturers will probably start paying attention to this aspect.
The elephant in the room is DRAM latency. Random-access latencies are not much better than they were 20 years ago (less than an order of magnitude), with all the advances coming from predictive fetches and wider access buses. Until DRAM gets down from 50-60ns for a random memory access (what do you think all those wait states really mean?) to less than 1ns, we're not going to see much improvement wrought by faster storage-memory. If you _really_ want to see GHz computing (vs marketing hype) then memory latencies need to be down under 100ns.
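To put that latency complaint in numbers (figures assumed for illustration, not measured): at a GHz-class clock, a single random DRAM access stalls the core for on the order of a couple of hundred cycles, which is why raw clock speed says little about real throughput.

```python
# Back-of-the-envelope: core cycles lost to a single random DRAM access.
# Figures assumed for illustration: ~60 ns DRAM latency, 3 GHz clock.
dram_latency_ns = 60
clock_ghz = 3.0                              # 3 cycles per nanosecond
stall_cycles = dram_latency_ns * clock_ghz
print(f"~{stall_cycles:.0f} cycles stalled per cache-missing load")  # ~180
```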
Saturday 16th April 2016 08:27 GMT Ian Emery
Figures from the original development pitch??
This product's expected speed probably WAS 1,000 times faster than the industry average when they started developing it.
This week's hyper-expensive component is next week's mid-range star, and next month's bargain-bucket remainder.
I miss Paris, I haven't seen a new leaked nude video/photo in days.
Saturday 16th April 2016 09:30 GMT Bronek Kozicki
Saturday 16th April 2016 09:49 GMT Bronek Kozicki
Looking into this further, I am very confused about what this Intel PC3700 drive the author speaks of actually is. It doesn't help that I used the wrong name myself; it should have been "DC P3700". Anyway, looking at page 10 of the PDF linked above, the IOPS of this drive is between 75k and 150k in the worst case (i.e. pure write or 8k read/write; the actual number depends on capacity). This is way above the 15k number stated in the article.
Saturday 16th April 2016 09:48 GMT Roo
Saturday 16th April 2016 09:55 GMT Anonymous Coward
Dude. You apparently have little or no experience actually using SSDs or storage devices. 78,000 IOPS at A QUEUE DEPTH OF ONE is insane. If it scales, even horribly, under heavy load it is going to be absolutely amazing. You do understand that they spec storage devices at QD256, right? This is QD1. Faster than a SATA SSD under (literally) the lightest possible load you can assign to the device.
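For context, queue depth, latency and IOPS are tied together by Little's law (IOPS = queue depth / average latency), so a QD1 figure pins down the per-IO latency directly. A quick sanity check of the number above:

```python
# Little's law for storage: IOPS = queue_depth / avg_latency.
# At QD1 the quoted IOPS figure implies the average per-IO latency.
iops_qd1 = 78_000
avg_latency_s = 1 / iops_qd1          # seconds per IO at queue depth 1
print(f"{avg_latency_s * 1e6:.1f} us per IO at QD1")  # prints "12.8 us per IO at QD1"
```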
Saturday 16th April 2016 10:48 GMT Trollslayer
Saturday 16th April 2016 12:28 GMT Bronek Kozicki
one last point
There's been so much speculation about what XPoint actually is. Well, it might be cross-point memory, pretty well documented a few years ago here. Unity Semiconductor, where this research was conducted, was acquired by Rambus in 2012, and a year later Micron and Rambus signed an agreement giving Micron access to all Rambus patents (which would include the cross-point IP), details here. The wording used ("... granted to Micron and its subsidiaries") would also explain why the XPoint venture is majority-owned by Micron.
Saturday 16th April 2016 12:43 GMT G2
NVIDIA BS reloaded
I was having a déjà vu moment when reading this article... did Intel hire NVIDIA's BullShittin^H^H^H^H PR department?
also, this article is reminding me soooo much of Charlie Demerjian's style from the good old INQ days - currently he's writing for semiaccurate but his best articles there are paywalled.. booo :(
Saturday 16th April 2016 16:35 GMT Anonymous Coward
Measurements of IOPS and latency are beholden to the controller. For instance, two SSDs with the same NAND can post wildly different performance numbers if they have different controllers. In ALL flash and memory-based storage products, the controller, not the media, is the limiting factor.
3D XPoint can program and erase each bit separately, which is not possible with a NAND-based SSD. This also means there is much more computational overhead associated with the process. The fact that this is the first generation of controllers designed for 3D XPoint means the controller is even more limiting in this case.
A very bright storage engineer opined that if we took today's controllers and slapped them on yesterday's flash, we would be astounded by the performance of the older flash lithographies, because they are faster and have greater endurance. The media technology, however, is moving much faster than the controller technology.
The point of this ranting? Performance comparisons of an end device are pointless; they are not an indication of the performance of the underlying media.
Sunday 17th April 2016 01:59 GMT jmbnyc
One potential overstatement generates complete nonsense
Your commentary and analysis are a bit mind boggling. If Intel failed to deliver 1000X but ended up at 10X or greater on latency and IOPS I would argue that I'd take it every day of the week with a big smile on my face.
In modern systems, with CPUs galore and 100Gb NICs around the corner, storage latency and throughput are still huge bottlenecks that developers waste time trying to work around, often at the expense of fault tolerance. A 10X reduction in storage latency via NVMe is hugely welcome and will allow great programmers to push the envelope of what systems can accomplish.
I don't think you should be bashing the numbers; instead you should be complaining about the delivery dates, which seem to keep being pushed back and are never definitive. I, like many fellow developers, wanted this stuff the day it was announced, and waiting over a year to get something is worth complaining about. >= 10X on NVMe is surely not something to complain about, regardless of whether Intel said 1000X. Lastly, I would bet the number of XPoint DIMMs shipped will far exceed the NVMe numbers.
Wednesday 20th April 2016 08:33 GMT Anonymous Coward
Re: One potential overstatement generates complete nonsense
The first 100Gb NICs are already here and the first samples actually arrived at the end of 2014. Mellanox has been shipping them for a while with support for both 100Gb Ethernet and EDR InfiniBand. Other vendors also have announced (and possibly already sampled) 100Gb/s adapters.
(Disclosure: I work for Mellanox.)
Sunday 17th April 2016 18:22 GMT Howard Hanek
Companies Don't Think Only About 'Faster'
Users should not expect the equipment their employer recently purchased to be replaced any faster, no matter how much 'better' ANY new technology is. All the frenetic marketing will accomplish little over the short term and will probably just drag Intel's results down.
Monday 18th April 2016 15:03 GMT returnofthemus
Dell buying EMC the best thing to happen to Chris Mellor
I've watched so many industry pundits go absolutely gaga over this ever since it was announced, somewhat as with Racetrack Memory, and all it has done is raise more questions than answers. As with all announcements of this type, we've seen nothing apart from PowerPoints, and that's when you know how desperate they have become.
Great work on exposing this phoney technology and keep up the good work ;-)
Monday 18th April 2016 17:19 GMT Rob Isrob
Roj is spot-on
I had a long-winded reply but realised: why do all the digging? What Intel did was a bit disingenuous. If you go back to the link Chris mentions, they are comparing XPoint to flash - no doubt comparing XPoint DIMMs to "flash" SSDs. That comparison should have been 250 ns latency against 80 microseconds, which is a 300x or so speed-up. But if you look hither and yon, you see SSDs that deliver read streams at 250 microseconds - there is your 1000x speed-up. Marketing (in my opinion) should have spoken of a 300x speed-up and explicitly mentioned they were comparing XPoint DIMMs to SSD flash.
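The arithmetic behind those two factors, using the latency figures as stated in the comment above:

```python
# Speed-up factors implied by the quoted latencies.
xpoint_ns = 250          # XPoint DIMM latency, as quoted
flash_fast_us = 80       # a fast flash SSD read, as quoted
flash_slow_us = 250      # a slower SSD read stream, as quoted

print(flash_fast_us * 1000 / xpoint_ns)   # prints 320.0  -> the "~300x" figure
print(flash_slow_us * 1000 / xpoint_ns)   # prints 1000.0 -> the "1000x" figure
```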
Thursday 21st April 2016 00:14 GMT Nimboli
Dig a little deeper
Disclaimer: I am CTO at a storage vendor, but have no direct interest in Intel.
The 1000x improvement over flash performance may be valid for raw latency. Read latency of raw flash is about 50us. That of raw XPoint might be about 0.1us. That is a 500x improvement---close enough for Marketing purposes. The reported latency of 7us is likely due to PCI overhead and the current controller and might be avoidable in DIMM form factor.
Re IOPS: note that the reported IOPS of 78,500 is for queue depth of 1 (i.e., with only one IO pending at a time). That is quite good. In general the IOPS reported for flash SSDs (often over 100,000 IOPS) is at high queue depth. Robin Harris gets it in his blog: http://www.zdnet.com/article/how-intels-3d-xpoint-will-change-servers-and-storage/ .
It is plausible that XPoint has its greatest advantage over flash at low queue depths and in DIMM form factor, and that the advantage diminishes at high queue depth.
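A toy model of that effect (all figures assumed for illustration, not taken from either device's spec sheet): once a device's internal parallelism saturates, IOPS is capped regardless of per-IO latency, so a latency advantage shows up mainly at low queue depth.

```python
# Toy model: a device's IOPS grows with queue depth until its internal
# parallelism saturates, after which IOPS is capped whatever the latency.
def iops(queue_depth, latency_us, max_iops):
    return min(queue_depth / latency_us * 1e6, max_iops)

for qd in (1, 4, 32, 256):
    flash = iops(qd, 90, 500_000)     # assumed flash SSD: 90 us, 500k IOPS cap
    xpoint = iops(qd, 10, 550_000)    # assumed XPoint: 10 us, similar cap
    print(f"QD{qd:3}: xpoint/flash IOPS ratio = {xpoint / flash:.1f}")
```

In this sketch the ratio falls from about 9x at QD1 to near parity once both devices are saturated, which matches the intuition in the comment above.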
Thursday 21st April 2016 14:04 GMT Rob Isrob
Re: Dig a little deeper
"The reported latency of 7us is likely due to PCI overhead and the current controller and might be avoidable in DIMM form factor."
See the SFD9 Intel presentation referenced below. The XPoint media contributes 1us, and PCI accounts for the rest of the 7 or 8.
"Re IOPS: note that the reported IOPS of 78,500 is for queue depth of 1 "
Hard to tell how that came about. But if you peruse the SFD9 presentation, you can see the presenter show 96,800 IO/sec at a queue depth of 1: https://vimeo.com/159589810
"It is plausible that XPoint has the most advantage over flash for low queue depth applications and in DIMM form factor, and that that advantage dimishes at high queue depth."
That's right. Elsewhere in that presentation he speaks to that; it doesn't pay to go beyond a queue depth of 8, or something like that. You can see in the demo a reference where they are doing nearly 160K 70/30 random read/write IOPS (iirc). They must have cranked the queues all the way to 8... So I'm not sure what your point about the diminishing advantage is. Are you envisioning architectural design issues?