39 posts • joined 9 Jul 2011
Is this The Register or The Onion?
Businesses using disk drives in 2018?
This article was supposed to come out Tuesday wasn't it?
The all-flash version “can achieve ......
only 150K IOPs, in other words it's a dog.
They think they are so clever pretending they can hit a measly 1M IOPs, and that it only deserves a footnote that it took 8 units to get there.
This habit of referring to the performance of half a rack or more worth of equipment as if it were a single product is actually just pathetic.
Didn't they just say how reliable their drives were.....
"Ask SolidFire about the failure rates of consumer grade MLC flash memory and why they are not using eMLC for enterprise SSD resilience. XtremIO’s use of enterprise grade eMLC flash provides a major advantage in flash reliability and superior endurance from XtremIO write abatement techniques."
Guess drive failures are only bad when they happen to someone else?
Oh and they have RAID-6 but only protection against a single drive failure at GA (how you do that I can't imagine) and they have no spares, so when you go to service the failed drive that means you better pull the right one of the 25 SSDs in that UNPROTECTED array, or.... poof goes the array.
This is a GA product?
Right and there are no flash economies of scale?.... oh wait what is that other chip in the cell phone... it's a NAND flash chip. A 64 Gb per die NAND flash chip... that Micron PCM chip next to it?... 128Mb, 500X smaller, it might get better, it should catch DRAM but it's not going to catch NAND, different cell sizes. So kindly take your unsupportable FUD and sling it somewhere else.
And it would be horrible to build something really awesome today when random people are claiming that in a time frame greater than the replacement schedule for the product that maybe something better will come along, and it will obviously be so much better, but apparently if you optimized for one kind of memory technology none of that effort would be of any use in any other memory technology, assuming of course that out of nowhere this other technology instantly replaced NAND, which of course it isn't going to.
The bigger problem with FUD slinging like this is that if you actually looked at even the public semi roadmaps you would know your vision of the future isn't, but you just want to scare people into not buying something better now. Not impressed.
I wonder if you just don't understand how much more life flash has to it, or how far from real world use memristors or PCM are... or if you know and just don't care because you work in the marketing department of HP?
I mean HP has given up sliding the memristor availability date forward to always be "some time next year" and has now moved it several years out. Which basically says they have no clue at all when or even if it will be commercially viable.
As far as using any of these things as if they were "conventional memory", that's gonna happen for a count-on-your-hand number of applications; there is a difference between using memory to build storage, and using storage to build memory. I understand the latter idea is getting lots of buzz, but that doesn't make it a good idea.
That's crazy talk!!!
using SSDs I mean. Now storing all the metadata on a nice flash array, that would make some sense.
While we are on the subject of woven fabrics, holes and magnetic storage...
Let us not forget the magnetic core memory
Also Chris was not wrong in saying the RAMAC had one head per platter, it just wasn't the RAMAC 350 (pictured) but the 353
which as you can see here does have one arm per platter
Chris just needs to use the correct image, his statement is correct.
What a load of........
Seriously I wish companies would at least make claims that were even remotely realistic. The reason some flash over PCIe product has a read latency of 100us is.... well.... because... you know.... it takes 99us to perform the actual read from flash, going across that PCIe link adds a whopping 1us of extra latency. This TeraDIMM thing could be attached to an infinitely fast memory bus, it would only reduce the latency from 100us to 99us.
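To put numbers on that, here is a minimal sketch of the latency arithmetic. It uses only the 99us and 1us figures above; the zero-latency bus is a hypothetical best case, not a claim about any real memory channel.

```python
# Figures from the post: ~99 us for the flash read itself, ~1 us for the PCIe hop.
FLASH_READ_US = 99.0
PCIE_LINK_US = 1.0


def total_read_latency_us(media_us: float, interconnect_us: float) -> float:
    """End-to-end read latency is the media time plus the interconnect time."""
    return media_us + interconnect_us


over_pcie = total_read_latency_us(FLASH_READ_US, PCIE_LINK_US)
# Hypothetical best case: an interconnect that adds no latency at all.
ideal_bus = total_read_latency_us(FLASH_READ_US, 0.0)
improvement_pct = 100.0 * (over_pcie - ideal_bus) / over_pcie

print(f"PCIe: {over_pcie:.0f} us, ideal bus: {ideal_bus:.0f} us, "
      f"gain: {improvement_pct:.0f}%")
```

Even with the interconnect cost driven to zero, the read is dominated by the flash itself.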
It can't be that the whole of the flash space is just memory mapped, the TLBs in almost all servers aren't designed for that kind of physical address space, and the DRAM controller will expect responses to all requests on fixed timing, there is no way for a DIMM to go "excuse me can I get back to you, that address you asked for is not in my cache".
And then there is the added power consumption on the DIMM slots that the servers were never designed for. And there is no protocol to tell a DIMM that the power is going out, or to prevent a bit of bad software scribbling all over your "storage".
A graph of latency under mixed read write load, to a wider range of addresses than the 2 or 3 cached addresses probably used to make that latency graph might also be informative.
At the end of the story they admit the flash has to be accessed through a driver stack or as a swap device through the OS so again the latency is going to be no better than other PCIe flash devices. In all likelihood the real performance will be worse than other PCIe card flash storage since the power and physical space constraints are going to limit the kind of processing the flash controller can perform to be a lot less than what a controller sitting in a 20W PCIe slot can do.
I know the idea seems cool, but really these TeraDIMMs are TeraDUMM.
I'm sorry did you say something?
Marketing people who post as AC should at least be sure they get it right
Did someone say managed as a single entity?
Yes we let you manage the orchard rather than the individual trees.
Obviously some new use of the word "array" I was not previously aware of..
Comparing a 100 node config to single node configs of the competition is only "apples to apples" in the way comparing an apple orchard to a basket is "apples to apples".
100 1U nodes is two and a half full racks, the same number of U worth of Violin arrays would have 30 Million IOPs of performance, four times SolidFire's 7.5M.
How do you like them apples, eh?
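For anyone who wants to check the orchard math, a small sketch comparing IOPs per rack unit. It uses only the figures quoted above (100U, 7.5M and 30M IOPs); the function name is just for illustration.

```python
def iops_per_rack_unit(total_iops: float, rack_units: int) -> float:
    """Performance density: IOPS delivered per U of rack space."""
    return total_iops / rack_units


RACK_UNITS = 100          # 100 1U nodes
SOLIDFIRE_IOPS = 7.5e6    # quoted for the 100 node config
VIOLIN_IOPS = 30e6        # quoted for the same number of U of Violin arrays

solidfire_density = iops_per_rack_unit(SOLIDFIRE_IOPS, RACK_UNITS)
violin_density = iops_per_rack_unit(VIOLIN_IOPS, RACK_UNITS)
ratio = violin_density / solidfire_density

print(f"SolidFire: {solidfire_density:,.0f} IOPS/U")
print(f"Violin:    {violin_density:,.0f} IOPS/U")
print(f"Ratio:     {ratio:.1f}x")
```

Normalizing by rack space is what makes the comparison apples to apples; comparing a 100 node total against a single array does not.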
Re: And the write performance of the XS1715 is?
Not even mentioned for the XS1715
Not even sequential write numbers for the XS1715
Going to guess that means they are so horrific that they would rather leave us guessing than even just quote sequential write bandwidth much less random IOPs for the XS1715
Mixed performance numbers? Yeah I'm guessing those might not be so great either for the XS1715
There I fixed my post... happy now?
And the write performance is?
Not even mentioned.
Not even sequential write numbers?
Going to guess that means they are so horrific that they would rather leave us guessing than even just quote sequential write bandwidth much less random IOPs.
Mixed performance numbers? Yeah I'm guessing those might not be so great either
So wrong I don't even know where to begin.
Frumious' explanation is so wrong in so many ways. You don't have to read in the rest of the page and write it back to change one user data block's portion of the page, you write the changed user block along with other changed user data to a new page. No one (I hope) actually rewrites whole flash blocks or pages just to change one piece of user data any more, maybe 10 years ago they did.
And all log structured systems do is trade garbage collection performed by the SSD for log cleaning performed by the log file system. If write amplification is defined by how many times the flash is written divided by the number of user write operations, it doesn't matter that the log system writes sequentially, it still will rewrite the flash multiple times.
It also doesn't look like Frumious really knows how log systems work (FYI, I had my grad OS course back in the day from one of the BSD LFS authors, so I have a little bit of a clue here). And the last two paragraphs seem like just so much gibberish I almost wonder if it was produced by some automated buzzword generation algorithm.
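The definition above can be sketched in a few lines. The 100GB and 150GB figures are made-up illustrative numbers, not measurements from any product; the point is only that relocated data counts the same whether an SSD garbage collector or a log cleaner moved it.

```python
def write_amplification(flash_gb_written: float, user_gb_written: float) -> float:
    """Total data written to flash divided by data the user asked to write."""
    if user_gb_written <= 0:
        raise ValueError("user_gb_written must be positive")
    return flash_gb_written / user_gb_written


# Hypothetical workload: the user writes 100 GB, and reclaiming space
# forces 150 GB of still-live data to be copied to new locations.
user_gb = 100.0
relocated_gb = 150.0

# Case 1: the SSD's garbage collector does the relocation.
wa_ssd_gc = write_amplification(user_gb + relocated_gb, user_gb)

# Case 2: a log structured file system's cleaner does the same relocation.
# The writes are sequential, but the flash is still written the same amount.
wa_log_cleaner = write_amplification(user_gb + relocated_gb, user_gb)

print(wa_ssd_gc, wa_log_cleaner)
```

Both cases come out identical because the metric counts bytes written to flash, not who issued them or in what order.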
Oh and as for a 7 year guarantee, what business is going to keep around a system which was fully depreciated and which can be replaced by a new system taking up 100 times less space and power? Think about it, with density basically doubling each year in 7 years you will store in 1U what today takes 2 full racks.
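The doubling argument works out as follows. This is a rough sketch: the yearly doubling is the premise stated above, and the 42U rack height is my assumption for turning rack units into racks.

```python
def density_growth(years: int, factor_per_year: float = 2.0) -> float:
    """Cumulative density improvement after compounding for `years` years."""
    return factor_per_year ** years


growth = density_growth(7)  # seven doublings

# Assumption for illustration: a full rack is 42U.
RACK_UNITS_PER_RACK = 42
racks_of_today_per_future_u = growth / RACK_UNITS_PER_RACK

print(f"{growth:.0f}x denser after 7 years")
print(f"1U then holds what about {racks_of_today_per_future_u:.1f} racks hold today")
```

Seven doublings is 128x, so "100 times less space" and "2 full racks in 1U" are both on the conservative side of the premise.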
Re: Paper IS covered by ITAR
Not just paper but fabric as well.
and human skin
of course once you successfully export a copy and convert it to electronic form you are ok, then you can link to copies outside the US like this.
so yes ITAR can be a real pain
These aren't stacked ICs, they are single ICs fabricated to have 3D structure, and I am quite sure they know all about the heat and any other like issue.
Re: Just a question, but
Hey at least they didn't title the story
IBM POURS $1 THOUSAND MEELLION INTO FLASH SSDS
I think someone meant to do that.
Chris, I'll make you a bet, the packets weren't really "between 99,971 and 99,985 bytes long", they just had header fields saying they were, they sort of say as much when they say no packet should have matched the rule because no packets were actually that long, and that range of lengths was picked because the attacker knew a rule blocking them would crash the routers badly.
It's fascinating watching people make pronouncements about what technology will look like in 10 years. Saying flash *will* be dead in the next 5 to 10 years but that spinning rust will hang around for longer is not really a statement that holds up.
Sure, 2D floating gate NAND flash will be dead in a few years, replaced by charge trap NAND, and/or 3D NAND, but it will still look like flash as far as you can tell. At some point 3D NAND gets replaced by ReRAM or some other technology, but it will still look like flash as far as you can tell. At some point ReRAM or whatnot gets replaced by yet another technology that hasn't even been chosen yet, because it's so far in the future, but it will probably still look like flash as far as you can tell.
Now as far as spinning rust goes, sure it will be around but it's not going to look like the disk drive you are used to. HDDs are running into their own scaling issues, and once they go to shingled magnetic recording (SMR) they turn into something like a cross between flash and a file system. This is not the disk drive you are used to. If anything is going to be *dead* in the data center it's the hard drive: too slow, too unreliable, too much power consumption.
As far as flash pricing goes, when buying tier-1 storage you are buying performance not GB; if $/GB was all that mattered no one would buy anything but tape, or maybe slow SATA drives. The reason people buy 15K SAS drives is because they understand they are buying performance not bulk storage. And that is why people buy flash. Even for what some might call tier-2 storage used for VDI and other apps, people buy flash because disks are just too slow.
Enterprise storage is so much more than $/GB, disk storage could be free, but that doesn't matter if it takes more space than you have, more power than you have, more cooling than you have, can't provide performance for the applications that you have, or more importantly the applications you wish you had if only your storage was fast enough to run them.
As far as tiering goes (no I wasn't going to forget about tiering), while I could dispute the cost differentials of storage *systems* as opposed to just looking at component prices, I will just observe that I think saying it will never happen that applications will manage their data is pretty clearly not going to be the case, toss in app aware/guided file systems and/or hypervisors and I think that would definitely not be a safe prediction.
And I'm sure such tiering systems will come in handy for moving your data between your performance SLC storage, your bulk MLC storage, between your bulk MLC storage and your TLC archive/backup storage. Standard mechanisms will probably suffice to move your data to your long term SATA archive storage.
Don't believe that tiering will wind up being between memory types rather than between memory and disk? That's ok, people scoffed at the idea of disk to disk instead of disk to tape backups too.
Moon Base Alpha come back in time through a worm hole in space.........
Rebecca, the problem is that the information presented from Gartner just doesn't hold up, SEMI reports that something like $7B US is being spent this year on flash fabs and close to $9B US is planned to be spent next year. So clearly there is a lot of money being spent on flash fab production.
As to how long it takes if they start tomorrow. In April Toshiba said they planned to start construction of Phase 2 of Fab 5 and have it online and in production in mid 2013, they have since pushed that back because they don't need to bring on that much capacity yet. It seems that they believe that they can go from a cleared site to wafers rolling off the line in about 12 months. If some analyst says it takes 5 years to build a fab, and a company that actually builds and runs fabs and who has previously gotten one built and running in a year says it will take them a year from start to end to do the next one, who are you going to believe?
It's simple: the Gartner analyst is wrong. He says it takes 5 years, it has been shown it can be done in 1 year, thus there is no reason to believe he knows what he is talking about when it comes to the fab business.
Dear spectacularly refined chap,
No the 12 month build isn't blown out of the water, perhaps it takes 5 years to decide you need a fab, budget for it, design it, architect the structure, clear the site, get the permits, order the machinery, train the workers, build the facility, install the machinery, start production.
But it is clear that the "build the building to start production" part can be done in 1 year, they did it. You don't know if for instance they already have all the stepper machines and whatnot on order for Phase 2 of Fab 5. And since the second part of Fab 5 is a duplicate of the first half and the site is already cleared it does not seem like there is any activity that is externally visible that would prevent them from starting construction tomorrow and being ready to sell flash a year later.
Given they have talked about starting construction next year one assumes they must have things like steppers and the like lined up for potential delivery. In which case they will start building a duplicate of the building they just built and a year later it will be online making parts using equipment ordered 1 year before they started on the building.
It's important to distinguish between "it takes X years to build" and "it takes X years to plan for and build the first one and Y < X years to build a new one after the first one works".
An additional point I'd like to make about the supposed fab "run out": given that both Samsung and Toshiba are working on expanding capacity and between them are more than 1/2 the world's supply of flash, even leaving out the other suppliers it's hard to say the world fab capacity isn't going up.
Also Gartner has it wrong, it doesn't take 5 years to build a fab, Toshiba's Fab 5 (and this is all public information) was built and brought online in 1 year, and it is a modular design that can be doubled in size on the same site plan.
Also to clarify I said I think in 5 years the fabs will have decided which technology they are going to use as their post-post-NAND solution, it may be more than 5 years before they actually switch to it.
the drives might be affordable, if only you didn't need so many..
The point you are missing is that for a disk backup system to have as much throughput as a 3U flash array you are probably looking at anywhere from 1/2 to a whole rack of disks. So apart from the issue of the floor space consumed you will be buying so much more capacity than needed even if you buy small disks that I don't think it will turn out to be as cheap as you think it is.
Particularly since the idea is just to keep the backups you are likely to need to restore quickly, which would be last night, or the last time you did a full backup of your database, etc which shouldn't be that far back. So again totally over buying the capacity is going to make the disk system a lot less attractive.
Performance is a serious business but some people don't treat it as such.
Kaminario's Shachar Fienblit suggests that this is not a game and that I am mixing apples and oranges from multiple data points to produce a false conclusion, I have to disagree.
Kaminario is not just mixing different benchmarks they are mixing different products. Their SPC-1 result is not from the MLC K2 product (previously called the K2-F), but from the K2-D, an all DRAM product that is no longer listed as a product on their web site.
In the article, in their press releases and in their response in this thread, Kaminario switches back and forth quoting both benchmark results and saying they come from the "K2" without any hint that there are two totally different products involved. Even the link on the SPC-1 results page which is labeled "Kaminario K2-D (1875K-1.1)" leads instead to the K2 Flash SSD product page.
It is hard to see how such phrasing as they use coupled with dropping the -D and -F suffixes from the product names and linking to a different product than was submitted for the SPC-1 benchmark could do anything but lead people to draw false technical conclusions of the benchmark performance of their product(s).
Hopefully I have made my point about Kaminario's benchmark claims sufficiently clear at this point, which was the intent of my original response, and so I will not dwell on the rest of Fienblit's response which strays from the technical into the realm of marketing.
However with regard to Fienblit's last statement, which clearly falls into the realm of the technical.
"Can customers trust Violin in replacing a VIMM while their live system is running a mission critical application?"
The answer is an emphatic 'yes you can!', as you can see demonstrated here.
Founder & CTO
You want to play these games? Fine, let's play.
Kaminario's claims described in this article are so ludicrous and in some cases blatantly false it is hard to know where to begin, let's start with the simplest.
Take the word "system". Kaminario claims their product is "scale out", not "scale up". In that case, when you offer a small product, that is a single system, and when you offer a full rack product, that is many systems, plural; if it is one system, then you are scale up, not scale out. So Kaminario's results are not for a single system; all the numbers they boast about are for a multi-system configuration. We will get into this and see that all their claims are backward: they trail rather than lead in all the configurations they have chosen to discuss.
Then there is the issue of latency..
> The gloating Kaminario also said the system's latency was less than 1 millisecond,
> claiming this was "four times better than the closest competitor".
They don't name the supposed "closest competitor", but I will volunteer myself in that role given that they make plenty of comparisons to our product later on. Now they don't say how much lower than 1ms their latency is, but if we look at the SPC-1 results we see the K-2 at 95% load with 3.7ms read latency, the only results less than 1ms are for the 10% Load Level Test Run where their latency is 420us. A Violin array under similar load percentages would have latencies of around 800us and 100us respectively and wouldn't have any of the >30ms latency spikes the K-2 displays. So it appears Kaminario meant to say they were "four times worse than their leading competitor".
It is worth noting that the K-2 SPC-1 benchmark configuration was an entire rack at $490,000 with a whopping 1.159TB of storage, that's a "." not a "," so that comes out to ~$420/GB by my calculation. I'm not going to discuss exactly how many times worse than their leading competitor this is, let's just say it's a lot closer to 40X than 4X.
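The $/GB figure is a one-line division, sketched here from the $490,000 price and 1.159TB capacity quoted above. The only assumption is decimal units (1TB = 1000GB).

```python
def dollars_per_gb(price_usd: float, capacity_tb: float) -> float:
    """Cost per gigabyte, assuming decimal units (1 TB = 1000 GB)."""
    return price_usd / (capacity_tb * 1000.0)


PRICE_USD = 490_000.0   # quoted price of the benchmarked rack
CAPACITY_TB = 1.159     # quoted capacity; note the decimal point

cost = dollars_per_gb(PRICE_USD, CAPACITY_TB)
print(f"${cost:,.0f}/GB")
```

Dividing the quoted price by the quoted capacity gives about $423/GB.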
>"This benchmark reinforces the importance of a … scale-out architecture compared to the
> proprietary scale-up systems of TMS and Violin. [It] allows our customers to buy one K2
> and grow it as their needs grow. The equivalent benchmark by Violin required two SLC
> (faster single level cell flash) systems; we scaled to 2M IOPS and 20 GB/sec throughput
> with one MLC system," the firm responded.
Again, if you are a scale-out architecture then your rack is a collection of systems, it's hard to say how many systems one should view the K-2 rack as being, 3?, 15?, 33?, 45? but a single system it is not. But that aside let's consider a rack's worth of 12 V6616's, and remembering that their "benchmark" is pure random read, so the rack of SLC 6616's gives 15M IOPs, 60GB/s throughput from 120TB of capacity. Or 7.5X the IOPs, 3X the throughput and 2X the capacity. If we consider a rack of 12 MLC 6632's we would see 12M IOPs, 48GB/s of throughput and 240TB of capacity or 6X the IOPs, 2.4X the throughput and 4X the capacity.
So all this benchmark shows is that in all possible ways the performance and capacity of the K-2 lags far behind the products of their leading competitor.
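For the record, the rack-versus-rack multiples can be reproduced from the numbers quoted in this thread: 2M IOPs and 20GB/s for the K2 rack, against 15M IOPs and 60GB/s (SLC) or 12M IOPs and 48GB/s (MLC) for a rack of 12 arrays. The dictionary layout below is just an illustrative way to hold them.

```python
# Rack-level figures as quoted in the thread (IOPS, throughput in GB/s).
K2_RACK = {"iops": 2e6, "gbps": 20.0}
VIOLIN_RACKS = {
    "12 x SLC 6616": {"iops": 15e6, "gbps": 60.0},
    "12 x MLC 6632": {"iops": 12e6, "gbps": 48.0},
}


def multiples(candidate: dict, baseline: dict) -> dict:
    """How many times the baseline each metric of the candidate is."""
    return {metric: candidate[metric] / baseline[metric] for metric in baseline}


results = {name: multiples(rack, K2_RACK) for name, rack in VIOLIN_RACKS.items()}

for name, m in results.items():
    print(f"{name}: {m['iops']:.1f}x IOPS, {m['gbps']:.1f}x throughput")
```

Capacity is left out because the K2 rack's capacity for that configuration is not stated in the quoted response.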
> If you are going to Oracle OpenWorld over the next few days (it starts today and runs
> until 4 October), you can see the K2 in action, presumably spitting through complex
> Oracle database applications as if they were trivial spreadsheets.
If you are going to OOW over the next few days certainly stop in and see a collection of K-2 systems in action giving 2M IOPs from a 40U rack, which is only 7X worse than their leading competitor, then walk 10 feet across the aisle to said competitor and see 2 V6616s giving 2M IOPs from 6U. And read about how that configuration produced the most recent World Record TPC-C benchmark on Oracle, which we will assume is like spitting through complex Oracle database applications as if they were trivial middle school algebra homework.
Founder & CTO
Violin Memory Inc.
Actually the shredder isn't a sure thing
Read a very interesting paper about shredding flash chips, most shredders don't actually make small enough pieces to be sure of destroying the flash die on a PCB.
If you really want to destroy the data on an SSD several hours in an oven will do the trick (make sure it's not too high, don't want to set it on fire), after a while they will be good and scrambled.
The problem with the result isn't that DRAM systems shouldn't be allowed to compete with HDD systems, and at $423/GB I wouldn't really say this system was competing with HDD systems, the problem is that SPC clearly needs some sort of minimum capacity to OP count requirement similar to TPC. Without that, in the limit all you need to do is take a really powerful server with a few GB of DRAM, add a UPS and you could post up totally insane $/OP numbers, but they would be as meaningless as these Kaminario numbers are.
Re: Isn't this normal?
"Honestly I think that was overdoing it a bit, but apparently someone on the board of directors had a bug up his ass about security, so..."
Or he was a big Douglas Adams fan.
From their web site... in Radoslav's bio
"Radoslav was founder and CTO of SandForce Inc., the first and *currently* only company that enabled MLC flash memory in enterprise storage"
Apparently Michael had it right above, you can have full marks for honesty, or you can have the investment....
one more thing
I should add that TLC and MLC don't "become" SLC, they just can be used in an SLC-like manner, but MLC as SLC isn't as good as actual SLC, and actually using TLC as SLC would be a waste of money since the TLC chips are rather more complicated than SLC chips. So none of this eliminates the market for three kinds of chips. We use the "SLC mode" to play games and tweak things, but if you actually want real SLC performance and endurance then you should buy SLC parts.
re: MLC can be used as SLC fairly easily
That's not how it's done, what you are describing is not MLC used as SLC, what you described is just MLC used at half density.
There is no alchemy involved here at all, and there is nothing secret about it (well nothing secret within the industry), in fact as far as I can tell it is part of the way most vendors operate TLC flash, it wouldn't surprise me if many do it for MLC as well. But like I said it's not a secret, how to do it is right there in the data sheets.
Re: What if?
>MS SQL Server does not scale well on TPC-C - You wont see one from them.
Actually 3 of the 10 most recent TPC-C submissions were with MS SQL.
I know because the most recent one used my boxes.
The "principle" of Hadoop is NOT that it runs on low cost commodity hardware, your statement is a classic example of ex post facto justification.
The "principle" of Hadoop is to split up a problem too big for one machine to handle into a problem multiple machines can handle. The number and type of machines is supposed to be chosen to most cost effectively solve the problem.
The servers you would build a Hadoop cluster with today would not be the same servers you would have built it from 4 years ago. Why? Because technology changes.
So yes, it used to be that a wall of cheap servers with local disks was the best way to do that, but technology changes and it isn't the best way any more.
You can accept that this is true, or at least be willing to accept that it might be true and go have a look, or you can put your foot down and stand on "principle", it's your choice.
I'm going to guess you will stand on "principle", that's the usual Luddite reaction to this sort of thing.
Herby beat me to it
I can't understand how this could have possibly made it through a standards process given it is almost identical to a known attack that I still remember from my OS class lo these many decades ago.
why there is always someone compelled to make some comment about flash wear out in any article about enterprise flash arrays.
For large MLC arrays even if you wrote at full speed 24/7/365 for the life of the product you couldn't wear them out. I mean do you actually know anyone who writes the entire content of their storage system every couple of hours every single day? No you don't know anyone who does that, and if you did, well then they can buy the SLC version and write to it for decades.
I can't tell if the people who post "oooooo but it will wear out" in pretty much every flash related article are just ignorant or are shills for legacy disk array vendors, either way, it got old a long time ago.
Flash write life
The flash trolls are getting predictable, and boring.
We are not talking about a single flash chip in your embedded system, these arrays have 1000's of die, and 1,000,000's of blocks, and are not bothered in the slightest by a few of them failing. No one rewrites a 40TB array once an hour, which is the sort of load you would need to be talking about to even approach actually wearing out a flash array.
If you actually wanted to know the answers to your questions you would be out reading interesting conference papers instead of posting here. So why don't you take your FUD and move along.
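A back-of-the-envelope sketch of the wear-out arithmetic, for anyone who wants to plug in their own numbers. The endurance figure of 10,000 program/erase cycles, the write amplification of 1.0 and the assumption of perfect wear leveling are illustrative assumptions only, not taken from any data sheet or product.

```python
def years_to_wear_out(pe_cycles: int,
                      full_rewrites_per_day: float,
                      write_amplification: float = 1.0) -> float:
    """Years until every block has used its rated program/erase cycles,
    assuming writes are spread perfectly evenly over the whole array."""
    cycles_used_per_day = full_rewrites_per_day * write_amplification
    return pe_cycles / cycles_used_per_day / 365.0


# Hypothetical endurance rating, chosen only for illustration.
ASSUMED_PE_CYCLES = 10_000

# Rewriting the entire array once an hour, around the clock.
hourly_rewrite_years = years_to_wear_out(ASSUMED_PE_CYCLES, full_rewrites_per_day=24)

# Rewriting the entire array once a day, already a very heavy load.
daily_rewrite_years = years_to_wear_out(ASSUMED_PE_CYCLES, full_rewrites_per_day=1)

print(f"once an hour: {hourly_rewrite_years:.1f} years")
print(f"once a day:   {daily_rewrite_years:.1f} years")
```

Under these assumptions it takes a full rewrite of the array every hour to bring the lifetime down to about a year; a full rewrite every day lasts decades. Capacity does not appear in the formula because lifetime depends on how often the whole array is rewritten, not on how big it is.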
experienced this just the other day
Rented a movie from the Android market, pinned it and downloaded it to my tablet, and then *poof* one day it was gone like it never was there. A movie I had watched that had the 24 hour period on it still showed up as having 20 days remaining, but the movie I didn't watch was gone. AND it was gone from my market purchase history too. Because the rights owner had pulled it from the market place without warning, it was taken away from me with no notice.
And here is the kicker, while they deleted even the record of my purchase from my market place account, I had to ask for a refund to get my money back.
So how many people's money have they kept who didn't notice this, or don't realize they have to ask for a refund, particularly given that it says they don't give refunds and you have to go a few levels deep before you come to the place where you can find a button for a refund?
The movie was The Adjustment Bureau, which looked interesting, but now that they have said they don't want my money I'm sure as heck not going to buy it when they put it back on the market place.