Fusion-io and Princeton University boffins have used a PCIe flash drive to virtually extend a server's main memory into the terabytes. Engineers at the storage company collaborated with the computer scientists to design the Extended Memory subsystem using Fusion's ioMemory hardware. Applications built with Fusion's software …
Might get a student to try something like that with the 320GB Fusion-io cards in two of our compute servers.
“The ability to optimise key operating system subsystems for flash with tools such as Extended Memory simplifies performance for developers in ways that were out of reach just a couple of years ago.”
Really? Most OSes of the past 30 years have had swap-to-disk built in. What exactly is there to simplify? It is already completely transparent. From what is described, this merely sounds like making big news out of having a 2.6TB PCIe-connected SSD to host your swap file on. What is the big news here?
The performance is very different because we reorganize writes in a way that swap doesn't (and in some cases, can't).
Details at http://ssdalloc.cs.princeton.edu/technology.html
Is there any hardware vendor out there that offers this as a chipset? The concept of using memory behind PCI/PCIe isn't all that unique. I thought that's how device drivers talked to memory on PCI/PCIe cards for configuration and setup. I might be wrong -- it's been years since I've done anything low level.
"The performance is very different because we reorganize writes in a way that swap doesn't (and in some cases, can't)."
A dumb swap disk can't, sure, but a decent SSD (with plenty of DRAM cache) can and does. Any sanely designed modern SSD (OK, that may narrow the field down to a precious few, but that's not the point) will do all the writes sequentially for performance reasons, unless there is a pathological situation preventing it, e.g. no spare unmapped blocks are available to do the writes sequentially -- highly unlikely on a TRIM-capable SSD with reasonably over-provisioned NAND.
Similar optimization can be applied in software, e.g. with Managed Flash (http://www.managedflash.com/index.htm), or at the file-system level, e.g. nilfs (http://www.nilfs.org/en/).
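To make the point above concrete, here is a toy sketch of the log-structured remapping a flash translation layer (FTL) does: every logical write, however random, lands at the head of an append-only log. All names and the structure are illustrative, not any vendor's actual firmware.

```python
# Hypothetical sketch of a log-structured FTL: random logical writes
# become sequential physical writes. Purely illustrative.

class LogStructuredFTL:
    """Maps logical page numbers to physical pages, always writing to
    the next free physical page, as an SSD with spare over-provisioned
    blocks would."""

    def __init__(self):
        self.mapping = {}          # logical page -> physical page
        self.next_physical = 0     # append pointer into the flash log

    def write(self, logical_page):
        # Every write, random or not, lands at the log head.
        physical = self.next_physical
        self.next_physical += 1
        self.mapping[logical_page] = physical  # old copy becomes garbage
        return physical

ftl = LogStructuredFTL()
# A "random" write pattern from the host...
physicals = [ftl.write(lp) for lp in [42, 7, 42, 1000, 3]]
# ...is laid out sequentially on the medium.
print(physicals)  # [0, 1, 2, 3, 4]
```

The superseded copy of logical page 42 is simply left behind as garbage for later collection, which is exactly why such drives need spare unmapped blocks to keep this up.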
I had a quick look through the paper and can't spot any like-for-like comparison of normal swap vs fake RAM using the same backing device (comparing high-end PCIe-connected NAND to consumer-grade SATA-connected NAND is not a reasonable comparison). Can you provide a like-for-like benchmark?
The currently published work was from 2011, and had benchmarks for what we could afford at the time, including the enterprise-grade SATA-connected NAND. It's being compared against swap, not against the consumer-grade stuff. All of those comparisons are like vs like. If you want it summarized, look at logical slide #21 in the slide deck. It shows the gains of the same device using our approach vs swap.
Is the patch for the Linux kernel available?
Hehe, Expanded Memory, please tell me that they had to edit CONFIG.SYS to set this up :)
Just another swap implementation: instead of swapping to disk (i.e. going through the disk subsystem), it swaps directly to another memory subsystem.
Faster than normal swap-to-disk? Any benchmarks?
Nice marketing blurb, though.
Legitimate question, I suppose, as I have no idea what kind of improvement bypassing "layers of software", as the author put it, can yield for computing at the 10TB database scale. How much overhead can a kernel's virtual memory manager have?
I don't think the OS's virtual memory management would be much slower. I'm quite interested in seeing benchmarks on identical systems using the same hardware: one set up with a large swap partition on that PCIe card, the other with their hocus-pocus setup bypassing the kernel. My guess is a marginal speed increase of 0.42%.
Hold up guys, AC's guessed the secret. Fusion must be crazy-mad now!
This is not "swap on SSD on PCIe"; it's way more than that. It consists of multiple layers of caching, some DRAM, some SSD, moving data intelligently between the layers. If your application suits it, the speed-ups are massively impressive. If it doesn't, they are just impressive.
Don't trust me, read the docs.
When a program tries to access an address in swapped-out memory, it generates a page fault. The OS then has to locate an unused page (or one that has not been used recently), write that out to disk, load the requested page from disk, and insert it into the page table.
Under this system, the page fault is eliminated. When the program tries to access a page, it is never paged out in the traditional sense: it is on the FusionMemory device, in DRAM, or on disk. When most of the data you need is in DRAM, or can be intelligently placed in DRAM by the FusionMemory device, you get massive speed-ups.
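The tiering described above can be sketched as a small LRU cache of DRAM pages sitting in front of flash: hot pages stay in the fast tier, cold pages get demoted. This is a toy model under my own assumptions, not Fusion-io's actual design; all names are made up.

```python
from collections import OrderedDict

# Toy sketch of DRAM-in-front-of-flash tiering with LRU promotion;
# illustrative only, not any vendor's actual implementation.

class TieredMemory:
    def __init__(self, dram_pages):
        self.dram_pages = dram_pages
        self.dram = OrderedDict()   # page -> data, kept in LRU order
        self.flash = {}             # slow backing tier
        self.dram_hits = 0
        self.flash_reads = 0

    def access(self, page):
        if page in self.dram:
            self.dram.move_to_end(page)     # mark as recently used (hot)
            self.dram_hits += 1
            return self.dram[page]
        # Miss: fetch from flash (slow tier) and promote into DRAM.
        self.flash_reads += 1
        data = self.flash.setdefault(page, b"\x00" * 4096)
        if len(self.dram) >= self.dram_pages:
            cold, cold_data = self.dram.popitem(last=False)
            self.flash[cold] = cold_data    # demote the coldest page
        self.dram[page] = data
        return data

mem = TieredMemory(dram_pages=2)
for page in [1, 2, 1, 1, 3, 1]:   # page 1 stays hot throughout
    mem.access(page)
print(mem.dram_hits, mem.flash_reads)  # 3 3
```

Note how the hot page (1) survives the eviction when page 3 arrives: the cold page (2) is demoted instead, which is the whole point of keeping an access-ordered cache in the fast tier.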
I'll let you know how it goes
Just about to buy a couple of these for our HPC implementation
GIVE ME MORE POWER!!!!
Until it craps out
This sounds like the ideal way to wear your flash out by doing massive numbers of writes to it in a short time.
Re: Until it craps out
The approach actually generates _less_ random write traffic than using it as swap. It also uses DRAM for hot pages, so it's not that every application write is being written back to the NAND flash.
The details are in the paper - http://ssdalloc.cs.princeton.edu/technology.html
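One way the claim above can hold is write absorption: dirty pages accumulate in DRAM, repeated writes to a hot page coalesce into one, and the flash only sees sorted batches. A minimal sketch, with thresholds and names entirely made up for illustration:

```python
# Illustrative sketch of write absorption: dirty pages coalesce in a
# DRAM buffer, so the flash sees fewer, batched, sorted writes than
# the application issued. Not the paper's actual algorithm.

class WriteBuffer:
    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.dirty = {}             # page -> latest data (coalesced)
        self.flash_writes = []      # what actually hits the NAND

    def write(self, page, data):
        self.dirty[page] = data     # overwrite in DRAM; no flash I/O yet
        if len(self.dirty) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One sorted batch: near-sequential from the flash's viewpoint.
        for page in sorted(self.dirty):
            self.flash_writes.append(page)
        self.dirty.clear()

buf = WriteBuffer(flush_threshold=3)
for page in [9, 9, 9, 9, 2, 9, 5]:  # 7 application writes, page 9 is hot
    buf.write(page, b"x")
buf.flush()
print(len(buf.flash_writes), buf.flash_writes)  # 3 [2, 5, 9]
```

Seven application writes become three flash writes because the four rewrites of the hot page never left DRAM, which is also why this pattern wears the flash less than naive swapping would.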
GG FusionIO, straight to bankruptcy
Fusion-io is putting itself out of business, product by product, news item by news item.
Flash always was memory; every flash controller out there uses it exactly as memory, and Fusion-io is striving to make everyone use flash without a controller.
The good news is, no one will need them when it's done: they don't bake flash themselves, and once the integration is complete, you'll be buying flash boards from the RAM makers themselves --