ZFS gets inline dedupe

Sun's Zettabyte File System (ZFS) now has built-in deduplication, making it probably the most space-efficient file system there is. There's a discussion of ZFS deduplication in a Sun blog, which says that chunks of data, such as byte ranges, blocks, or files, are checksummed with a hash function and any duplicate chunks will …
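
As a rough illustration of the mechanism described above (not the actual ZFS code), a deduplicating block store can be modelled as a table keyed by the hash of each block, with a reference count per entry; the class and block contents below are purely illustrative:

    # Minimal sketch of hash-keyed block dedup (illustrative, not the real ZFS DDT).
    import hashlib

    class DedupStore:
        def __init__(self):
            self.table = {}   # hash digest -> (block bytes, reference count)

        def write(self, block: bytes) -> str:
            key = hashlib.sha256(block).hexdigest()
            if key in self.table:
                data, refs = self.table[key]
                self.table[key] = (data, refs + 1)   # duplicate: bump the refcount only
            else:
                self.table[key] = (block, 1)         # new data: store it once
            return key                               # the caller keeps this reference

        def read(self, key: str) -> bytes:
            return self.table[key][0]

    store = DedupStore()
    a = store.write(b"same 128K block")
    b = store.write(b"same 128K block")
    assert a == b                      # identical content -> one stored copy
    print(len(store.table), "unique block(s) stored for 2 writes")

For those wary of trusting the hash alone, ZFS also offers a verify setting that does a byte-for-byte comparison whenever two blocks hash to the same value.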

COMMENTS

This topic is closed for new posts.
  1. Ian Rogers
    Thumb Up

    Lustre

    I'm waiting for replication and de-duplication to come to Lustre, then everyone else can shut up shop and go home...

  2. Anonymous Coward
    FAIL

    Sweet mother of Buddha

    Isn't the whole freakin' *PURPOSE* of backups to duplicate data? So in case the "original" data gets deleted, destroyed, overwritten, etc, *YOU HAVE ANOTHER COPY?!?!!*

    So because my files in the "Monday", "Tuesday", and "Wednesday" folders are all identical, the firmware stores *one* pattern of bytes and points to that the rest of the time?

    What happens when that "master" chunk of data gets borked? Do you see exactly the same "borkiness" every time another location on the disk points at the master chunk? What if there were *ONE* location on the disk that held the header information for an EXE file, and that *ONE* location somehow got FUBAR. Would every EXE file whose header information was linked to that *ONE* location stop working?

    On the scale of "Good Ideas", I rank this one up there with the steam powered underwater unicycle with built-in shark attractant dispenser.

  3. Anonymous Coward
    Go

    Hot Topic

    Dedup is a hot topic, as seen with EMC's recent DataDomain acquisition (US$2.1 billion).

    Thanks to OpenSolaris, we (the customers) can get the same technology for free now, or if we need support, for a much lower price than from EMC.

    Whose dedup technology will you be using?

  4. raving angry loony

    assumes unique hash = unique data

    Let's hope the hash they use accurately reflects the uniqueness of the content. I'd hate to be the one who has to debug a program where part of the binary was replaced by a block that is slightly different but generates the same hash.

    Admittedly with SHA256 it's improbable, but with a really hot cup of tea one never knows.

  5. Bob H
    Linux

    ORLY?

    I've been watching ZFS for a while now and I really hope it can hit the mainstream for at least servers and perhaps even embedded devices with large storage. In fact the combination of this and VirtualBox might be just the thing I need to switch from Linux to Solaris on my office server.

    Let's hope someone figures out a practical way to get this FS onto Linux.

  6. Anonymous Coward
    Grenade

    It's a goddamn waste...

    Apple really should have worked out the kinks in the licensing agreement with Sun over ZFS. Yes, the benefits for desktop users at the moment are negligible, but the way Apple is moving towards multi-core systems, a proper future-proof FS should be a given. One can only hope El Jobso has a few engineers locked up in the basement, working feverishly round the clock on nothing but a saline and LSD drip and a will to think different, different, different, different. Maybe then we can finally get rid of HFS+, which is now getting so long in the tooth that vampires are getting positively envious.

  7. Napoleon
    Thumb Up

    @AC sweet mother of buddha

    Dedup does not mean that you will have ONLY one copy of your data at the storage level; it means only one copy at the file level, which is quite different. If you have redundancy at the storage level (mirror, raidz[123]) you will of course still have redundant copies of the data.

    If you have a mirror configuration (without dedup), storing 2 identical files will lead to 4 copies of each block (2 for each file, doubled by the mirror). If dedup is "on" you will still have 2 copies of each block (due to the mirror), but the files will share the identical mirrored blocks (a rough tally follows below).

    So yes, it makes sense, but you have to understand how it works.
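
    A back-of-the-envelope tally of that accounting, with purely illustrative numbers:

        # Rough block accounting for 2 identical files on a 2-way mirror (illustrative).
        files = 2             # two identical files
        blocks_per_file = 1   # pretend each file is a single block
        mirror_ways = 2       # a 2-way mirror writes every physical block twice

        without_dedup = files * blocks_per_file * mirror_ways   # 4 physical copies
        with_dedup = 1 * blocks_per_file * mirror_ways          # one shared block, still mirrored: 2 copies

        print("physical copies without dedup:", without_dedup)  # 4
        print("physical copies with dedup:   ", with_dedup)     # 2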

  8. Anonymous Coward

    Re: Sweet mother of Buddha

    No, it doesn't work like that. You don't eliminate all the duplicates so that loss of a single block destroys everything. It would be really nice, of course, if the file system knew about those duplicates so that it could use them as backups in case a block on the disk goes bad.

    And that's _exactly_ what dedup does in this case. ZFS duplicates (or can duplicate) data on different physical disks in case a disk or just a single block goes bad. The downside of that is that you're losing available disk space. With dedup, you get to claw some of that disk space back.

    For virtual machine images that may have several gigabytes of common blocks, cutting the unmanaged duplicates down to managed duplicates is a huge saving. Not only that, but an underlying block cache in the host only needs to cache a shared block once for both virtual machines ...

    It's no wonder that dedup is such a hot topic at the moment.

  9. Anonymous Coward

    It's not the fall, it's the sudden stop at the end

    I understand that de-duping userspaces and other things would save a goodly amount of space. And that the underlying RAID would protect the system in case one,or several, hard drives crapped out.

    But the original article specifically mentioned *backups*. And that just made me shudder.

    Our most critical data is stored on double-redundant raid arrays, where it's accessed on a regular basis. We keep a week's worth of dumps in 3 separate physical locations on RAID protected hardware, and have 2 identical servers (also raid protected), that are ready to go live with the activation of a NIC. *IF* (bog forbid) the current app server burst into flames, I could have a backup server up and running in less than a minute. Also, it gets run to 2 tapes, one stored on-site in a fireproof-bombproof-armageddon-proof safe, the other goes off-site.

    And that's just the super-mission-critical data. All the rest of the "stuff" is backed up in triplicate on different dedicated "backup-storage" servers and spooled to tape daily. Restoring a user's junk is one Rsync command line away.

    Storage is cheap. Even our hideously expensive Texas Memory Systems solid-state NAS boxes are cheap compared to downtime, lost productivity, and the ultimate expense and chaos of re-creating lost work.

    Other than using it on backups (which are sacred), I have no problem with deduping. It's a great idea, and I can see lots of potential areas to implement it.

    But not on backups. That way lies madness.

  10. MartinLee
    Stop

    @Sweet mother of Buddha

    Calm down.

    If you point to the same block pattern in 3 locations and edit it in one of those locations, it'll dissociate that location and write the new pattern to disk.
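
    Roughly speaking (an illustrative model, not ZFS's actual on-disk logic), editing a shared block writes a new block for the editor and leaves the other references pointing at the old one:

        # Toy copy-on-write over a deduped block table (illustrative, not real ZFS code).
        import hashlib

        blocks = {}   # digest -> block bytes (shared storage, each block kept once)
        files = {}    # filename -> digest (each "file" is a single block reference here)

        def write(name: str, data: bytes) -> None:
            digest = hashlib.sha256(data).hexdigest()
            blocks.setdefault(digest, data)   # stored once, however many files use it
            files[name] = digest

        # Monday, Tuesday and Wednesday all point at the same stored block.
        for day in ("monday", "tuesday", "wednesday"):
            write(day, b"identical report")

        # Editing Tuesday's copy dissociates it: a new block is written, while the
        # other two files still reference the untouched original.
        write("tuesday", b"edited report")

        assert files["monday"] == files["wednesday"] != files["tuesday"]
        print(len(blocks), "stored blocks for 3 files")   # 2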

  11. Anonymous Coward
    Boffin

    Re: It's not the fall, it's the sudden stop..

    OK, here goes at picking at your argument....

    You trust dedupe for everything except backup, implying you're OK with it for live service but not for the backup copy? Yet you trust the tape that you might go back to, even though there is only really limited CRC protection there and nothing like SHA256?

    From the blog site quoted:

    "When using a secure hash like SHA256, the probability of a hash collision is about 2^-256 = 10^-77 or, in more familiar notation, 0.00000000000000000000000000000000000000000000000000000000000000000000000000001.

    For reference, this is 50 orders of magnitude less likely than an undetected, uncorrected ECC memory error on the most reliable hardware you can buy. "
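
    For the curious, the quoted figure is the chance that two given blocks collide; scaling it up to a whole pool (birthday-style, over all pairs) can be sketched in a few lines, where the pool size is an assumed, illustrative figure:

        # Back-of-the-envelope collision odds for SHA256 dedup (illustrative numbers).
        p_single = 2.0 ** -256              # chance that two specific blocks collide
        n = (10 ** 15) // (128 * 1024)      # ~7.6e9 records in a petabyte of 128K blocks
        expected = n * (n - 1) / 2 * p_single   # birthday bound over all pairs, roughly

        print(f"pairwise collision probability:  {p_single:.3e}")   # ~8.6e-78
        print(f"expected collisions in the pool: {expected:.3e}")   # ~2.5e-58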

  12. Ken Hagan Gold badge

    Is this really a win?

    So you've taken a contiguous sequence of bytes (nice, for modern discs) and replaced it with several disjoint sequences (yuck, since modern discs are essentially serial devices). All in the name of saving a few bytes, when hard disc space has never been cheaper, and at the cost of some CPU and (I presume) some cache thrashing whilst the system figures it out, when the memory wall is the main throttle on performance on most systems these days.

    Clever, but is it really useful, and is the cost in reliability (for assuredly this is a more complicated file system than one without the feature, and complexity breeds bugs) worth it?

  13. David Halko
    Thumb Up

    @AC: ZFS - Sweet mother of Buddha

    Anonymous Coward, in the post at 21:52 GMT, wrote: "Isn't the whole freakin' *PURPOSE* of backups to duplicate data? So in case the "original" data gets deleted, destroyed, overwritten, etc, *YOU HAVE ANOTHER COPY?!?!!*"

    Robustness and Data Integrity are at the core of ZFS!

    - data CRC checking

    - silent data corruption is corrected

    - user selectable RAID1, RAID5, RAIDZ, RAIDZ2 redundancy

    - user selectable [virtually] unlimited snapshots, to take as many historical backups as you want

    All of these mechanisms take care of original data which may get: deleted, destroyed, overwritten, etc.

    Now all of this is possible while speeding up virtualization for dozens, hundreds, or thousands of virtual machines off a very small storage system, for any operating system, using a relatively small quantity of memory, thanks to dedup in ZFS... leave the user data on an external ZFS server, and with dedup, hundreds of disk images can reside on a very small piece of local or remote storage.

    With compression, [virtually] unlimited snapshots, double parity, no RAID5 write hole, [virtually] unlimited volume size, native iSCSI support, native CIFS support, flash read acceleration, flash write acceleration, dedup nearly here, and Lustre clustering in 2010... absolutely nothing production-quality in the industry is lining up to seriously compare with ZFS.

    Anyone running virtualization under any other file system other than ZFS is really at a disadvantage...
