Boffins promise file system that will NEVER lose data

Six MIT research boffins have demonstrated a system capable of recovering all data in the event of a crash that was previously constrained to high-end theory. The team will October showcase the first albeit slow file system "mathematically guaranteed" to not lose data during crashes. Authors Haogang Chen; Daniel Ziegler; Tej …

  1. Mark 85

    Never lose data?

    I've learned to never say never around the IT world as Murphy is just a heartbeat away.

    1. Anonymous Coward
      Terminator

      Re: Never lose data?

      Depends if this is a mathematically complete proof.

      It's easy to theorise and prove a perfect plan. It's impossible to action and carry it out! :D

      1. url

        Re: Never lose data?

        “In theory, theory and practice are the same. In practice, they are not.”

    2. Tomato42

      Re: Never lose data?

      somehow I doubt they have taken into account disks which lie about data being committed to disk

    3. Trigonoceps occipitalis

      Re: Never lose data?

      "Murphy is just a heartbeat away."

      Isn't that a song by a boy band?

      1. Mark 85
        Devil

        Re: Never lose data?

        I hope not.. or at least until I can copyright it.

      2. MrT

        Boy band?

        Not strictly "boy", but "Out in the Fields" could be rewritten to fit...

        "Stored in the fields,

        the data's just begun.

        Out on the disks,

        Records build one by one.

        Wait! Back it up!

        A thousand files could die each day.

        Murphy's just a heartbeat away."

        With apologies to Gary Moore and Phil Lynott, obviously

        1. Citizens untied

          Re: Boy band?

          Try Leo Sayer's When I need You:

          When I need you

          I just close my eyes and I'm with you

          And all that I so wanna give you

          It's only a heartbeat away

  2. AndrueC Silver badge
    Meh

    No file system can guarantee to protect you against hardware failure though. Always take frequent backups and regularly test them by restoring to a blank system.

  3. Paul Crawford Silver badge

    Will be interesting to see what it turns out to be.

    With a lot of "mathematically proven" systems you end up moving the problems/bugs from the implementation process to the initial specifications, which are often not 100% complete nor correct for anything of reasonable complexity.

    1. Tomato42

      or they just end up so woefully inefficient that they are completely unusable in any production environment

      it's not like the hardware the software is running on is 100% reliable, having software at four nines and hardware at four nines is more often than not "good enough"

    2. Roo
      Windows

      "With a lot of "mathematically proven" systems you end up moving the problems/bugs from the implementation process to the initial specifications, which are often not 100% complete nor correct for anything of reasonable complexity."

      I think "moving" is a bit misleading here. Providing the proof is correct (!!) the code will fully comply with the spec, therefore all the remaining bugs will be in the spec. ;)

      Having said that I don't think I've seen a bug-free spec, formal or otherwise. Formal specs do have an advantage in that you can prove that stuff complying with the spec will have particular properties though. While this kind of work may be viewed as esoteric or irrelevant, it should yield a useful model for other folks to compare/apply to real world file systems.
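The incomplete-spec trap Roo describes is easy to demonstrate. Here's a toy Python illustration (not from the MIT paper): a "spec" for sorting that only checks ordering, so an implementation that throws everything away still provably satisfies it.

```python
# A toy "spec" for sorting: output must be in ascending order.
# The spec forgets to require that the output is a permutation of
# the input -- so a provably "correct" implementation can lose data.

def spec_holds(inp, out):
    """Incomplete spec: checks ordering only, not content."""
    return all(a <= b for a, b in zip(out, out[1:]))

def bogus_sort(xs):
    """Satisfies the incomplete spec trivially -- by losing everything."""
    return []

data = [3, 1, 2]
assert spec_holds(data, bogus_sort(data))   # passes the spec; data gone
assert spec_holds(data, sorted(data))       # the real thing also passes
```

Both implementations "comply with the spec", which is exactly why the remaining bugs live in the spec, not the code.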

  4. Infernoz Bronze badge
    Facepalm

    (Open) ZFS is pretty damned good already

    ZFS is paranoid about hardware and connections, and can have multiple layers of redundancy and storage; so why the need for yet another file system?

    No system can avoid losing data if bad enough stuff happens locally, which is why sensible enterprise solutions also include replication of data between multiple data centres.

    1. Gordan
      Happy

      Re: (Open) ZFS is pretty damned good already

      You beat me to it, I was just about to say "Holy crap, they reinvented ZFS!"

      1. razorfishsl

        Re: (Open) ZFS is pretty damned good already

        ZFS is NOT safe......

        You only have to see the forums and the bug list to understand that it can lose data.

  5. Anonymous Coward
    Anonymous Coward

    Dr. Who and the Giant Robot ...

    Sarah-Jane: "... they say it's unbreakable"

    Dr. "hmmm, I don't like the word unbreakable. Too much like 'unsinkable'."

    Sarah-Jane: "What's wrong with 'unsinkable' ?"

    Dr. "Oh, Titanic, you know. Glug-glug."

  6. Horridbloke

    Sounds better than my last effort..

    A while back I was tasked with implementing a block-level data storage solution intended for enterprise use (though not actually a filing system). The brief handed to me was a hand-scrawled description of several data structures with arrows pointing to them. Data integrity was addressed by the sentence at the very end: "+ data protection features".

    I was very glad to leave that job.

  7. volsano

    One Computer Scientist, he say:

    "Beware of bugs in the above code; I have only proved it correct, not tried it."

    --Donald Knuth

  8. Tom_

    Could work

    Maybe it's an empty, read only file system.

    1. AndrueC Silver badge
      Joke

      Re: Could work

      I once came up with a fantastic compression algorithm. Ratios of 100:1 on random data.

      Sadly the decompression algorithm never worked.

    2. Tom 7

      Re: Could work

      mv file /dev/null

      Those were the days...

    3. This post has been deleted by its author

      1. DJV Silver badge
        Pint

        @Symon

        Hah - haven't seen that for years - cheers!

  9. bpfh
    Angel

    Personal experience...

    I have not had a system crash corrupt "normal" data in years (there were a handful of occasions where we were using McAfee disk encryption on laptops and the encryption got lost... but a full-disk decrypt allowed us to run a checkdisk, and not only was the data recovered but the system booted perfectly).

    What I have had, though, is bloody new/recent disks dying: from power outages, from a slight knock well inside their G-shock tolerances (running or stopped), or just going into the corner and never coming out of its eternal sulk...

    I guess this FS does not address that problem ;) But of course, everyone has backups, don't they? Don't they? Hello? Hello? Echo? Echo echo...

  10. Doctor Syntax Silver badge

    Do they have a name for this? If not let me offer Hubrisfs.

  11. nematoad
    Headmaster

    Sigh.

    "The team will October showcase the first albeit slow file system..."

    Oh, is an "October showcase" anything like a Welsh Dresser, or is it just a clumsy phrase?

    Honestly, I know El Reg is a tech site but please try and use the English language a little more gracefully.

  12. Woza

    Research life cycle in action?

    Given that the statement seems to originate from a press office, I wonder if we're looking at the start of this effect:

    http://www.phdcomics.com/comics/archive.php?comicid=1174

  13. Al Brown

    BFS was pretty good in the 90's

    The Be file system BFS (https://en.wikipedia.org/wiki/Be_File_System) was pretty good in the mid-90s - very fast, excellent querying capabilities and the journalling capabilities meant that a common demo was pulling the plug on the machine in mid operation and then bringing it up again and demonstrating that no data had been lost.
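The pull-the-plug demo works because of write-ahead journaling: the intent is made durable before the data is touched, so a replay after restart finishes any interrupted update. A toy Python sketch of the idea (a minimal illustration, not BFS's actual on-disk format):

```python
# Toy write-ahead journal: log the intent durably, then apply it.
# After a crash, replaying the journal redoes any update that was
# cut off mid-way -- roughly what the pull-the-plug demo relies on.
import json, os

JOURNAL = "journal.log"

def journaled_update(store: dict, key, value):
    with open(JOURNAL, "a") as j:
        j.write(json.dumps({"key": key, "value": value}) + "\n")
        j.flush()
        os.fsync(j.fileno())   # intent hits stable storage first
    store[key] = value          # the in-place update may now crash safely

def recover(store: dict):
    """Replay the journal after a crash; entries are idempotent."""
    if not os.path.exists(JOURNAL):
        return
    with open(JOURNAL) as j:
        for line in j:
            rec = json.loads(line)
            store[rec["key"]] = rec["value"]

store = {}
journaled_update(store, "a", 1)
crashed = {}        # pretend the in-place write never made it
recover(crashed)
assert crashed == {"a": 1}
os.remove(JOURNAL)
```

Note this protects the ordering of updates, not their content, which is where the checksumming discussion below the fold comes in.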

    1. PlinkerTind

      Re: BFS was pretty good in the 90's

      Silent corruption is quite a serious problem too (besides crashing disks or power outages). Remember the old VHS cassettes or old Amiga disks? Today you cannot read any of them. Why? Bit rot. Data rots over time: cosmic radiation, etc. So you need checksums to detect flipped bits. Journaling won't catch silent corruption. Only ZFS does. Read the Wikipedia article on ZFS for some research on ZFS's superior data-corruption protection abilities (better than everything else).
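The checksum idea is simple to sketch: store a checksum alongside each block, recompute on read, and a single flipped bit shows up as a mismatch. A minimal Python illustration (CRC32 stands in here for the stronger hashes ZFS actually uses):

```python
import zlib

def store(block: bytes):
    """Keep a checksum alongside the data, ZFS-style."""
    return block, zlib.crc32(block)

def verify(block: bytes, checksum: int) -> bool:
    """Recompute on read; a mismatch means silent corruption."""
    return zlib.crc32(block) == checksum

data, crc = store(b"important records")
assert verify(data, crc)

# Simulate bit rot: flip a single bit in the stored block.
rotted = bytes([data[0] ^ 0x01]) + data[1:]
assert not verify(rotted, crc)   # the flipped bit is detected
```

Detection is only half of it, of course; ZFS can also repair the block from a redundant copy once the checksum flags it.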

      1. Dave 126

        Re: BFS was pretty good in the 90's

        >Journaling won't catch silent corruption. Only ZFS does.

        And Btrfs, though the vibe is that it isn't production-ready yet. I guess if 10 billion people used a file system for a year and nobody reported any faults, you wouldn't need a mathematical proof.

        1. Jan 0 Silver badge
          Headmaster

          Re: BFS was pretty good in the 90's

          First find 10 billion people who can accurately detect data corruption!

          Judging by the proof checking standards of El Reg commentards, there might be only 10 thousand people in the world who'd notice the data corruption.

        2. PlinkerTind

          Re: BFS was pretty good in the 90's

          Btrfs has no research confirming that it catches all types of silent data corruption. ZFS has been confirmed, in at least two separate research papers, to catch all types of silent data corruption. Read the Wikipedia article for links to the research groups that examined ZFS. Btw, Phoronix lost a Btrfs volume last month, and in the forum thread there are several similar stories. I wouldn't trust Btrfs, but it's your data.

  14. Alan Brown Silver badge

    Does it handle

    Losses of the type where the meatsack erases everything and then realises two weeks later that he shouldn't have?

    1. PlinkerTind

      Re: Does it handle

      With zfs snapshots you can always rollback in time if you find out you've accidentally deleted data months later.

  15. Graham Triggs

    Not losing data written to disk is relatively simple, providing you don't screw the hardware - you make sure all data is findable / retrievable before the data itself is committed.

    Software crashes are a bit harder - you need to ensure that you don't overwrite when writing new data, and that you can roll back to previous versions, or roll forward to recover from the incomplete write in a crash.

    Power outages would be much better handled with more resilient components - e.g. all DIMMs come with at least as much solid state as RAM, plus backup power to ensure the data is flushed to solid state during a failure; a disk has battery power to flush its cache; a CPU has its state written out. Then, when power is restored, each component just restores itself to the known state, and you are good to go.
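The "don't overwrite; make the data findable before it's committed" part of Graham's comment is what the classic write-to-temp-then-rename idiom delivers on POSIX. A minimal Python sketch (assuming POSIX rename and fsync semantics; a toy, not a full solution):

```python
import os, tempfile

def atomic_write(path: str, data: bytes):
    """Crash-safe update: never overwrite in place. Write a temp file,
    flush it to disk, then atomically rename it over the old version.
    After a crash the file holds either the old or the new contents,
    never a torn mix."""
    d = os.path.dirname(os.path.abspath(path)) or "."
    fd, tmp = tempfile.mkstemp(dir=d)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())         # force the data to stable storage
        os.replace(tmp, path)             # atomic rename on POSIX
        dirfd = os.open(d, os.O_RDONLY)   # persist the rename itself
        try:
            os.fsync(dirfd)
        finally:
            os.close(dirfd)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

atomic_write("settings.conf", b"old contents")
atomic_write("settings.conf", b"new contents")
with open("settings.conf", "rb") as f:
    assert f.read() == b"new contents"
os.remove("settings.conf")
```

As Tomato42 notes upthread, though, none of this helps if the disk lies about having committed the fsync'd data.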
