'Disappearing' data under ZFS on Linux sparks small swift tweak

Maintainers of ZFS on Linux have hustled out a new version after the previous release created the impression of data loss. ZFS on Linux 0.7.7 only landed on March 21st, but as this GitHub thread titled “Unlistable and disappearing files” shows, users experienced “Data loss when copying a directory with large-ish number of …

  1. Hans 1 Silver badge
    Windows

    Report something like this involving NTFS to MS Support, you would need three MONTHS of incessant battering to get the firstline to understand what use a filesystem has and why it is not supposed to lose files or metadata ... so, yeah ...

    I guess data loss is not really cool, so they removed the commit and are now investigating what exactly caused this, à la "How did that ever work?" ... heard that one too often ...

    Better, even, say they claim that commit caused the issue, no need for a new release .... git clone, cherry-pick, build, done.

    1. TonyJ Silver badge

      Let's face it - should've tested copying in different scenarios: large files, small files, mixes, mixes of quantities etc...this is what a file system is supposed to handle as its bread and butter.

      And jeez...let's slip in an obligatory MS bash despite it being nothing to do with them. Classy.

      1. Lee D Silver badge

        Yes, I'm more disappointed that this wasn't picked up by an automated test suite than the fact that it might take a few days to patch.

        Surely someone, somewhere is at least creating daily snapshots of the code and putting it through fuzzers and stress-tests and simulated disk-full situations? If not, why not? This is filesystem code we're talking about, not the backend of some casual game.

        The code is all well and good but testing should be taking up far more CPU-time globally than all the compilation and coding on it across the world put together.

        1. Anonymous Coward

          > testing should be taking up far more CPU-time globally than all the compilation and coding on it across the world put together.

          Only if it can mine bitcoin as a side-effect.

        2. Bronek Kozicki Silver badge

          The bug has nothing to do with the disk actually being full. It is related to the timing of transactions and the order in which file entries are created in a directory, which may lead to transient hash collisions. Collisions are allowed only up to a certain limit; if it is exceeded, we get an error poorly reported as "disk full". The problem is that the limit should not be there.

          Of course, ideally this should be covered by the existing suite of regression tests. However, because the issue only occurs if the transaction cache is not flushed frequently enough, it is timing dependent. And timing-dependent tests are extremely difficult to write. Writing them in such a way as to exclude false positives and false negatives is impossible unless you enter into white-box testing, which then becomes a nightmare to maintain.
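
          To make the "limit" point concrete, here is a toy sketch in C. It is emphatically not the ZFS ZAP code: the struct, the function name, the bucket count and the limit are all made up for illustration. It only shows how a per-bucket collision cap can produce an ENOSPC ("disk full") error while the rest of the table is empty.

          ```c
          #include <assert.h>
          #include <errno.h>
          #include <stddef.h>

          /* Hypothetical toy model, NOT the real ZFS data structures. */
          #define TOY_BUCKETS        4   /* assumed: tiny table so collisions are easy */
          #define TOY_MAX_COLLISIONS 2   /* assumed: per-bucket limit */

          struct toy_dir {
              int used[TOY_BUCKETS];     /* entries currently held in each bucket */
          };

          /*
           * Add an entry by its hash. Returns 0 on success, or ENOSPC when the
           * target bucket has reached its collision limit, even if every other
           * bucket is still empty.
           */
          int toy_dir_add(struct toy_dir *d, unsigned hash)
          {
              size_t b;

              assert(d != NULL);
              b = hash % TOY_BUCKETS;
              if (d->used[b] >= TOY_MAX_COLLISIONS)
                  return ENOSPC;         /* misleading: nothing is actually full */
              d->used[b]++;
              return 0;
          }
          ```

          With these made-up numbers, hashes 1, 5 and 9 all land in bucket 1, so the third add fails with ENOSPC while a fourth add with hash 2 still succeeds.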

          1. Woodnag

            However

            In FreeNAS if a ZFS pool is allowed to become full, the pool becomes unavailable for read or delete. ZFS writes a small file to the disk at the start of every transaction, and so if it can't, then the transaction is pooched. I hope the ZFS implementation on Linux has fixed that, and BSD implementations such as FreeNAS can follow suit.

        3. Dan 55 Silver badge

          > Yes, I'm more disappointed that this wasn't picked up by an automated test suite than the fact that it might take a few days to patch.

          > Surely someone, somewhere is at least creating daily snapshots of the code and putting it through fuzzers and stress-tests and simulated disk-full situations? If not, why not? This is filesystem code we're talking about, not the backend of some casual game.

          There was a .ksh in the commit specifically to test this feature, but it looks a bit spartan for what could be a huge data loss bug (as practically any filesystem bug is).

        4. John M. Drescher

          They have a battery of tests that each commit must pass. With that said it's frustrating that the tests failed to detect this.

        5. elip

          Lee, of course testing is happening, at the OpenSolaris derivative distros (two of which looked like they peer-reviewed the code [supposedly] before commit). A trend I've noticed with Linux-first-and-foremost devs is that they tend not to hold much value for testing, or portability, or security, or quality, etc. Yes, my brush is wide enough.

          1. asdf Silver badge

            @elip - Yep. Linux is great until your company decides to dump HP-UX for it to save money and you are responsible for production system up time (so far so good for me arguing but some others I work with not so lucky). Never seen HP-UX kernel panic ever. Had a Linux system crash this week (get your move fast and break things the fsck out of here). Vendor having control of both hardware and software tends to make systems expensive but you generally get what you pay for.

            1. Nate Amsden Silver badge

              Back in 2004 or 2005 the company i was at had a few hpux systems on itanium running oracle. At one point we had a network loop for a few minutes. Fixed it but the out of band management controllers on the itaniums all hung. I believe they had redundant management controllers. Unable to reset them HP support advised to just pull them out and re insert. They were hot swappable. Those itaniums were not my responsibility. Fortunately the people tested it on a non critical system and the box crashed immediately.

              That said several years later after I left the company's largest customer said they would pay to upgrade the infrastructure. But they had to get rid of the (then) red hat linux boxes running oracle and go to hpux or Solaris or something more solid. Not sure what the issues were but I recall my co workers who managed those systems were quite excited to be able to use linux with oracle (maybe RAC too I don't recall) before I left.

              Same company after i left built active active datacenters. But their app stack sucked. At one random time I emailed an ops guy there and he said something along the lines of they were 4 or 5 hours into an outage on their 4 or 5 nines configuration. I had a good laugh

              1. asdf Silver badge

                Linux has its place for sure and won't deny that. Also have seen problems with some of HP stack above the OS which is why I am not chomping at the bit to move onto an Itanium VM either. HP-UX on PA-RISC hardware though is beyond rock solid. Real pity Itanium came along at all tbh.

                1. Anonymous Coward

                  I thought HP-UX was dead.

                  HPe doesn't even use it! I'm a contractor for them and their datacenter is made up of Sun/Oracle, and Dell servers.

          2. Anonymous Coward

            > A trend I've noticed with Linux-first-and-foremost devs is that they tend not to hold much value for testing, ... or quality

            At least they fixed it on the first attempt (touch wood).

            I thought MS was "embracing" Linux now? Hmmm ... perhaps open source projects need to start vetting their devs?

      2. wallaby

        Agreed TonyJ

        naff all to do with MS but the self defensive remark to start with.

        Mind you, posts about Microsoft have nothing to do with Linux but it doesn't shut them up

      3. Tom 7 Silver badge

        @TonyJ re testing

        Testing needs testing too. Sometimes the first time you meet a failure mode is when it fails - a computer can exercise a lot of things very quickly but spinning rust can slow it down considerably, as can the wrapping to ensure you know what's going wrong - and, as anyone who has spent time in gdb will know, simply doing that will hide a multitude of sins.

      4. Hans 1 Silver badge
        Happy

        > Let's face it - should've tested copying in different scenarios

        Totally agree, but you cannot test each and every possible case, there are too many permutations. Simple question, define "large number of files" and "large files" ... 10k files, 1Gb in size ? The commit in question addressed issues with mixed case names, where files would have the same name were you to lowercase the names. It is much easier for the users to test their systems with the software and report any issues, it is not like they purchased the code (or a license for it), I think that is "giving back" for greater good.

        > obligatory MS bash

        Yes, that is my style and my duty, I have to live up to the "Microsoft Most Hated Professional" moniker !

        Then again, the article was praising open source without context, so my comment added some context, for those in need. There are not that many closed source vendors that produce proprietary file systems to choose from and MS support really sucks when you're tech-savvy.

        1. FatGerman

          This is why we can't have nice things

          >> It is much easier for the users to test their systems with the software and report any issues

          Of course it's easier. But that doesn't mean it's a good idea.

      5. This post has been deleted by a moderator

        1. This post has been deleted by a moderator

        2. Dan 55 Silver badge

          @AC

          The Daily Mail forums are over there.

        3. This post has been deleted by a moderator

          1. This post has been deleted by a moderator

        4. This post has been deleted by a moderator

      6. FrankAlphaXII

        It wouldn't be a Linux article if there wasn't the mandatory neckbeard persecution complex

  2. SJA

    Some multi-billion dollar corporations.....

    Some multi-billion dollar corporations can't even fix severe security issues that put millions of users at risk within 90 days....

    1. wallaby

      Re: Some multi-billion dollar corporations.....

      "Some multi-billion dollar corporations can't even fixed severe security issues that puts millions of users at risk within 90 days...."

      and that's relevant to this post how ???????????????

      1. John H Woods Silver badge

        Re: Some multi-billion dollar corporations.....

        "and that's relevant to this post how ???????????????"

        because part of the point of the article seems to be saying the rapid fix is a bit of a victory for open source?

        Disclaimer: massive open source fan but slightly perturbed that a ZFS minor version could have been broken so easily. Fortunately I tend to stay away from the bleeding edge.

      2. Anonymous Coward

        Re: Some multi-billion dollar corporations.....

        > and that's relevant to this post how ???????????????

        Probably because it's a dig at the same company and that ZFS can get rapid fixes whereas non-opensource SW doesn't get fixed in an acceptable manner.

  3. disgustedoftunbridgewells Silver badge

    Who uses a 0.X version of a filesystem in production?

    1. Anonymous Coward

      "Who uses a 0.X version of a filesystem in production?"

      Can I assume the upvote is an "I do that!"? :-)

    2. Anonymous Coward

      @disgustedoftunbridgewells

      "Who uses a 0.X version of a filesystem in production?"

      ZFS as a filesystem has been out of beta for years already. And I say this because I've been working with ZFS on FreeBSD for at least 5 years now and before that have been using it on Solaris 10/x86 for another 4 to 5 years. And no data loss because of filesystem hiccups I might add. Heck, on FreeBSD I can even put my whole operating system (so: including the root / kernel) on ZFS which used to be impossible on Solaris (note: it's been a long time since I last used Solaris).

      Now, I'm well aware that you probably meant to address the project and not so much the filesystem, but even so I think this could easily be a reason why some people may be tempted to use this. ZFS has been available and extremely stable for several years and better yet: it's been an open source project as well.

      So I don't think it's that crazy that people expect at least some stability, especially considering the age of the filesystem itself.

      1. disgustedoftunbridgewells Silver badge

        Re: @disgustedoftunbridgewells

        Yes, I meant this particular implementation of ZFS. Obviously the ones in Solaris and BSD are rock solid.

        1. Nate Amsden Silver badge

          Re: @disgustedoftunbridgewells

          I tried to deploy Nexenta(Opensolaris) in a high availability role back in 2011/2012, it uses ZFS of course. It wasn't pretty. While I don't blame ZFS directly for the data corruption it was Nexenta's HA that was going split brain causing both systems to write to the disks at the same time.

          Where I do blame ZFS though is debugging and recovering from corruption. The general advice is re format and restore from backups. Fortunately the data involved that was lost wasn't critical but I was quite shocked how poor the recovery system was for ZFS at the time. I want to say the tool I was messing with was zdb. All I wanted was a basic force FSCK of the file system, if something is corrupted then wipe out those files/blocks and continue (on most other file systems such blocks may be dumped into /lost+found). But no, when zfs hit that corruption instant kernel panic and boot loop.

          I managed to temporarily recover the affected volume once or twice, so many years ago I forgot the exact zdb commands I was using at the time, but eventually it failed again.

          Final solution was well to get rid of Nexenta but shorter term was to turn off high availability and just go single node at which point the corruption and crashes stopped.

          All that said I do like ZFS, I use it in a few cases, I really hate the 80% file system full and your performance goes to shit though. I also don't like not being able to reclaim space in a ZFS file system by writing zeros because the zeros are compressed to nothing. A co-worker tells me ZFS is getting reclaim support soon perhaps in the form of SCSI UNMAP I am not sure.

          Also ZFS on BSD and perhaps linux last I checked does not support online volume expansion (i.e. LUN from a SAN), they want you to add more disks to the system and expand it that way. Wasn't an issue with ZFS on Nexenta/OpenSolaris. I did expand FreeNAS storage on several occasions but had to take the file system offline in order to get it to recognize the additional storage. There was a bug open on this issue on FreeNAS for several years, but the link I have is expired (https://redmine.ixsystems.com/issues/342), last I checked it was not resolved yet.

          Ironically enough I tried to replace Nexenta with a Windows 2012 Storage server as an NFS appliance, high availability and all (canned solution from HP). Too many bugs though, hell it took me an 8 hour call with support just to get the initial setup done because the software didn't work right (quick start guide indicated a new 2 node cluster could be installed in a matter of minutes, but the nodes were not able to see each other until I did a bunch of manual work on each one to force their network configs). I had the honor of getting my first custom Microsoft patch for one of the issues, but I was too scared to deploy it. So many issues HP acknowledged, and for most of them said MS was not going to fix.

          After all of that I never actually ended up fully deploying the Windows 2012 storage cluster, it never gained my confidence, and the final nail in the coffin was a file system bug where a deduped volume claimed it was full(it was not, had dozens of gigs free) and went offline. That volume in question was not a critical volume(and the only volume with dedupe enabled), so I don't care TOO much that the one volume went offline.

          But that's when Microsoft cluster services shit itself, it shut ALL of the volumes down (half dozen) including the critical ones and refused to bring ANY of the volumes back until that one problematic volume was fixed. Solution to that? was to disable Microsoft high availability as well there too! So at least that one bad volume could fail and not affect any of the other volumes on the system.

          Now running Isilon and life is much better, though I do wish it had file system compression, their dedupe is totally worthless operating at 32k level, but other than that the system has been trouble free for almost the past year since it was installed (had 2 support cases that took a few weeks to resolve but neither was critical, though level 2 support on one of the cases was not very good, fortunately I had a friend who worked at Isilon for 12 years and was a very senior tech, he was able to suggest a resolution which fixed the problem(turn on source based routing), the official isilon support rep kept telling me everything is fine and there are problems on my network(there was not), enabling SBR resolved the issue). Oh and Isilon is horribly inefficient with storing/accessing small files, I first tried their SD edge product maybe 2 years ago and it imploded due to their architecture around small files.

          The other interim solution we had deployed for a while was FreeNAS, it ran fine, no unscheduled outages(several scheduled outages for filesystem expansion) from that, but also no high availability as well, and the update process scared the crap out of me, so we basically stuck with the same version of FreeNAS no patches nothing for a year or two. High availability is available last I checked when you use their hardware etc, but I did not want to use their hardware.

          Still dumbfounds me given the workload we give to NFS it is basically bulk storage (and not even much data on ZFS it was under 1TB (Isilon requires probably 5X the space of ZFS for our datasets) - and really nobody sells NFS appliances for 1TB of data). Nothing transactional(that all goes to block SAN storage). Snapshots, online software upgrades, high availability. But apparently that's still difficult to accomplish in many cases. Our use cases for Isilon have expanded over the past year though as we take advantage of the larger amount of storage it has, probably 10X the size of our previous NFS setup (current cluster is about as small of an Isilon cluster as I could get).

          I miss my Exanet cluster from ~8 years ago. Or even BlueArc NFS before I had Exanet. Isilon is doing pretty good for me now though.

      2. Dr. Mouse Silver badge

        Re: @disgustedoftunbridgewells

        Ditto, I have also been using ZFS since I discovered it while trying out Solaris 10 (as a method of learning more about Solaris for work, where we had several Solaris servers doing various jobs). First use was on Solaris 10, then FreeBSD, and now running ZoL (including root on ZFS) for my home server. It's been very stable for me, although I'd hesitate to recommend it to a client without making them very well aware that there's a risk. Personally, I believe that risk is acceptable with a good backup regime, especially with ZFS's data integrity regime (checksumming everything), but it would be up to the client to decide for their own business.

      3. Alan Brown Silver badge

        Re: @disgustedoftunbridgewells

        "ZFS as a filesystem has been out of beta for years already"

        Yes, and these point sub releases ARE betas at best. Nobody declares these things stable for production until they've been out for quite a while.

        The actual versions being used on production systems are 0.7.5 or older. Everything newer is testing - and precisely because of these kinds of issues.

        If you use bleeding edge, expect to bleed occasionally.

    3. Steve 53

      TBH as with most opensource, don't patch immediately for production. The release was only a couple of weeks old, hadn't even made it into debian testing.

    4. DontFeedTheTrolls
      Headmaster

      In what universe does a version number directly correlate to maturity and stability and therefore exclusively influence one's choice to deploy a product?

      1. Anonymous Coward

        A professional one?

      2. Dan 55 Silver badge

        A marketing one.

        1. Dan 55 Silver badge

          8 downvoters and not one person commenting about Windows and Office version numbers.

  4. Bronek Kozicki Silver badge

    Reproducer

    For those using ZFS version 0.7.7, the most useful part of the discussion is the reproducer script. Note, it should be run more than once.

  5. J J Carter Silver badge
    Facepalm

    Woops!

    The benefits of open source, many eyes! (Unless they're all watching pr0nz!)

    1. Kevin McMurtrie Silver badge

      Re: Woops!

      Open software dies when the safe and minimal patches are not in balance with big and dangerous refactoring keeping it clean.

      1. Bronek Kozicki Silver badge

        Re: Woops!

        Any software dies when it is not kept clean, not only open source. Only in open source, the motivation to maintain a clean codebase is slightly higher, because it is a shame to be associated with something of very poor quality (unless it is universally used and few people look inside, for example old OpenSSL)

    2. phuzz Silver badge

      Re: Woops!

      Six eyes is many.

  6. asdfasdfasdf2015

    i think el rag commenters are making a mountain of a mole hill. yes its a bad bug, but it got caught quickly, and has been dealt with swiftly. at least we didn't find this out 6 months from now.

    1. Steve 53

      Doesn't hurt to look at this as a way of informing a rather decent number of technical people who may run ZFS that they want to patch to 0.7.8 PDQ

    2. fixit_f

      You say that – but remember, large companies with deliberately obstructive change management processes may still insist on their own full set of regression tests to be run and signed off before it gets anywhere close to approval for release, then they add arbitrary periods of time on top of that – so the simplest of patches can take weeks to turn around. Also patching tends to require downtime of production systems, which isn’t always feasible.

    3. HighTension

      I don't think this bug is really that terrible. At least it exposes an error when it happens - there's no silent data loss or corruption thank god...

      1. Bronek Kozicki Silver badge

        Also, no actual file content goes missing - only the directory entries. It is the reason why ext filesystem has /lost+found directory, so ZFS is definitely not first to suffer this fate - except it currently does not have tools to pull the data back (no special directory). I am pretty sure that soon the tools will be made available, too.

  7. Paul Smith

    Goto Jail, go directly to jail.

    Am I the only one to get nervous about the use of 'goto' statements in the code?

    if (err == 0) goto retry;

    Huh? Does zero mean no error, in which case why are they retrying, or is it a recoverable error, in which case why not use the appropriate constant?

    I don't know who approved it or why, but I can see why they didn't spot this problem. I don't know what other problems they have missed but I am sure they are there.

    For a story about the benefits of open source, you have picked a very poor example.

    1. Phil Endecott Silver badge

      Re: Goto Jail, go directly to jail.

      > If (err == 0) goto retry;

      Yes 0 does mean ‘no error’. This is an errno-like status code; errno does not define a symbol for ‘OK’ and using raw 0 or an implicit boolean conversion ( if (!err) ... ) is standard.

      The logic is something like this:

      1. Try to do something.

      2. If it worked, all finished. Stop now.

      3. Do some other special action that should help resolve why step 1 didn’t work.

      4. If step 3 works, go back to step 1 to retry the original thing.

      ‘if (err==0) goto retry;’ is saying my step 3 completed with no error, so it can go back to step 1 to retry the original thing.

      “Goto considered evil” would suggest that it should be a while loop, ‘until no error’. But there are other schools of thought.
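
      For anyone who wants to see the four steps as code, here is a minimal sketch in C. It is not the ZFS source: try_add(), make_room() and the toy_state struct are hypothetical stand-ins I made up for "the original thing" and "the special action". The second function is the while-loop flavour of the same logic.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stddef.h>

      /* Hypothetical stand-ins; none of these names come from ZFS. */
      struct toy_state {
          int room;       /* free slots available to try_add() */
          int can_grow;   /* how many times make_room() can still succeed */
      };

      /* Step 1: the original operation. 0 means no error, errno-style. */
      int try_add(struct toy_state *s)
      {
          if (s->room == 0)
              return ENOSPC;
          s->room--;
          return 0;
      }

      /* Step 3: the special action that may cure the failure of step 1. */
      int make_room(struct toy_state *s)
      {
          if (s->can_grow == 0)
              return ENOSPC;
          s->can_grow--;
          s->room++;
          return 0;
      }

      /* goto flavour: mirrors the numbered steps one to one. */
      int add_with_retry(struct toy_state *s)
      {
          int err;

          assert(s != NULL);
      retry:
          err = try_add(s);        /* step 1 */
          if (err == 0)
              return 0;            /* step 2: it worked, stop */
          err = make_room(s);      /* step 3 */
          if (err == 0)
              goto retry;          /* step 4: fix worked, try again */
          return err;              /* fix failed, report that error */
      }

      /* Loop flavour: same behaviour, no goto. */
      int add_with_loop(struct toy_state *s)
      {
          int err;

          assert(s != NULL);
          while ((err = try_add(s)) != 0) {
              err = make_room(s);
              if (err != 0)
                  break;           /* fix failed, give up */
          }
          return err;
      }
      ```

      Both versions terminate here only because make_room() can succeed a finite number of times; a real retry path needs some equivalent guarantee.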

      1. Lee D Silver badge

        Re: Goto Jail, go directly to jail.

        If you think that goto is problematic, never look at the kernel source code.

        Goto on its own isn't dangerous, it's ill-considered use of it that is. Though in theory, everything in a nice loop looks pretty, it has a performance hit that goto doesn't. Underneath the hood, goto is literally just a jmp instruction. But a loop has all kinds of setups, stack motions and side-effects.

        Especially in any kind of error handler, you don't want to be rolling around inside a loop that's already served its purpose, you want to get the hell out of dodge.

        And if we're talking performance-critical filesystem code that will impact upon everything from logging to every process on the system to potential complete kernel failure in the case of a mistake, a goto might well be the best way to handle the practicalities. Don't forget, that error-handling code might be operating under very extreme circumstances, with critical data, with minimal resources to attempt to recover or at least record what happened without damaging the stack, etc. In that case, you really don't want to be faffing about when goto is the answer.

        At last count, there were over 10,000 goto's inside the Linux kernel. To replace them all with further-indented, re-ordered code without hitting the same kinds of non-damaging "emergency" performance? That's just silly, because the people that put them there aren't exactly idiots.

        People who state absolutes, however, are - without exception - idiots...

        1. Phil Endecott Silver badge

          Re: Goto Jail, go directly to jail.

          > Underneath the hood, goto is literally just a jmp instruction. But

          > a loop has all kinds of setups, stack motions and side-effects.

          Nonsense.

          1. Lee D Silver badge

            Re: Goto Jail, go directly to jail.

            @Phil: Elaborate?

            Because I'd like to see you set up a loop which performs an action and then tests for errors after without using/corrupting a register or two, putting in conditional jumps, making unwind-operations for failures more complex, and doing 4-5 instructions more than a "goto errorhandler", on even the most highly optimised of compilers.

            An empty loop might compile to just a jmp the same as a goto, but any conditionals mean register management and shifting, which can have severe implications if you're deep in the middle of performance-critical, interrupt handler, etc. type code under a failure condition.

            It seems that Linus and others agree with me:

            http://koblents.com/Ches/Links/Month-Mar-2013/20-Using-Goto-in-Linux-Kernel-Code/

            The example at the bottom is particularly relevant in terms of error-related rollback.
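
            For reference, the rollback pattern I mean looks roughly like the sketch below. This is my own illustration of the general idiom, not code from the kernel or from ZFS; toy_ctx, toy_ctx_init() and the fail_late flag are made up so the late-failure path can be exercised.

            ```c
            #include <assert.h>
            #include <errno.h>
            #include <stdlib.h>

            /* Hypothetical context with two resources to acquire in order. */
            struct toy_ctx {
                char *name_buf;
                char *data_buf;
            };

            /*
             * Acquire resources in order; on failure, jump to the label that
             * releases only what has been acquired so far, in reverse order.
             * fail_late is an illustrative knob that simulates a failure after
             * both allocations have succeeded.
             */
            int toy_ctx_init(struct toy_ctx *c, size_t name_len, size_t data_len,
                             int fail_late)
            {
                int err;

                assert(c != NULL);

                c->name_buf = malloc(name_len);
                if (c->name_buf == NULL) {
                    err = ENOMEM;
                    goto out;
                }

                c->data_buf = malloc(data_len);
                if (c->data_buf == NULL) {
                    err = ENOMEM;
                    goto free_name;
                }

                if (fail_late) {
                    err = EIO;
                    goto free_data;
                }

                return 0;

            free_data:
                free(c->data_buf);
                c->data_buf = NULL;
            free_name:
                free(c->name_buf);
                c->name_buf = NULL;
            out:
                return err;
            }
            ```

            The labels fall through in reverse order of acquisition, so each failure point only has to name the first thing that needs undoing.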

  8. Anonymous Coward

    I wish Linux would support the Apple File System (APFS)

    It's proprietary and released recently, but give it a few more years it'll mature and be excellent.

    If Apple wants a licensing fee, let it have it.

    Imagine Macs and a majority of Linux machines using APFS, as opposed to Microsoft's NTFS.

    https://arstechnica.com/gadgets/2016/06/a-zfs-developers-analysis-of-the-good-and-bad-in-apples-new-apfs-file-system/

    1. Anonymous Coward

      Re: I wish Linux would support the Apple File System (APFS)

      And, pray tell, who is going to collect the money for the license, and how? Let's say that a Linux distro could come up with a couple of hundred million dollars to buy the license. How are they going to cover the loss? Sell copies of Linux? No problemo, but you also have to give the source code (ouch!) so you will not get many buyers unless you lock them in with serial numbers, genuine advantage, DRM (yuck!) to make sure they pay and also prevent them from sharing the installation disk.

      To me personally, APFS is not worth it if I have to give up my end-user freedoms.

    2. mark l 2 Silver badge

      Re: I wish Linux would support the Apple File System (APFS)

      "Imagine Macs and a majority of Linux machines using APFS, as opposed to Microsoft's NTFS"

      Most Macs and Linux machines won't be using NTFS unless they dual boot with Windows, in which case APFS won't help since you will still require NTFS (or FAT) for the Windows install.

      Should you use Linux and want to read APFS disks you can either use a closed source product from Paragon

      https://www.paragon-software.com/business/apfs-linux/

      Or there is an open source FUSE APFS in experimental stage on Github https://github.com/sgan81/apfs-fuse

    3. foo_bar_baz
      Boffin

      Re: I wish Linux would support the Apple File System (APFS)

      Why do you wish so? Are you a Linux user? Which specific features are you missing from Linux (EXT4, XFS or BTRFS) that APFS offers?

      1. Lee D Silver badge

        Re: I wish Linux would support the Apple File System (APFS)

        Just think practically:

        Unless someone takes the time to code it up, port it over, test its implementation, keeps synced with all the iOS updates and features they throw into it (probably without announcement, code or assistance as this is Apple we're talking about), tests it to the extent that you're happy with putting your data on it, fights the patent fight with Apple, and then works to integrate it into the Linux kernel, it ain't gonna happen.

        "Linux" isn't about a team of guys just putting in your wishlist. The central people do nothing more than approve and critique stuff other people have made. If nobody's made it, it won't get in. I think there's precisely ZIP in terms of Apple code contributions in the kernel, and even MS has huge chunks of their code in HyperV etc. compatibility modules.

        So, if Apple aren't going to do it, and there's no open-source implementation of it (even NTFS had several competing implementations, one utilising the original Windows NTFS.SYS binary via a shim layer!), where's this code going to come from?

        The closest I can find is this:

        https://github.com/sgan81/apfs-fuse

        Which is read-only (like NTFS drivers were for years). I have no idea about Mac version numbers for compatibility so you're on your own there, but it appears to be a written-from-scratch, reverse-engineered module using FUSE. It's also a handful of months old. You're going to be several years down the path before that's even close to CaptiveNTFS's standard, which never made it to the kernel.

        And who would benefit? People putting an Apple-formatted disk into a machine that runs Linux. That's a tiny portion of even the most techy of users.

        And, looking at the code in that archive, there's literally nothing in there that's shocking or new or complicated or whole-new-levels of filesystem. It's just a bog-standard bit of coding. Sure, that's not the write-logic, including all the data-safety-guarantees and atomicity required for that (because that's the hard part), but that code is pretty indicative that APFS is really nothing very special at all.

  9. Stevie Silver badge

    Bah!

    I assume Linus Torvalds' usual acerbic commentary on crap commits was lost due to the ZFS bug.

  10. dmacleo

    seems to me caught quickly at least.

  11. J. Cook Silver badge
    Pint

    We need a popcorn icon...

    ... because I added a couple words to my dictionary of the profane and obscene from this thread. :)

    re: APFS - Perhaps someone could reverse engineer it, like what was done for the NTFS driver and CIFS / SMB protocols? Just saying. (And no, I'm not volunteering for that- you don't want me anywhere NEAR that codebase, based on prior experience.)

    re: Microsloth bashing - Considering that it took the Exchange team upwards of 9 months+ to recognize that there was an interaction problem with the 2010 Management Console and the MMC executable that it ran under, I wouldn't be surprised if some goofy bug in, say, ReFS screwed people over for a couple months before the responsible team got around to looking at it. (And yes, I do know about the zombie bug from May of last year- that's more a case of 'someone trying deliberately to break stuff', but it's not surprising that the bug it leverages exists- Normal admins aren't supposed to be poking around in the MFT unless they really, really know what they are doing...)

  12. Belperite

    No problems with ZFS here

    I've used ZFS on Linux on my home NAS and work PC for a couple of years and haven't had any issues. Stating the obvious but I think that if people are going to be tracking the bleeding edge releases (ZoL development is moving very fast), they should be subbed to the relevant ZFS mailing lists and keeping an eye out for bugs for a while before upgrading.

    Personally, I track the versions that hit Debian Testing / Stretch backports, and then only if there are bugfixes or features I need - the delay seems to allow sufficient time for major bugs to be filtered out.
