
'Disappearing' data under ZFS on Linux sparks small swift tweak

Nate Amsden Silver badge

Re: @disgustedoftunbridgewells

I tried to deploy Nexenta (OpenSolaris) in a high availability role back in 2011/2012; it uses ZFS, of course. It wasn't pretty. I don't blame ZFS directly for the data corruption, though: it was Nexenta's HA going split brain, causing both systems to write to the disks at the same time.

Where I do blame ZFS, though, is debugging and recovering from corruption. The general advice is to reformat and restore from backups. Fortunately the data that was lost wasn't critical, but I was quite shocked at how poor the recovery tooling for ZFS was at the time. I want to say the tool I was messing with was zdb. All I wanted was a basic forced fsck of the file system: if something is corrupted, wipe out those files/blocks and continue (on most other file systems such blocks may be dumped into /lost+found). But no, when ZFS hit that corruption it was an instant kernel panic and boot loop.

I managed to temporarily recover the affected volume once or twice (it was so many years ago that I've forgotten the exact zdb commands I was using), but eventually it failed again.

The final solution was, well, to get rid of Nexenta, but the shorter-term fix was to turn off high availability and just go single node, at which point the corruption and crashes stopped.

All that said, I do like ZFS, and I use it in a few cases. I really hate that once the file system hits 80% full your performance goes to shit, though. I also don't like not being able to reclaim space in a ZFS file system by writing zeros, because the zeros are compressed to nothing. A co-worker tells me ZFS is getting reclaim support soon, perhaps in the form of SCSI UNMAP; I am not sure.
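
To illustrate the zero-fill point, here is a rough sketch. It uses Python's zlib purely as a stand-in for whatever compressor the pool has enabled (an assumption: ZFS uses its own algorithms, not this library), and the 128 KiB block size and the `compressed_size` helper are made up for the example. The idea is just that a block of zeros shrinks to almost nothing before it is written, so zero-filling free space does not put zeros on the underlying storage the way it would on an uncompressed file system.

```python
import os
import zlib

# Hypothetical block size chosen for illustration only (128 KiB).
BLOCK_SIZE = 128 * 1024


def compressed_size(block: bytes) -> int:
    """Return how many bytes the block occupies after zlib compression."""
    return len(zlib.compress(block))


# A block of zeros, like the ones a zero-fill pass would write.
zero_block = bytes(BLOCK_SIZE)
# A block of random data, which is essentially incompressible.
random_block = os.urandom(BLOCK_SIZE)

print("zeros :", compressed_size(zero_block), "bytes after compression")
print("random:", compressed_size(random_block), "bytes after compression")
```

The zero block comes out at a tiny fraction of its original size while the random block stays roughly as large as it went in, which is why the old "fill the disk with zeros" trick gets you nowhere on a compressed dataset.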

Also, ZFS on BSD (and perhaps Linux), last I checked, does not support online volume expansion (i.e. growing a LUN from a SAN); they want you to add more disks to the system and expand it that way. That wasn't an issue with ZFS on Nexenta/OpenSolaris. I did expand FreeNAS storage on several occasions but had to take the file system offline in order to get it to recognize the additional storage. There was a bug open on this issue on FreeNAS for several years, but the link I have is expired; last I checked it was not resolved yet.

Ironically enough, I tried to replace Nexenta with a Windows 2012 Storage Server as an NFS appliance, high availability and all (a canned solution from HP). Too many bugs, though. Hell, it took me an 8 hour call with support just to get the initial setup done because the software didn't work right (the quick start guide indicated a new 2 node cluster could be installed in a matter of minutes, but the nodes were not able to see each other until I did a bunch of manual work on each one to force their network configs). I had the honor of getting my first custom Microsoft patch for one of the issues, but I was too scared to deploy it. HP acknowledged so many of the issues, and for most of them said MS was not going to fix them.

After all of that I never actually ended up fully deploying the Windows 2012 storage cluster; it never gained my confidence. The final nail in the coffin was a file system bug where a deduped volume claimed it was full (it was not, it had dozens of gigs free) and went offline. The volume in question was not a critical one (and it was the only volume with dedupe enabled), so I didn't care TOO much that it went offline.

But that's when Microsoft cluster services shit itself: it shut ALL of the volumes down (half a dozen), including the critical ones, and refused to bring ANY of them back until that one problematic volume was fixed. The solution to that? Disable Microsoft high availability there too! That way at least that one bad volume could fail and not affect any of the other volumes on the system.

Now running Isilon and life is much better, though I do wish it had file system compression, and their dedupe is totally worthless operating at the 32k level. Other than that the system has been trouble free for almost the past year since it was installed. I had 2 support cases that took a few weeks to resolve, but neither was critical. Level 2 support on one of the cases was not very good: the official Isilon support rep kept telling me everything was fine and the problems were on my network (they were not). Fortunately I had a friend who worked at Isilon for 12 years and was a very senior tech, and he suggested turning on source based routing (SBR), which resolved the issue. Oh, and Isilon is horribly inefficient at storing/accessing small files. I first tried their SD Edge product maybe 2 years ago and it imploded due to their architecture around small files.

The other interim solution we had deployed for a while was FreeNAS. It ran fine, with no unscheduled outages (several scheduled outages for file system expansion), but also no high availability, and the update process scared the crap out of me, so we basically stuck with the same version of FreeNAS, no patches, nothing, for a year or two. High availability is available, last I checked, when you use their hardware, but I did not want to use their hardware.

It still dumbfounds me, given that the workload we give to NFS is basically bulk storage, and not even much data: on ZFS it was under 1TB (Isilon probably requires 5X the space of ZFS for our datasets), and really nobody sells NFS appliances for 1TB of data. Nothing transactional (that all goes to block SAN storage). All I want is snapshots, online software upgrades and high availability, but apparently that's still difficult to accomplish in many cases. Our use cases for Isilon have expanded over the past year, though, as we take advantage of the larger amount of storage it has, probably 10X the size of our previous NFS setup (the current cluster is about as small an Isilon cluster as I could get).

I miss my Exanet cluster from ~8 years ago. Or even BlueArc NFS before I had Exanet. Isilon is doing pretty good for me now though.


Biting the hand that feeds IT © 1998–2019