The storage needs of home users are ever growing, such that the capacity of dual-layer DVDs appears minuscule, and backing up to CD looks desperate. Many are now turning to network-accessible storage systems that not only allow data storage in the home, but also provide HTTP, FTP and cloud services for when you’re on the go. The …
".....The problem is no integrity checking...." Great, so ZFS tells you after you have a problem, and without the ability to use hardware RAID5 underneath to get round the issue. I run fsck via cron which is all scrub is, and so far I've not found any hobbyhorse sh*t on my drives. Maybe it only happens with Sun kit, if you pray to the Great Ponytail hard enough....?
"....Because ZFS' license is not compatible with the Linux kernel's GPL one...." So you're suggesting use a deadend OS like Slowaris x86 then? So, ZFS has a dodgy licence, which Oracle could shaft you with at any point, no development roadmap that Oracle is going to guarantee sticking with, and you want to run it on an Oracle-controlled OS with even worse prospects? Good luck persuading the Penguinistas with that one! Of course, the fact that OCFS2 was designed from the ground up as a resilient, clusterable filesystem that works with Linux couldn't possibly have anything to do with Oracle's decision, right? ROFL! BTW, why exactly was it Apple took one look at ZFS and said "nyet"? I used to use FreeNAS but stopped when they included ZFS, as did many other people I know. Others dropped FreeNAS when version 8 started needing rediculous amounts of RAM to perform.
"....Hmm, I think the NetApp versus Sun/Oracle case was closed on that one after several of the patents were struck down....." Nice try. Sun's own coders admitted they based their design on WAFL, and it has exactly the same space issues, patents or not. That's when it's not crashing and corrupting data all by itself.
"....I have wondered why you have such a problem with anything Sun-related...." Years of suffering the Sunshine makes me less than likely to swallow marketing bumph dressed up as technical knowledge, thanks. There's a reason Sun died and it was because customers got Sunburnt and stopped believing them.
>"Why use Windows? It is not safe and susceptible to data corruption"
>What a load of FUD.... dont forget to wrap some foil around your hard drives
You are right to request links, otherwise it would be pure FUD: making lots of strange negative claims without ever backing them up with credible links. Here you have a PhD thesis where the research concludes that NTFS is not safe with respect to data corruption. I suggest you catch up with the latest research if you want to learn more:
Dr. Prabhakaran found that ALL the file systems (NTFS, ext, JFS, XFS, ReiserFS, etc) shared:
"... ad hoc failure handling and a great deal of illogical inconsistency in failure policy ... such inconsistency leads to substantially different detection and recovery strategies under similar fault scenarios, resulting in unpredictable and often undesirable fault-handling strategies. ... We observe little tolerance to transient failures; .... none of the file systems can recover from partial disk failures, due to a lack of in-disk redundancy."
>"...I can get built-in RAID on most PC mobos, it is reliable and cost-effective, does not impact on CPU >performance, why bother with the hassle of ZFS which steals cycles from the CPU?..."
First of all, hardware RAID is not safe with respect to data corruption. Here is some information if you want to learn about the limitations of hardware RAID:
Second, it is true that ZFS uses CPU cycles, a few percent of one core. There is a reason ZFS uses CPU cycles: ZFS does checksum calculations on every block. If you have ever done an MD5 checksum of a file to check data integrity, you know it uses CPU cycles. ZFS does that (with a Fletcher checksum by default, or SHA256 if you choose it). Hardware RAID does not do any checksum calculations for data integrity; instead, hw-raid does PARITY calculations, which are not the same thing. Parity calculations are just some easy XOR calculations, and hw-raid is not designed to catch bit rot.
Have you ever experienced bit rot on old 5.25" or 3.5" data discs? Old disks don't work anymore. This problem also applies to RAM: with time, a powered-on server will have more and more random bit flips in RAM, to the point it crashes. That is the reason ECC RAM is needed. Do you dispute the need for ECC RAM? Do you think servers don't need ECC?
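The parity-versus-checksum distinction above can be illustrated with a small sketch. It is not how ZFS or any RAID controller is actually implemented: the three tiny "blocks", the helper name `xor_parity`, and the use of SHA-256 as the per-block checksum are all assumptions made for illustration.

```python
import hashlib
from functools import reduce

def xor_parity(blocks):
    """Byte-wise XOR across equal-length blocks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per disk, plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)
# Per-block checksums, stored separately from the data.
checksums = [hashlib.sha256(b).hexdigest() for b in data]

# Case 1: a disk dies outright. Parity rebuilds the missing block.
rebuilt = xor_parity([data[0], data[2], parity])

# Case 2: silent corruption. Block 1 changes, but no disk reports an error.
corrupted = list(data)
corrupted[1] = b"BBBC"

# Recomputing parity shows that *something* is inconsistent...
parity_mismatch = xor_parity(corrupted) != parity
# ...but only the checksums identify *which* block is bad.
bad_blocks = [
    i for i, b in enumerate(corrupted)
    if hashlib.sha256(b).hexdigest() != checksums[i]
]

# Knowing the bad block, parity can then be used to repair it.
repaired = xor_parity(
    [b for i, b in enumerate(corrupted) if i != bad_blocks[0]] + [parity]
)
```

In this sketch the parity comparison only reports that the stripe is inconsistent; it cannot tell whether a data block or the parity block itself is wrong. The checksum list is what singles out block 1, after which the XOR of the remaining blocks restores it.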
>"...I run fsck via cron which is all scrub is, and so far I've not found any hobbyhorse sh*t on my drives..."
Matt, Matt. ZFS scrub is not an fsck. First of all, fsck only checks the metadata, such as the log. fsck never checks the actual data, which means the data might still be corrupted after a successful fsck. One guy did an fsck on an XFS raid in like one minute; think a while and you will understand it is fishy. How can you check 6 TB worth of data in one minute? That would mean fsck read the data at a rate of 100 GB/sec. This is not possible with just a few SATA disks. The only conclusion is that fsck does not check everything, it just cheats. Second, you need to take the raid off line and wait while you do fsck.
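The throughput figure in the post above can be checked with a few lines of arithmetic. This is a rough sketch: decimal units are used (1 TB = 1000 GB), and the 150 MB/s sustained rate per SATA disk is an assumed round number, not a figure from the thread.

```python
def required_throughput_gb_per_s(capacity_tb, seconds):
    """Read rate needed to touch every byte of the array in the given time."""
    return capacity_tb * 1000 / seconds

# 6 TB checked in 60 seconds, as described in the post.
rate = required_throughput_gb_per_s(6, 60)

# Assumed sustained sequential read rate of one SATA disk, in GB/s.
ASSUMED_DISK_GB_PER_S = 0.15

# Number of such disks that would have to stream in parallel to reach that rate.
disks_needed = rate / ASSUMED_DISK_GB_PER_S

print(f"required rate: {rate:.0f} GB/s, disks needed: {disks_needed:.0f}")
```

Under those assumptions the check would need hundreds of disks reading flat out, which is why a one-minute run over a handful of disks can only have covered a small part of what is stored.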
ZFS Scrub checks everything, data and metadata, and that takes hours. ZFS scrub is also designed to be used on a live mounted active raid. No need to take it off line.
The thing is, to be really sure you don't have silent corruption, you need to do a checksum calculation every time you read/write a block. In effect, you need to do an MD5 checksum. Otherwise, you cannot know. For instance, here is a research paper by NetApp, whom you trust?
"A real life study of 1.5 million HDDs in the NetApp database found that on average 1 in 90 SATA drives will have silent corruption which is not caught by hardware RAID verification process; for a RAID-5 system that works out to one undetected error for every 67 TB of data read"
If you look at the spec sheet of a new Enterprise SAS disk, it says one irrecoverable error for every 10^16 bits read. Thus, SAS disks get uncorrectable errors. And Fibre Channel disks are even more high end, and they also get uncorrectable errors:
Matt, again you have very strong opinions on things you have no clue about. It would be good for you if you caught up on research, otherwise you just seem totally lost when people discuss things over your head. And as usual, you never back up any of your strong negative claims, even though we have asked you to do so. Try to handle all that bitterness inside you? It is difficult to try to explain things to you, as you discard even research papers, and you continue to regurgitate things you have no clue of.
BTW, ZFS is free in a gratis and easy distro called FreeNAS. Just set up and forget. It is built for home server. Nas4Free it is called, maybe...?
".....First of all, hardware raid are not safe with respect to data corruption....." Really? And ZFS has no limitations and never crashes and corrupts data? Not according to the online forums!
".....Second, it is true that ZFS uses cpu cycles, a few percent of one core....." On an empty filesystem. As the filesystem fills then ZFS has more work to do just with data checking, let alone shuffling round the disks trying to find spare space with the copy-of-WAFL-throw -the-data-anywhere approach. Then suddenly ZFS is hogging the CPU and demanding masses of RAM just to stop from stalling. And don't even start about encryption as then you need so much RAM it really is beyond f*cking rediculous! ZFS is not just a system hog when the fielsystem fills, it is a system killer.
A streetcleaner machine can clean miles of streets. It can wash and dry and sweep and vacuum as it goes. But I have no intention of using a streetcleaner in my home. Your claims about bit rot - which I have NEVER seen in forty years of computing - are the salesman trying to sell a streetcleaner to the average housewife on the mythical chance she might need to clean a street some day. Pointless.
"....BTW, ZFS is free in a gratis and easy distro called FreeNAS...." Keep up. I pointed out FreeNAS, a FreeBSD-based NAS server, was a superior solution years ago when you first started bleating about ZFS. Some idiot decided to force the FreeNAS community to go with ZFS and it promptly got forked to a non-ZFS distro.
But here is the most telling point about ZFS - no vendor wants it, even for free. Not one single OS vendor has dropped expensive Veritas for free ZFS. Apple took a look and dropped it like a hot potato. ZFS is a hobby filesystem at best, definitely not suited to the enterprise, and far too limited for me to trust it with my data. You like it then enjoy it, just quit the paid-for preaching to those of us with better solutions.
"..Really? And ZFS has no limitations and never crashes and corrupts data? Not according to the online forums!.."
Of course ZFS is not bullet proof, no storage system is 100% safe. The difference is that ZFS is built from the ground up to combat data corruption, whereas the other solutions do not target data corruption. They have never thought about that. ZFS is not bug free, no complex software is bug free. But ZFS is safer than other systems. CERN did a study and concluded this. And there is other research claiming this too. It is no hoax.
Earlier, disks were small and slow, and you very rarely saw data corruption. Disks have gotten larger and faster, but not _safer_. They still exhibit the same error rates. Earlier you rarely read 10^16 bits, today it is easy with large and fast raids. Today you start to see bit rot. The reason you have never seen bit rot is because you have dabbled with small data. Go up to Petabyte and you will see bit rot all the time. There is a reason CERN did research on this: they are storing large amounts of data, many Petabytes. Bit rot is a real big problem for them.
Have you seen the spec sheets on a modern SAS / Fibre Channel Enterprise disk?
On page 2, it says:
"Nonrecoverable Read Errors per Bits Read: 1 sector per 10E16"
What does this mean? Does it mean that some errors are uncorrectable? In fact ALL serious disk vendors say the same thing, they have a point about "irrecoverable error". The disk can not repair all errors. Just as NetApp research says: One irrecoverable error on 67TB data read. Read the paper I linked to.
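To put the spec-sheet number in perspective, here is a back-of-the-envelope sketch. It reads "10E16" as 10^16 bits, matching the figure given earlier in the thread, and treats read errors as independent events, which is a simplification. The 10^14 rate used for comparison is an assumed figure for consumer-class drives and does not come from the thread.

```python
import math

def terabytes_per_error(bits_per_error):
    """Average amount of data read per unrecoverable error, in decimal TB."""
    return bits_per_error / 8 / 1e12

def prob_at_least_one_error(bytes_read, bits_per_error):
    """Chance of hitting one or more unrecoverable errors while reading
    bytes_read bytes, assuming independent errors at the quoted rate."""
    bits = bytes_read * 8
    # Numerically stable form of 1 - (1 - 1/bits_per_error) ** bits.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))

ENTERPRISE_RATE = 1e16        # from the spec sheet quoted in the post
ASSUMED_CONSUMER_RATE = 1e14  # assumption, not stated anywhere in the thread

tb_per_error = terabytes_per_error(ENTERPRISE_RATE)
p_enterprise = prob_at_least_one_error(12e12, ENTERPRISE_RATE)
p_consumer = prob_at_least_one_error(12e12, ASSUMED_CONSUMER_RATE)

print(f"{tb_per_error:.0f} TB read per error at the enterprise rate")
print(f"P(error) reading 12 TB once: {p_enterprise:.2%} vs {p_consumer:.2%}")
```

At the quoted enterprise rate an unrecoverable sector is expected roughly once per 1.25 PB read, so a single pass over a small home array is unlikely to hit one, while a site reading petabytes will meet them routinely. That is the scale argument both posters are circling.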
"..Your claims about bit rot - which I have NEVER seen in forty years of computing.."
Let me ask you, Matt, how often have you seen ECC errors in RAM? Never? So your conclusion is that ECC RAM is not needed? Well, that conclusion is wrong. Do you agree on this? This is question A. What is your answer on Question A)? Have you ever encountered ECC RAM errors in your servers?
"... [you] are the salesman trying to sell a streetcleaner to the average housewife on the mythical chance she might need to clean a street some day. Pointless..."
Well, it is not I who did the research. Large credible institutions and researchers, such as CERN, NetApp, Amazon, etc., did it. I just repeat what they say. Amazon explains why you have never seen these problems: the reason is that you dabble with small data. When you scale up, at large scale, you see these problems all the time. The more data, the more problems.
Amazon explains this to you:
"...AT SCALE, error detection and correction at lower levels fails to correct or even detect some problems. Software stacks above introduce errors. Hardware introduces more errors. Firmware introduces errors. Errors creep in everywhere and absolutely nobody and nothing can be trusted.
...Over the years, each time I have had an opportunity to see the impact of adding a new layer of error detection, the result has been the same. It fires fast and it fires frequently. In each of these cases, I predicted we would find issues at scale. But, even starting from that perspective, each time I was amazed at the frequency the error correction code fired...
Another example. In this case, a fleet of tens of thousands of servers was instrumented to monitor how frequently the DRAM ECC was correcting. Over the course of several months, the result was somewhere between amazing and frightening. ECC is firing constantly. ...The immediate lesson is you absolutely do need ECC in server application and it is just about crazy to even contemplate running valuable applications without it.
This incident reminds us of the importance of never trusting anything from any component in a multi-component system. Checksum every data block and have well-designed, and well-tested failure modes for even unlikely events. Rather than have complex recovery logic for the near infinite number of faults possible, have simple, brute-force recovery paths that you can use broadly and test frequently. Remember that all hardware, all firmware, and all software have faults and introduce errors. Don’t trust anyone or anything. Have test systems that bit flips and corrupts and ensure the production system can operate through these faults – at scale, rare events are amazingly common."
Ok? You have small data, but when you go to large data, you will see these kinds of problems all the time. You will see that ECC is absolutely necessary. As is ZFS. That is the reason CERN is switching to ZFS now.
CERN did a study on hardware raid, and saw lots of silent corruption. CERN wrote the same bit pattern all the time on 3000 hardware racks and after 5 weeks, they say that the bit pattern differed in some cases:
"...Disk errors. [CERN] wrote a special 2 GB file to more than 3,000 nodes every 2 hours and read it back checking for errors after 5 weeks. They found 500 errors on 100 nodes."
Matt, how about you catch up with the latest research, instead of trying to rely on your own experiences? I mean, Windows 7 has never crashed for me, does this mean that Windows is fit for large Stock Exchanges? No. Your experience cannot be extrapolated to large scale. Just read the experts and researchers, instead of trying to make up your own reality?
"....But here is the most telling point about ZFS - no vendor wants it, even for free. ..."
Well, there are many who want ZFS. Oracle sells ZFS storage servers, typically they are much faster for a fraction of the price of a NetApp server. Here are some benchmarks where ZFS crushes NetApp:
There are many more here:
Nexenta is selling ZFS servers, and Nexenta is growing fast, fastest ever. Nexenta is rivaling NetApp and EMC.
Dell is selling ZFS servers:
There are more hardware vendors selling ZFS, I don't have time to google them up for you now. Have work to do. FreeBSD has ZFS. Linux has ZFS (zfsonlinux). Mac OS X has ZFS (Z-410).
It seems that your wild false claims have no bearing in reality. Again? It would be nice if you just for once could provide some links that support your claims, but you never do. Why? Are you constantly making things up? How are you expected to be taken seriously when you talk of things you don't know, and never support anything you say with credible links? Do you exhibit such behaviour at work too? O_o
"....Of course ZFS is not bullet proof, no storage system is 100% safe...." Which is why I like clustering. Now, can ZFS cluster? No.
"....The reason you have never seen bit rot, is because you have dabbled with small data....." Try the three largest single UNIX database instances in Europe. Actually, don't, becasue there is no chance of you working at that level. And that's the big difference - I have experience, you just have marketing bumph.
"....Well, there are many who wants ZFS...." More evasion. I pointed out that not one vendor has dropped expensive Veritas for "free" ZFS and all you do is go off on a tangent. Just admit it and then go stick your head back up McNealy's rectum.
".....Nexenta is selling ZFS servers...." Oooh, a tier 3 storage maker! Impressive - not!
".....Dell is selling ZFS servers...." Dell will sell you an x64 and spread cream cheese on it if you wish. They also sell Linux and Windows Strorage servers, and many, many more than ZFS units. They also do not have an OS of their own and therefore have not dropped Veritas and replaced it with ZFS. Fail.
".....There are more hardware vendors selling ZFS...." More evasion. I asked you for one server vendor that has dropped Veritas for ZFS and the answer is NONE.
"....It seem that your wild false claims have no bearing in reality...." You wouldn't know reality if it kicked you in the ar$e with both feet. You failed AGAIN to answer the point and pretend that naming cheapo, tier 3 storage players is an answer. It's not. Usual fail. Maybe before you do your next (pointless) degree you should do a GCSE in basic English.
Re: I don't get NAS boxes...
"dont forget to wrap some foil around your hard drives"
but I only bought enough foil to make a hat...
No, ZFS can't cluster. This is actually a claim of yours that happens to be correct, for once. Not clustering is a disadvantage, and if you need clustering, then ZFS cannot help you. But you can tack distributed filesystems on top of ZFS, such as Lustre or OpenAFS.
"...More evasion. I pointed out that not one vendor has dropped expensive Veritas for "free" ZFS and all you do is go off on a tangent. Just admit it and then go stick your head back up McNealy's rectum..."
Well, I admit that I don't know anything about your claim. But you are sure on this, I suppose, otherwise you would not be rude. Or maybe you would be rude, even without knowing what you claim?
But there are other examples of companies and organizations switching to ZFS. For instance, CERN. Another heavy filesystem user of large data is IBM. I know that the new IBM supercomputer Sequoia will use Lustre on top of ZFS, instead of ext3, because of ext3's shortcomings:
At 2:50 he says that "fsck" only checks metadata, but never the actual data. But ZFS checks both. And he says that "everything is built around data integrity in ZFS".
If you google a bit, there are many requests from companies migrating from Veritas to ZFS. Here is one company that migrated to ZFS without any problems.
"...Oooh, [Nexenta] a tier 3 storage maker! Impressive - not!..."
Why is this not impressive? Nexenta competes with NetApp and EMC having similar servers, faster but cheaper. Why do you consider NetApp and EMC "not impressive"?
"...More evasion. I asked you for one server vendor that has dropped Veritas for ZFS and the answer is NONE..."
What is your point? ZFS is proprietary and Oracle owns it. Do you mean that IBM or HP or some other vendor must switch from Veritas to ZFS to make you happy? What are you trying to say? I don't know of any vendor, but I have not checked. Have you checked?
"...You failed AGAIN to answer the point and pretend that naming cheapo, tier 3 storage players is an answer. It's not. Usual fail. Maybe before you do your next (pointless) degree you should do a GCSE in basic English..."
I agree that my English could be better, but as I tried to explain to you, English is not my first language. BTW, how many languages do you speak, and at which level?
Speaking of evading questions, can you answer my question? Have you ever noticed random bit flips in RAM which have triggered the ECC error correcting mechanism in RAM? No? So, just because you have never seen it (because you have not checked for it) that means ECC RAM is not necessary? I mean, users of big data, such as Amazon cloud, say that there are random bit flips all the time, in RAM, on disks, etc. Everywhere. But you have never seen any, I understand. I understand you don't trust me when I say that my old VHS cassettes deteriorate because the data begins to rot after a few years. This also happens to disks, of course.
So, I have answered your question on which vendors have seized Oracle proprietary tech: I haven't checked. Probably they don't want to get sued by Oracle.
Can you answer my question? Do you understand the need for ECC RAM in servers?
Re: Re: @Matt
"No, ZFS can't cluster....." FINALLY! One of the Sunshiners has finally admitted a simple problem with ZFS! Quick, call the press! Oh, hold on a sec, it doesn't seem to have stopped him from spewing another couple of terrawads of dribbling.
"....Why is this not impressive? Nexenta competes with NetApp and EMC...." If I stick FreeNAS on an old desktop and hawk it on eBay am I "competing with EMC"?
".....What is your point? ZFS is proprietary and Oracle owns it. Do you mean that IBM or HP or some other vendor, must switch from Veritas to ZFS to make you happy?...." Both hp and IBM are a good case in point. Both pay license fees to Symantec to use their proprietary LVM for their filesystems. If ZFS was so goshdarnwonderful as you say, and "free" to boot, surely hp or IBM would be falling over themselves to use ZFS? They aren't. Indeed, corporate users of SPARC-Slowaris still use Veritas for their filesystems rather than ZFS. There is a reason - ZFS is not as good as you think and there are other options, especially on Linux, that are far superior. So for you to come on here and blindly preach on about ZFS as if it is perfection is just going to get you slapped down by those in the know.
".....Do you understand the need for ECC RAM in servers?" Completely irrellevant to the point in hand. It's like saying "oh, you have house insurance, therefore you must have ZFS!" No, I have house insurance because there is a realistic chance that I will need it, unlike ZFS. There is a demonstratable case for ECC RAM. There is not for ZFS, despite what you claim.
I dont understand your excitement of me confirming that ZFS does not cluster? Everybody knows it, Sun explained ZFS does not cluster, Oracle confirms it, and everybody says so, including me. You know that I always try to back up my claims with credible links to research papers / benchmarks / etc, and there are no links that say ZFS does cluster - because it does not. Therefore I can not claim that ZFS does cluster.
Are you trying to imply that I can not admit that ZFS is not perfect, that is has flaws? Why? I never had any problems looking at benchmarks superior to Sun/Oracle and confirming that, for instance, that POWER7 is the fastest cpu today on some benches. I have written it repeatedly, POWER7 is a very good cpu, one of the best. You know that I have said so, several times. I have confirmed superior IBM benchmarks, without any problems.
Of course ZFS has its flaws, it is not perfect, nor 100% bullet proof. It has its bugs, all complex software has bugs. You can still corrupt data with ZFS, in some weird circumstances. But the thing is, ZFS is built for safety and data integrity. Everything else is secondary. ZFS does checksum calculations on everything, that drags down performance, which means performance is secondary to data integrity. Linux filesystems tend to sacrifice safety to performance. As ext4 creator Ted Ts'o explained, Linux hackers sacrifice safety to performance:
"In the case of reiserfs, Chris Mason submitted a patch 4 years ago to turn on barriers by default, but Hans Reiser vetoed it. Apparently, to Hans, winning the benchmark demolition derby was more important than his user's data. (It's a sad fact that sometimes the desire to win benchmark competition will cause developers to cheat, sometimes at the expense of their users.)...We tried to get the default changed in ext3, but it was overruled by Andrew Morton, on the grounds that it would represent a big performance loss, and he didn't think the corruption happened all that often (!!!!!) --- despite the fact that Chris Mason had developed a python program that would reliably corrupt an ext3 file system if you ran it and then pulled the power plug "
I rely on research and official benchmarks and other credible links when I say something. Scholars and researchers do so. You, OTOH, do not. I have shown you several research papers - and you reject them all. To me, an academic, that is a very strange mindset. How can you reject all the research on the subject? If you do, then you might as well rely on religion and other non verifiable arbitrary stuff, such as Healing, Homeopathy, etc. That is a truly weird charlatan mindset: "No, I believe that data corruption does not occur in big data, I choose to believe so. And I reject all research on the matter". Come on, are you serious? Do you really reject research and rely on religion instead? I am really curious. O_o
So yes, ZFS does not cluster. If you google a bit, you will find old ZFS posts where I explain that one of the drawbacks of ZFS is that it doesn't cluster. It is no secret. I have never seen you admit that Sun/Oracle has some superior tech, or seen you admit that HP tech has flaws? On my last job, people said that HP OpenVMS was superior to Solaris, and some Unix sysadmins said that HP Unix was the most stable Unix, more stable than Solaris. I have no problems citing others when HP/IBM/etc is better than Sun/Oracle. Have you ever admitted that Sun/Oracle did something better than HP? No? Why are you trying to make it look like I cannot admit that ZFS has its flaws? Really strange....
"...If I stick FreeNAS on an old desktop and hawk it on eBay am I "competing with EMC"?..." No, I don't understand this. What are you trying to say? That Nexenta is on par with FreeNAS DIY stuff? In that case, it is understandable that you believe so. But if you study the matter a bit, Nexenta beats EMC and NetApp in many cases, and Nexenta has grown triple digit since its start. It is the fastest growing startup. Ever.
Thus, FreeNAS PC can not compete with EMC, but Nexenta can. And does. Just read the articles or will you reject the facts, again?
"...Both hp and IBM are a good case in point. Both pay license fees to Symantec to use their proprietary LVM for their filesystems. If ZFS was so goshdarnwonderful as you say, and "free" to boot, surely hp or IBM would be falling over themselves to use ZFS? They aren't. ..."
Well, DTrace is another Solaris tech that is also good. IBM has not licensed DTrace, nor has HP. What does that prove? That DTrace sucks? No. Thus, your conclusion is wrong: "If HP and IBM do not license ZFS it must mean that ZFS is not good" - is wrong because HP and IBM have not licensed DTrace either.
IBM AIX has cloned DTrace and calls it Probevue
Linux has cloned DTrace and calls it Systemtap
FreeBSD has ported DTrace
Mac OS X has ported DTrace
QNX has ported DTrace
VMware has cloned DTrace and calls it vProbes (gives credit to DTrace)
NetApp has talked about porting DTrace on several blogs
Look at this list. Neither HP nor IBM has licensed DTrace, does that mean DTrace sucks? No. That is the wrong conclusion. DTrace is the best tool to instrument the system, and everybody wants it. It is best. Same with ZFS.
"...There is a reason - ZFS is not as good as you think and there are other options, especially on Linux, that are far superior..." Fine, care to tell us more about those options that are far superior to ZFS? What would that be? BTRFS, that does not even allow raid-6 yet? Or was it raid-5? Have you read the mail lists on BTRFS? Horrible stories of data corruption all the time. Some Linux hackers even called it "broken by design". Haven't you read this link? Want to see? Just ask me, and I will post it.
So, care to tell us the many Linux options superior to ZFS? A storage expert explains that Linux does not scale I/O wise, and you need to use real Unix: "My advice is that Linux file systems are probably okay in the tens of terabytes, but don't try to do hundreds of terabytes or more."
"...There is a demonstratable case for ECC RAM. There is not for ZFS, despite what you claim...."
Fine, but have you ever noticed ECC firing? Have you ever seen it happen? No? Have you ever seen SILENT corruption? Hint, it is not detectable. Have you seen it?
Have you read experts on big data? I posted several links, from NetApp, Amazon, CERN, researchers, etc. Do you reject all those links that confirm that data corruption is a big problem if you go up in scale? Of course, when you toy with your 12TB hardware raid setups, you will never notice it. Especially as hw-raid is not designed to catch data corruption. Nor does SMART help. Just read the research papers. Or do you reject Amazon, CERN and NetApp and all researchers? What is it you know that they don't know? Why don't you tell NetApp that their big study on 1.5 million hard disks did not see any data corruption at all? They just imagined the data corruption?
"A real life study of 1.5 million HDDs in the NetApp database found that on average 1 in 90 SATA drives will have silent corruption which is not caught by hardware RAID verification process; for a RAID-5 system that works out to one undetected error for every 67 TB of data read"
Are you serious when you reject all this evidence from NetApp, CERN and Amazon, or are you just Trolling?
"I dont understand your excitement of me confirming that ZFS does not cluster?...." Oh, I see - you're not going to deny the problem, just deny it is a problem. If it cannot cluster it cannot be truly redundant, whereas free options for Linux can. Anyone buying or building a home NAS thinking they are getting 100% reliability and data safety/redundancy should think again. Trying to pass off ZFS as the answer to all issues is not going to help these people when their NAS dies and they think "But Kebabfart said ZFS would solve all my problems?"
"....You, OTOH, do not...." What, now you're saying ZFS does cluster? That's the difference - I stated a fact you could not deny, whereas you just presented opinion pieces long on stats and blather but a little short on undisputed facts.
"....IBM has not licensed DTrace, nor has HP. What does that prove?...." That they don't need Dtrace, just like they don't need ZFS, because they have better options.
"....Have you ever seen SILENT corruption? Hint, it is not detectable. Have you seen it?...." Have you ever seen a GHOST? Hint, they are not detectable. Have you seen one? Hey, look - I can make a completely stupid non-argument just like Kebbie's!
"....Do you reject all those links that confirm that data corruption is a big problem if you go up in scale?...." NAS box, four disks. Even in my paranoid RAIDed cluster, only eight disks. Scale?
"....Are you serious..." Well it is hard to take anything you post with any measure of seriousness. FAIL!
What sort of useful test is that?
I'd only buy one of these if it was good for Raid 5 or 6.
I agree, and as someone that looked EVERYWHERE for a nice 5-hot-swappable-external bay enclosure (with a 6th internal for the OS drive), they don't exist.
I ended up going with a Chinese case that had 5 tool-less internal bays, and a lower tray that can hold 2 more. Anyone that makes their own NAS knows you need at least RAID5, and if you have ever lost a RAID5 NAS, you know you really should have had a RAID6 array.
I prefer using software RAID6 (a la Linux mdadm) because it doesn't lock me into an expensive hardware card that has to be replaced by the exact same model in the event of a failure. Given the speed of CPUs (i3) and my demands (at most 4 requests at a time) software-RAID affords me the flexibility of moving my array to any flavor of Linux that supports mdadm, and OSS gives me numerous management and diagnostic tools to build/diag/repair all manner of issues that might pop up (Windows 2003 offered next to zero tools to deal with software RAIDs). The only thing that could cause me to lose data at this point would be to lose 3 of my 5 drives at the same time.
"RAID 5 or 6"
No, no, and thrice no! Parity = bad.
Re: "RAID 5 or 6"
A RAID5 array with 4 drives (1TB + 1TB + 1TB + 1TB = 3TB) and I can lose 1 disk before I lose data.
A RAID6 array with 5 drives (1TB + 1TB + 1TB + 1TB + 1TB = 3TB) and I can lose 2 disks before I lose data.
A RAID10 with 4 drives (1TB + 1TB + 1TB + 1TB = 1TB) and you can lose all but one before you lose data.
RAID10 great for speed and redundancy, and bad on storage space.
RAID5 is ok for speed, bad for redundancy (if you lose 2 at once), and great for space.
RAID6 is ok for speed, great on redundancy, and ok for space.
I guess I should have added. "If you are not rich, and don't have infinite space, use RAID6, otherwise use RAID10." I would guess most people buying a sub $500 NAS don't have an infinite budget.
Re: "RAID 5 or 6"
Some interesting calculations going on there!
Pretty sure 4 x 1TB drives in RAID10 gives 2TB of usable space, not 1TB :)
Also RAID6 is better on redundancy than RAID10 as ANY two disks can fail in RAID 6 (due to distributed parity) however in RAID10 it depends:
Remember that RAID10 is just mirrored arrays (RAID1) inside a striped array (RAID0) if two disks fail from the separate mirrors, no biggie, the array can be rebuilt. If both disks are from the same mirror, you've lost all your data! (How often do two disks fail at the same time for small arrays like this anyway? And how unlucky would you have to be for both of them to be in the same mirrored array?!)
But for the performance you get over either of the other implementations it might be worth the loss in capacity compared with both and in redundancy compared with RAID6 (especially if you add further RAID1 levels within the RAID0 array - 3 RAID1 arrays would mean [almost] 3x the performance of a single disk!).
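For anyone who wants to check the sums in this sub-thread, here is a rough sketch. It assumes equal-sized disks and a RAID10 built from two-way mirrors, with disks 2k and 2k+1 forming mirror pair k; the function names are made up for illustration.

```python
from itertools import combinations

def usable_tb(level, disks, size_tb=1):
    """Usable capacity for equal-sized disks at a given RAID level."""
    if level == "raid5":
        return (disks - 1) * size_tb   # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * size_tb   # two disks' worth of parity
    if level == "raid10":
        return (disks // 2) * size_tb  # every block is stored twice
    raise ValueError(level)

def raid10_survival_two_failures(disks):
    """Fraction of simultaneous two-disk failures a RAID10 survives."""
    # Assumption: disks 2k and 2k+1 form mirror pair k.
    pairs = list(combinations(range(disks), 2))
    fatal = [p for p in pairs if p[0] // 2 == p[1] // 2]
    return 1 - len(fatal) / len(pairs)

print(usable_tb("raid5", 4), usable_tb("raid6", 5), usable_tb("raid10", 4))
print(raid10_survival_two_failures(4))
```

Under those assumptions a 4-disk RAID10 survives two of every three possible double failures, whereas RAID6 survives all of them.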
Any feedback on which units support iSCSI? Also, clarification of whether shared CIFS/NFS mounts are supported would be helpful.
I can't speak for any of the other brands but the QNAP software (I have a TS-410) supports iSCSI, CIFS, NFS and a whole bunch more. It supports dynamic disk expansion so adding more/larger disks doesn't mean you lose access to your data while it does its thing.
As for the whole "why not roll your own" argument, well to be honest, you're paying for the convenience more than anything else. The HP microservers mentioned elsewhere are nice bits of kit, but AFAIK, they don't support hot swapping drives with the stock BIOS (whereas a lot of the NAS units will support hot swapping).
My TS-410 acts as a focal point for our movies (happily feeding multiple Apple TVs running XBMC), stores our photos (which are backed up to S3 and Crashplan) and also acts as a backup destination for our home machines.
It also runs Sickbeard with Sabnzbd and wakes up a hibernating XBMC client via WOL to update the shared mysql media library (also on the QNAP) when something new has arrived.
I spend most of my days solving IT related FUBARs so when I get home, I don't really want to do that all over again. The QNAP is a bit of kit that I can just leave to get on with it knowing that if there is an issue, it will either email me (assuming it can) or I can get some guidance from a helpful user community. The most serious issue I've had with it was when I found it flashing lights on two drives claiming they were degraded/not available (the unit has 4 x 2TB drives running in RAID5). Turned out it was caused by a brief power outage (and the drives were fine after a complete power cycle), following which my next purchase was a UPS to prevent a repeat.
not enough bays
Disks are crap - and large ones can be expected to regularly provide corrupted data which their ECC hasn't picked up (statistically it's about 4 sectors on a 2Tb drive if you read from end to end). 4 drives isn't enough for decent raid levels and raid has "issues" compared with more advanced systems such as ZFS (which is designed from the ground up with the assumption that not only do disks fail, their ECC is flakey, so it detects and CORRECTS such errors).
Seriously, with the amount of stuff that people are piling into their media servers, 20Tb isn't that much anymore and for proper resilience with large drives you need 7 of 'em to ensure good metadata spread.
These external NASes are far too much, compared with simply shoving 4 or more drives into a low spec PC and installing FreeNAS or similar as the OS.
Re: not enough bays
While possibly/probably effective, your solution does not work out of the box and relies largely on self support.
These NASes can be in use for file storage within 10 minutes of opening the box.
And for compactness, a NAS is hard to beat, a Synology 413j Slim gives 4 disk RAID in a box "120 X 105 X 142 mm" which would sit comfortably next to the TV in the lounge or on the desk in the study.
Re: not enough bays
"....with the amount of stuff that people are piling into their media servers, 20Tb isn't that much anymore...." So stick it on the cloud and let someone else look after it on proper arrays, which make ZFS look like the toy software it is. ZFS can't cluster and offers SFA resilience as it can't even work properly with hardware RAID. Seriously, get a high-speed internet connection and leave the media on iTunes, Amazon, StorageMadeEasy, Microsoft or some other cloud where it will be protected and replicated between massive datacenters, and probably at less cost than buying a four-slot NAS every couple of years and backing it up yourself. 99% of the cruft stored on home NAS units could be stored on the cloud with a little thought and planning, even by as simple a method as emailing it to yourself in Hotmail. If you're feeling paranoid then encrypt it before you store it but you will have to accept the penalty of having to decrypt it before you can use it again.
Re: not enough bays
Seriously, you think that a home/small business internet connection can support access to 20TB of data in the cloud?
Re: not enough bays
You really are either a clueless moron or a piss taking enterprise BOFH:
1. Yes people do need lots of space now, especially SMEs; no they won't pay enterprise prices, ever!
2. The cloud is WAY too expensive and slow for 20TB data; and the costs and risks will shock you!
3. The internet is hideously slow even on 80Mbit fibre for this volume of data, and congestion and latency can be horrible compared to a local NAS.
4. Mailing multiples of your mailbox capacity to yourself in Hotmail; you must be on Class A drugs!
5. ZFS is pretty much as good as it gets, and free in FreeNAS; I know I use it a lot!
I won't even discuss the rest, it's completely irrelevant, especially enterprise level stuff like clustering!
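To put a number on point 3, a back-of-envelope sketch. It assumes decimal terabytes and that the full 80Mbit is available in the direction you are transferring, with no protocol overhead; real links will do worse. The `efficiency` parameter is a hypothetical fudge factor, not a measured value.

```python
def transfer_days(size_tb, link_mbit, efficiency=1.0):
    """Days needed to move size_tb terabytes over a link_mbit link."""
    bits = size_tb * 1e12 * 8                        # decimal TB to bits
    seconds = bits / (link_mbit * 1e6 * efficiency)  # usable bits per second
    return seconds / 86400

print(round(transfer_days(20, 80), 1))       # link fully saturated
print(round(transfer_days(20, 80, 0.5), 1))  # assuming half the line rate
```

Even with the line saturated around the clock, 20TB takes a little over three weeks.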
Re: not enough bays
These off-the-shelf NAS may be pretty and small, but they have tiny disk capacity, poor and noisy cooling, cheap PSUs, and they all use unsafe logging filesystems.
My MIDI PC box has two hidden filtered 120mm slow quiet fans to keep 5 hard disks cool (total space for 8 dampened disks), has a cool dual core AMD E-350 Mobo with 8GB RAM, and a very over rated PSU, to host FreeNAS 8.3; it is quiet, attractive and only uses 50W; all at a big saving on an off-the-shelf NAS, and more capable too. I have upgraded the OS several times since the NAS was built; no rebuild required.
My next FreeNAS box will have a lot more capacity and possibly a low power i3, given I realise that although the CPU was not stressed at high load, the I/O bandwidth probably is, so I need to go for a more powerful mobo and CPU.
There is plenty of support and quicker too for FreeNAS, given they have full docs, a forum, and an IRC channel on-line; this easily beats most commercial support, e.g. when an OS upgrade messed up remounting my RAID array, I discussed the issue via IRC, a fix for the issue was rolled into an update release, and I was up and running again within an hour; IMO better than phone support :)
Re: not enough bays
Putting an array together is 5 minutes of work. You Google it once and you are set for the next 5 years or however long your setup manages to meet your requirements.
Just knowing "what buttons to push" on an appliance is going to put you way beyond the skill or comfort level of most people. The shiny happy interface (or lack of one) really isn't the biggest problem here.
4 disks just isn't enough. Not enough bays to handle redundancy or parity and hot spares and such.
Re: not enough bays
What makes you think there are only 4-bay models?
A quick look at the QNAP site shows they have models with 1, 2, 4, 5, 6, 8, 10 and 12 bays. The latter ones run beefier Intel CPUs, not Atoms.
And as for building one yourself: sure, why not, just like you can build your PC yourself. Some people prefer that, others go for a pre-built model. Another advantage of these NAS boxes is they are *very* compact, and most certainly use lower power than anything you build yourself. And no worry about hardware compatibility, the OS that comes with them supports its hardware, something that isn't automatically so for build-your-own boxes.
Re: not enough bays
Intel parts aren't nearly as power hungry as they used to be. Power management is a lot better across the board. So there are fewer and fewer reasons to shell out the cash for an appliance.
...and while there are more "robust" appliances, those are even more ridiculously overpriced than the small ones that are the subject at hand.
PARITY = BAD
Do not use parity. You will regret it.
Re: not enough bays
> Putting an array together is 5 minutes of work.
And then the thick end of a day for it to actually build the array :-)
Re: Re: not enough bays
"You really are either a clueless moron or a piss taking enterprise BOFH...." Well, a bit of the latter really - I work with enterprise kit but have completely different requirements at home. And I do like taking the piss out of morons like you.
"....1. Yes people do need lots of space now, especially SMEs; no they won't pay enterprise prices, ever!..." So they don't. They buy stuff like the Microserver mentioned. If their business grows they move up to the SMB ranges from people like hp or Dell.
".....2. The cloud is WAY too expensive and slow for 20TB data; and the costs and risks will shock you!...." It's called storage tiering, it works for individuals as well as big corporations. Stuff of low importance - back it up to writeable DVD; stuff of high importance - stick it on the cloud. Who said anything about 20TB?
".....3. The internet is hideously slow even on 80Mbit fibre for this volume of data, and congestion and latency can be horrible compared to a local NAS....." Yes, but do you look at every item on your NAS and require it instantly? Most people I know actually treat their home NAS more as an archive - stuff they have finished with gets shifted off their laptop/desktop to be stored on the NAS. If you need constant access then a home fileserver would probably be a better idea than a NAS.
".....4. Mailing multiples of your mailbox capacity to yourself in Hotmail; you must be on Class A drugs!...." Storage tiering - it's an easy way to store important docs, I can send myself encrypted material if I'm worried about MS (or hackers) taking a peek, and I can access them from just about any device with Internet connectivity from anywhere in the World. For example, I keep scans of my passport and other travel docs in an encrypted and compressed file in Hotmail, and it was a lifesaver when my hotel room was burgled in Beirut. I've been doing it roughly since Hotmail was launched. You can also be naughty and run several Hotmail accounts to spread the load and ensure one hacked account doesn't mean you lose everything, just don't call them something obvious like firstname.lastname@example.org, email@example.com..... And Hotmail now comes with free online Office for editing if you're really stuck somewhere with nothing but a smartphone. Try a little thinking outside the box before you start shrieking about drug-use.
"......5. ZFS is pretty much as good as it gets, and free in FreeNAS; I know I use it a lot!..." Ah, I see your rabid and frothing response is not based on any calm and rational thought as much as a Sunshiner desire to defend your Holy ZFS. I can't help it if your love of ZFS makes you blind to better and simpler solutions, and - frankly - I couldn't give a damn if you're too stupid to consider other options. Your loss.
"......I won't even discuss the rest, it's completely irrelevant, especially enterprise level stuff like clustering!" Really? Why not? Because your product can't do it. I can make two cheapo Linux servers and set up clustering between them. I can do the same with Windows. But you can't do it so you refuse to discuss it. True, the average home user won't think of it, they may actually think that buying a NAS means they have resilience and 100% data availability. I work with enterprise kit so I tend to think the more resilience the better, and seeing as I have access to lots of excess kit whenever we hit the three-year refresh cycle, it's pretty easy for me to implement at home. It's like the saying goes, ask a London cabbie what the best family car is and he won't say a BMW or Ford, for him it's a black cab. For you it's obviously a soapbox kart, but that's your problem.
Re: not enough bays
"... ZFS can't cluster and offers SFA resilience as it can't even work properly with hardware RAID..."
Matt, Matt. As I tried to explain to you, hardware raid are not safe. I have shown you links on this. And NetApp research says that too, read my post here to see what NetApp says about hardware raid. There is much research on this. Why don't you check up and read what the researchers in comp sci say on this matter, instead of trusting me?
OTOH, researchers say that ZFS protects against all the errors they tried to provoke, and concluded that ZFS is safe. When they tried to provoke and inject artificial errors in NTFS, EXT, XFS, JFS etc - they all failed their error detection. But ZFS succeeded. There are research papers on this too, they are here (papers numbered 13-18):
And you talk about the cloud. Well, cloud storage typically uses hw-raid which, as we have seen, is unsafe. And the internet connection is not safe either, you need to do an MD5 checksum to see that your copy was transferred correctly. You need to do checksum calculations all the time. Just what ZFS does, but hw-raid does not. Therefore you should trust your home server with ECC and ZFS more than a cloud. Here is what cloud people say:
"...Every couple of weeks I get questions along the lines of “should I checksum application files, given that the disk already has error correction?” or “given that TCP/IP has error correction on every communications packet, why do I need to have application level network error detection?” Another frequent question is “non-ECC mother boards are much cheaper -- do we really need ECC on memory?” The answer is always yes. At scale, error detection and correction at lower levels fails to correct or even detect some problems. Software stacks above introduce errors. Hardware introduces more errors. Firmware introduces errors. Errors creep in everywhere and absolutely nobody and nothing can be trusted...."
Matt, read and learn?
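For the MD5 point, a minimal sketch of checking that a copy arrived intact, using Python's standard hashlib. The helper names are invented for illustration, and MD5 is used only because it is the checksum mentioned above; it catches accidental corruption but is not a defence against deliberate tampering, so pass `algo="sha256"` if that matters.

```python
import hashlib

def file_digest(path, algo="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def copies_match(src, dst, algo="md5"):
    """True if both files hash to the same digest."""
    return file_digest(src, algo) == file_digest(dst, algo)
```

You would run `copies_match("original.iso", "/mnt/nas/original.iso")` after the transfer; both paths here are placeholders.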
Re: Re: not enough bays
"... As I tried to explain to you, hardware raid are not safe..." Usual Kebabfart - lots of blather, lots of evasion, no answers to the point raised. Come on, just admit it, you can't cluster ZFS, it introduces a big SPOF into any design. For hobby NAS, provided you can afford to pay for ridiculous amounts of RAM and CPU, it might be passable, but there are far better solutions that can work with lots less hardware AND can be clustered if required.
Someone forgot to tell you, Sun is dead. Stop trying to flog a dead horse, they won't give you anymore paid-for blogging awards.
As various posters have mentioned, ZFS (available as part of FreeBSD) is the only reliable FS available for free. Trouble is, Sun hasn't open sourced the version with native encryption yet, and alternatives (GELI) are frankly a PITA.
Honestly, setting up and using FreeNAS on some old machine with ZFS is dead easy, gives plenty of early warning when a drive is going dubious. However, you do need 8GB of RAM as practical minimum.
Re: native crypto
You should never use FS level crypto - opt for PV level one instead (only /boot is open, everything else including swap partition is encrypted with passphrase no shorter than 24 characters).
Thecus? Recommended? Misguided Fool!
I was interested in the article up until I got to the bit where Thecus was recommended. I own a Thecus N2200Plus box and it is utterly crap. Most of the features don't work, and the support from Thecus is appalling. The support forums are littered with people's distress stories and I have personal experience of losing data when the RAID array just stopped working for no apparent reason.
ReadyNAS v2 for 280? Really
Just bought one from Amazon to replace my original Infrant NV+ for £145 empty...
Weird NAS selection
Certainly on the QNAP part, as both models are in fact LOW-end models, not high-end as the review says... If they had taken a TS459 or even TS-469 it would have blown the competition away (my TS-269 saturates gigabit (100MB/s) and needs dual lan + beefier switch to deploy its full potential).
Given the selection of models, it is easy for the reviewer to steer the outcome of the article.
Re: Weird NAS selection
It does seem strange that the choice of QNAP appliance wasn't the same level in the range as the Synology one. They are generally more expensive though so maybe that had a bearing, but I agree that the 459 would have been a better choice and achieves over 100MB/s writes. I used to have a tower system running Linux but moved to a QNAP as it can sit in the lounge and is small and quiet in operation. I like the appliance nature. My decision may have been different if the HP server people have now had been available then.
One critical issue in my view is data integrity. That is what a NAS is supposed to do: store data reliably. But the article fails to address that. Do they support internal file systems that have data checksums (like ZFS)?
If not (and important even with ZFS) do they support automatic RAID scrubbing where periodically all of the HDDs are read and checked for errors in the background?
Most folk at home will only have 1 HDD of protection (RAID-1 or RAID-5) and what happens later in life is a HDD fails, you replace it and find bad sectors on the other disk(s), thus corrupting the valuable data. With two HDD of protection (e.g RAID-6 or ZFS' RAID-Z2) you can cope with one error per stripe of data while rebuilding, but that is not always enough.
That is why you want to check once per fortnight/month that the HDD are all clean, and so allow the HDD to internally correct/re-map sectors that had high error rates when read, and if necessary to re-write any uncorrectable ones from the RAID array if that fails.
Of course, sudden HDD failure happens, maybe even multiple HDDs, or PSUs, as does "gross administrative error", which is why you should all repeat "RAID is not a backup" twice after breakfast...
Re: Data integrity?
I think most NAS models support disk checks. My QNAP monitors SMART and can be scheduled to do quick but also extensive disk tests looking for bad blocks.
Sadly no ZFS (yet)
The problem with simply monitoring the SMART status is it won't know about bad sectors until you try to read them. Often by then it is too late.
SMART has support for a surface scan, and while that allows marginal ones to be re-written, it just reports any uncorrectable/re-mappable sectors as bad and you won't generally know about that until a HDD fails and you need to re-build the array.
Hence the advantage of the RAID scrub process:
1) It accesses all of the HDD sectors (or all in-use ones in the case of ZFS), forcing the HDD to read and maybe correct/re-map any that are marginal, just as the SMART surface scan will do.
2) For any that are bad, it, by virtue of being in a RAID system, can then re-write any bad sectors with the data from the other HDD(s) and that will normally 'fix' the bad sector (as the HDD will internally re-map a bad one on write, and you still see it as good due to the joys of logical addressing).
Recent Linux distros like Ubuntu will do a RAID scrub first Sunday of the month if you use the software RAID, which is good. But I don't know of any cheap NAS that pay similar attention to data integrity.
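As a toy illustration of step 2, here is a sketch of a scrub over single-parity (RAID5-style) stripes. It is a simplified model, not how Linux md or ZFS actually implement scrubbing: each stripe is a list with one block per disk, the last block is the XOR parity of the others, and an unreadable sector is modelled as `None`. All names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def scrub(stripes):
    """Rewrite unreadable blocks (None) in place; return the repair count."""
    repaired = 0
    for stripe in stripes:
        bad = [i for i, blk in enumerate(stripe) if blk is None]
        if len(bad) == 1:
            good = [blk for blk in stripe if blk is not None]
            stripe[bad[0]] = xor_blocks(good)  # rebuild from the other disks
            repaired += 1
        elif len(bad) > 1:
            raise IOError("two bad blocks in one stripe: single parity cannot recover")
    return repaired

data = [b"abcd", b"efgh", b"ijkl"]
stripe = data + [xor_blocks(data)]  # three data blocks plus parity
stripe[1] = None                    # simulate an unreadable sector
print(scrub([stripe]), stripe[1])
```

The point of running this regularly is the `elif` branch: a scrub finds the single bad block while the other disks are still healthy, instead of discovering it mid-rebuild when a second fault in the same stripe is fatal.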
Not counting RAID-0, OK?
Re: Data integrity?
Not sure if the QNAPs will ever get ZFS as I believe its memory requirements for good operation exceed what most boxes will have - I believe 1GB per TB of storage is recommended with typically 8GB min. for good performance. My TS-439 has 1GB as do most others.
Don't waste your time with any of these, an N40L with 8GB of ECC RAM and an Intel NIC (N40L built-in does not do jumbo frames) will wipe the floor performance-wise. It has an internal USB slot onto which you install FreeNAS and then you get ZFS.
ZFS + RAIDZ2 + ECC memory - don't trust your precious data to anything less.
For when the world isn't perfect
I use NAS for backups so I like to see some protection against the usual problems.
What happens when a power failure interrupts writes? What happens when the NAS is in redundant mode and a disk fails? Does it send an e-mail, blink an LED that will never be seen, or pretend like nothing is wrong? What happens when a failed drive is replaced? Can bundled drives be replaced under warranty without long downtime? There are plenty of NAS out there that claim RAID 5 protection but are unusable for days when something goes wrong. I recall an old D-Link and a more recent LaCie 5big that needed to be wiped clean and shipped for warranty drive replacement. Even if they had simply sent me a new drive, they would have needed days to rebuild too. I don't like being without backups for days/weeks so I end up buying a different brand of NAS and giving away the old one when it comes back. What a waste of money.
Re: For when the world isn't perfect
QNAPs will email alerts, same goes for Synology I would imagine. As for power interruptions, if you worry about your data enough to be using a RAID equipped NAS then I suggest you spring for an APC UPS that can send notifications via its USB connector that the NAS will act upon (configurable in the GUI). I used to have a UPS on my PC before I bought the NAS to guard against power failures as it seemed only sensible. Array rebuild time will be a function of the processor as it's doing a fair amount of work. 2TB disk replacement caused a rebuild taking hours on a QNAP rather than days. It will also real-time sync to an external backup, send data to Amazon S3, Elephant drive or sync to another remote NAS. Both companies have built-in SSH amongst other things on their appliances.
Re: For when the world isn't perfect
FYI - smallnetbuilder.com is the site to checkout on these matters.
Yes, we get it, ZFS does some neat stuff. Guess what? Most people (myself included) find it easier to just run a regular (in my case, weekly) backup of important data to an external drive connected to the NAS box via USB. (I also make a weekly clone of my computer's drive on the same day.) Job done.
As for why I bought a ready-built NAS appliance: I did so for the same reason I prefer to live in ready-built homes. My time is worth money. I'm worth £300 a day as a technical author. (And that's cheap. Some charge as much as £700 / day.) I'm not a fan of UNIX in any of its flavours, so setting up even a FreeNAS box isn't something I enjoy. I'd spend hours perusing the Web to find out the best practices, the arcane spells that need to be typed into the shell, and so on. On top of which, I'd also have to order all the parts and wait for them to be delivered.
Why the hell would I waste £600 or more of my time (and days of my life) working on a device I can just buy off the shelf for less than half that, and which would be up and running within minutes of my taking it out of the packaging?
Just because YOU enjoy a bit of DIY in your preferred field of expertise, it does not follow that everyone else does too. My background is in software, not hardware. I know how the latter works, and I've built dozens of PCs over the years – mostly for relatives and friends – but it is not something I find particularly rewarding.
I have no more interest in building my own NAS boxes and laptops than I do in building my own home or car. The time required for the DIY approach is not 'free' unless you actually enjoy doing that sort of thing as a hobby. I don't, so, as far as I'm concerned, it's time wasted on doing something boring and irritating instead of time I could be earning doing something fun and rewarding.
Re: @ZFS Fanboys:
Sean, believe me if any of these NAS boxes used ZFS (or BTRFS or the new windows FS) I would buy one at the drop of a hat, but my data is just too important to put to chance. I am glad you backup, but if the data on the disk goes bad, then so do all your backups - the problem is that you don't know that your data is corrupt until it is too late and all your backups have been 'polluted' with bad data.
You can do a FreeNAS setup in about half a day - there are no arcane spells involved at all, a modest investment compared to the immeasurable expense of losing important data or, worse still, not knowing that you have lost important data when your NAS box says 'yep, all hunky dory'.