@A.C
So you did a filesystem check of 230 TB in 90 minutes? Then you were checking about 2.55 TB/minute, which is roughly 42.5 GB/sec.
Say that a disk checks 50 MB/sec in practice. Then you need about 850 disks to reach 42.5 GB/sec.
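For anyone who wants to check that arithmetic, here is a quick sketch in Python. It assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB) and the 50 MB/sec per-disk figure, which is only my guess at a practical rate. The function names are made up for illustration.

```python
import math

def required_throughput_gb_per_sec(total_tb, minutes):
    """Aggregate read rate (GB/sec) needed to scan total_tb in the given minutes."""
    return total_tb * 1000 / (minutes * 60)

def disks_needed(throughput_gb_per_sec, per_disk_mb_per_sec):
    """Number of disks needed if each sustains per_disk_mb_per_sec, rounded up."""
    return math.ceil(throughput_gb_per_sec * 1000 / per_disk_mb_per_sec)

rate = required_throughput_gb_per_sec(230, 90)
print(f"{rate:.1f} GB/sec")
print(disks_needed(rate, 50), "disks")
```

Without rounding the intermediate steps this comes out at about 42.6 GB/sec and 852 disks, so the round figure of 850 holds up.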
Did you really have 850 disks in racks? How many racks of disks did you have? Holy moly! Were entire rooms full of disks? How many rooms?
According to any enterprise SAS disk spec sheet, such a disk encounters 1 unrecoverable error for every 10^16 bits read. So if you read enough bits, you will hit unrecoverable bit errors: bit rot and the like. With 850 disks you will see a lot of bit rot and randomly flipped bits. That is why you use ECC RAM: bits flip at random in RAM. The same thing happens on disks. And guess what: sometimes such errors are not even detectable. Hardware RAID can neither detect nor repair such errors. The more disks you have, the more bit rot there will be, and the more you need to protect against it.
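To put a rough number on "enough bits", here is a small sketch. It assumes the 1-in-10^16 spec-sheet rate, treats errors as independent (a Poisson model, which is my simplification), and counts full reads of the 230 TB from above. It only covers the spec-sheet rate, not silent corruption from other sources, and the function names are hypothetical.

```python
import math

def expected_unrecoverable_errors(bytes_read, bits_per_error=1e16):
    """Expected error count when reading bytes_read at 1 error per bits_per_error bits."""
    return bytes_read * 8 / bits_per_error

def prob_at_least_one(expected):
    """Chance of one or more errors, assuming independent (Poisson) errors."""
    return 1 - math.exp(-expected)

per_pass = expected_unrecoverable_errors(230e12)      # one full read of 230 TB
ten_passes = expected_unrecoverable_errors(10 * 230e12)
print(per_pass, prob_at_least_one(per_pass))
print(ten_passes, prob_at_least_one(ten_passes))
```

Under those assumptions a single full pass expects about 0.18 unrecoverable errors (roughly a 17% chance of at least one), and ten passes expect about 1.8 (roughly 84%). Read the array often enough and you will hit one.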
