@ M Burns and @ PG 1
@ M Burns
RAID inside a single disk enclosure is not such a good idea:
- You can't replace one side of a failed mirror and say "rebuild". You need to replace both sides and then presumably copy the data across.
- A failed controller is common and kills the whole RAID set.
However, perhaps SSDs will change what RAID means, given that an SSD is a collection of many flash chips rather than two or so spindles.
@ PG 1 ("hash functions have collisions")
Yes, if you use a weak hash function. As it happens, though, the OpenSolaris ZFS folks have recently been adding de-dup, and their argument is that with something like SHA-256 the collision probability is something like 1 in 2^88 to 2^100, even assuming 2^64 bits of storage in future and 1MB block sizes. This assumes the hash function has very good cryptographic randomness (which most people accept for SHA-256).
Since this is on a single disk rather than a whole filesystem, it will take even longer before you can buy 2^64 bits on a single disk: that's about 2.3 million TB. That said, this also means that filesystem-level de-dup across multiple disks will probably give better de-dup performance.
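For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. It assumes an ideal hash with uniformly random output, takes "1MB" to mean 2^20 bytes, and uses the standard birthday approximation p ≈ n^2 / 2^(b+1). The function names are mine, and this is not necessarily the calculation behind the range quoted above; the exponent is very sensitive to the assumptions you pick.

```python
def capacity_tb(bits_log2: int) -> float:
    """Capacity of 2**bits_log2 bits, in decimal terabytes (10**12 bytes)."""
    return (2 ** bits_log2) / 8 / 1e12


def log2_collision_probability(n_blocks_log2: int, hash_bits: int) -> int:
    """log2 of the birthday approximation p ~= n^2 / 2^(hash_bits + 1).

    Only meaningful while the result is well below 0 (i.e. p is tiny).
    """
    return 2 * n_blocks_log2 - (hash_bits + 1)


storage_bits_log2 = 64   # assumed pool size: 2^64 bits
block_bytes_log2 = 20    # assumed block size: 1 MB = 2^20 bytes

# 2^64 bits = 2^61 bytes; divide by the block size to get the block count.
n_blocks_log2 = (storage_bits_log2 - 3) - block_bytes_log2

print(f"capacity: about {capacity_tb(storage_bits_log2) / 1e6:.1f} million TB")
print(f"unique blocks: at most 2^{n_blocks_log2}")
print(f"collision probability: about 2^{log2_collision_probability(n_blocks_log2, 256)}")
```

Under these particular assumptions the bound comes out at roughly 2^-175, which is even smaller than the quoted range, so the conclusion holds either way: with a 256-bit hash the collision risk is negligible at any disk size you can buy.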
Now, the Solaris folks have added "bitwise verify" as an option for paranoid people, but I think it's not on by default for SHA-256. You may even get a prize if you find a collision in SHA-256, although clearly one is possible in principle.
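To make the verify idea concrete, here is a toy sketch of a de-dup block store with an optional byte-for-byte check. It is purely illustrative and is not how ZFS implements it: the `DedupStore` class, its `hash_fn` parameter and the chained `(digest, n)` keys are all my own invention for the example.

```python
import hashlib


def sha256_digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class DedupStore:
    """Toy de-dup block store (hypothetical; not the ZFS implementation)."""

    def __init__(self, verify: bool = False, hash_fn=sha256_digest):
        self.verify = verify
        self.hash_fn = hash_fn
        self.blocks = {}  # (digest, slot) -> block bytes

    def put(self, data: bytes):
        """Store a block and return the key it can be read back with."""
        digest = self.hash_fn(data)
        if not self.verify:
            # Trust the hash: an equal digest is treated as an equal block.
            key = (digest, 0)
            self.blocks.setdefault(key, data)
            return key
        # Verify: compare the actual bytes before sharing a block.
        slot = 0
        while True:
            key = (digest, slot)
            existing = self.blocks.get(key)
            if existing is None:
                self.blocks[key] = data
                return key
            if existing == data:
                return key
            slot += 1  # same digest, different bytes: try the next slot

    def get(self, key) -> bytes:
        return self.blocks[key]


# A deliberately weak "hash" (first byte only) to force a collision.
weak = lambda data: data[:1]

trusting = DedupStore(verify=False, hash_fn=weak)
k = trusting.put(b"abc")
k2 = trusting.put(b"axe")
print(trusting.get(k2))   # the wrong block comes back: silent corruption

paranoid = DedupStore(verify=True, hash_fn=weak)
k = paranoid.put(b"abc")
k2 = paranoid.put(b"axe")
print(paranoid.get(k2))   # the right block comes back
```

The trade-off is the extra read and compare on every duplicate write, which is presumably why you would leave it off when the hash is as strong as SHA-256.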