Re: Seriously? Did he really say that? With a straight face?
Ok... because there are bad implementations of dedupe out there (lots of them... NetApp being among the worst I've seen), there will always be comments like this.
Let's talk a little about block storage. There are many different levels of lookup for blocks in a storage subsystem. If you look at a traditional VMware case, there are at least 6 translations, possibly up to 20 for each block access across a network. Adding FibreChannel in between aggravates the issue quite badly. It adds a lot of latency based on its 1978-era design (this is not an exaggeration; the SCSI protocol is from 1978). There are many more problems which come into play as well.
Every block-oriented storage system which supports any form of resiliency through replication (and resiliency is not optional anymore) has to perform hashing on every single block received. Those hashes must be stored in a database for data protection. For 512-4096 byte blocks, chances are a CRC-32 is suitable for data protection, and for deduplication with a "lazy write cache" it is also suitable. However, in the case of NetApp, for example, which is severely broken by design, everything is immediate and there's no special storage for lazy or scheduled dedup.
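To make that concrete, here's a rough Python sketch of that kind of per-block hash index. Names, structure, and the block size are illustrative (not any vendor's actual code), and CRC-32 collisions are fine here precisely because duplicates get verified byte-for-byte later:

```python
import zlib

BLOCK_SIZE = 4096  # illustrative; the text above covers 512-4096 byte blocks

def block_hash(block: bytes) -> int:
    """CRC-32 over one block; cheap enough to run on every single write."""
    return zlib.crc32(block)

class HashIndex:
    """Maps a CRC-32 value to the physical addresses of blocks with that hash.
    Collisions are expected and harmless: the off-peak pass compares bytes 1:1."""
    def __init__(self):
        self.by_hash = {}

    def record(self, phys_addr: int, block: bytes) -> None:
        self.by_hash.setdefault(block_hash(block), []).append(phys_addr)

    def candidates(self, block: bytes) -> list:
        return self.by_hash.get(block_hash(block), [])
```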
In a proper dedup system, a write to a block with two or more references (even if the hash matches) decrements the reference count, and a new block with a single reference is written to high-performance storage (NVMe, for example). If there was only one reference, the block is altered in place and the hash is updated.
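A minimal sketch of that write path, building on the HashIndex above (again hypothetical; "lmap" is the logical-to-physical map, "refs" the per-physical-block reference counts):

```python
class BlockStore:
    """Write rule from above: never rewrite a shared block in place;
    a sole-owner block IS altered in place and its hash refreshed."""
    def __init__(self, index):
        self.index = index      # HashIndex from the earlier sketch
        self.phys = {}          # physical address -> block bytes
        self.refs = {}          # physical address -> reference count
        self.lmap = {}          # logical block address (LBA) -> physical address
        self.next_phys = 0

    def _alloc(self, data: bytes) -> int:
        p, self.next_phys = self.next_phys, self.next_phys + 1
        self.phys[p] = data
        self.refs[p] = 1
        self.index.record(p, data)
        return p

    def write(self, lba: int, data: bytes) -> None:
        old = self.lmap.get(lba)
        if old is not None and self.refs[old] >= 2:
            # Shared block: drop one reference, write a fresh single-reference
            # copy to the fast tier (the NVMe case in the text).
            self.refs[old] -= 1
            self.lmap[lba] = self._alloc(data)
        elif old is not None:
            # Sole owner: alter in place and update the hash index.
            self.phys[old] = data
            self.index.record(old, data)  # a real system would also drop the stale entry
        else:
            self.lmap[lba] = self._alloc(data)
```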
Then dedup runs "off-peak," meaning (for example) that if the CPU is under 70% load, the new blocks stored on disk are compared 1:1 with other blocks with matching hashes, and references are updated so that only a single copy of the data itself is maintained. In addition, during this phase it is possible to lazily compress blocks which are going stale and migrate them to cold storage (even off-site) or, heaven forbid, FC SAN storage.
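The off-peak pass itself might look something like this (the 70% figure and the crude load check are illustrative; stale index entries from in-place updates are harmless because nothing merges without a byte-for-byte match):

```python
import os

CPU_BUSY_THRESHOLD = 0.70  # the "under 70%" rule from the text

def cpu_is_idle_enough() -> bool:
    # Crude stand-in: 1-minute load average normalized by core count
    # (os.getloadavg is POSIX-only; a real system samples CPU properly).
    return (os.getloadavg()[0] / os.cpu_count()) < CPU_BUSY_THRESHOLD

def dedup_pass(bs) -> None:
    """Within each CRC-32 bucket, compare candidates byte-for-byte and
    remap all logical references onto one canonical physical copy."""
    if not cpu_is_idle_enough():
        return
    for addrs in bs.index.by_hash.values():
        # dict.fromkeys dedupes repeated index entries while keeping order
        live = [a for a in dict.fromkeys(addrs) if bs.refs.get(a, 0) > 0]
        for i, canon in enumerate(live):
            if bs.refs.get(canon, 0) == 0:
                continue  # already merged into an earlier canonical block
            for dup in live[i + 1:]:
                if bs.refs.get(dup, 0) == 0:
                    continue
                if bs.phys[dup] == bs.phys[canon]:   # the 1:1 compare
                    for lba, p in bs.lmap.items():   # remap references
                        if p == dup:
                            bs.lmap[lba] = canon
                            bs.refs[canon] += 1
                    bs.refs[dup] = 0
                    del bs.phys[dup]                 # space reclaimed
```

This is also the natural point to compress cold blocks or push them to a slower tier, since the pass already has the blocks in hand and runs when nobody is waiting on the result.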
Dedup should have absolutely ZERO impact on performance when implemented by engineers who actually have half a brain.
The disadvantage of the system described above is that dedup won't be sexy at trade shows, since it might take minutes, hours, or more to see the space savings from the dedup operation.
As for databases, if you're running mainstream SAN (EMC, Hitachi, 3Par, NetApp), you're absolutely right. You should avoid dedup as much as possible. None of those companies currently employ the "real brains behind their storage" anymore, and they haven't had decent algorithm designers on staff in years. They take a system which works and layer shit upon shit upon shit on top of it to keep selling it. There will be problems using any GOOD storage technologies on those systems.
For databases and most modern instances, you should move away from block-storage-oriented systems and focus instead on file servers with proper networking involved. In this case, I would recommend a Gluster cluster (even if you have to run it as VMs) with pNFS, or Hyper-V with Windows Storage Spaces Direct. These days, most of the problems with latency and performance come from forcing too many translations between the guest VM and the physical disk. There's also the disgusting SCSI command-queuing illness, which reorders read and write operations impressively stupidly, since NCQ at each point it's processed has no idea what the block structure of the actual disk is. pNFS and SMBv3 are far better suited for modern VM storage than FC and iSCSI can ever be.
That said, there are some scale-out iSCSI solutions which aren't absolutely awful. But scale-out is technically impossible to achieve over FC or NVMe.
P.S. - Dedup in my experience (I write file systems and design hard drive controllers for personal entertainment) shows consistently higher performance and lower latency than the alternative because of the simplicity involved in caching.
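To illustrate why: key the read cache by physical (deduplicated) block rather than by logical address, and every logical block sharing a physical copy shares one cache entry, so effective cache capacity multiplies with the dedup ratio. A hypothetical LRU sketch, building on the BlockStore above:

```python
from collections import OrderedDict

class DedupAwareCache:
    """Read cache keyed by physical address: N logical blocks that
    dedup to one physical block consume ONE cache slot between them."""
    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self.entries = OrderedDict()          # physical address -> bytes, LRU order

    def read(self, bs, lba: int) -> bytes:
        p = bs.lmap[lba]
        if p in self.entries:
            self.entries.move_to_end(p)       # hit, possibly warmed by a different LBA
            return self.entries[p]
        data = bs.phys[p]                     # miss: fetch from the backing store
        self.entries[p] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data
```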
P.P.S. - I've been experimenting with technology which is better than dedup: it instruments guest VMs with a block cache that eliminates all zero-block reads and writes at the guest. It improves storage performance more than most other methods... sadly, VMware closes their APIs for storage development, so I have to depend on VMware thin volumes or FC in-between to implement that technology.
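The idea, roughly (a hypothetical sketch, not any actual VMware API; a production version would persist the zero map and use something faster than a plain equality compare):

```python
BLOCK_SIZE = 4096
ZERO_BLOCK = bytes(BLOCK_SIZE)

class ZeroBlockFilter:
    """Guest-side filter: all-zero writes are recorded in a set instead of
    being issued, and reads of those LBAs never touch storage at all."""
    def __init__(self, backing):
        self.backing = backing   # anything with write(lba, data) / read(lba)
        self.zeroed = set()      # LBAs currently known to be all zeros

    def write(self, lba: int, data: bytes) -> None:
        if data == ZERO_BLOCK:
            self.zeroed.add(lba)             # zero write eliminated: no I/O
            return
        self.zeroed.discard(lba)
        self.backing.write(lba, data)

    def read(self, lba: int) -> bytes:
        if lba in self.zeroed:
            return ZERO_BLOCK                # zero read eliminated: no I/O
        return self.backing.read(lba)
```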
P.P.P.S. - I simply don't see this company doing anything special other than trying to coin a new buzzword for something that's nothing new. Implementing code in the KVM kernel is the same as Microsoft implementing SMB3 in Hyper-V; it's just old hat.