Re: Not entirely accurate
> How about having 2 nodes fail in a Simplivity federation? Also down.
Not a federation. Federations are made up of multiple highly available clusters.
Losing 2+ nodes in a cluster will make only those VMs unavailable whose data lives on the failed nodes and isn't also mirrored to surviving nodes.
> Then a 2 node Federation with Simplivity also has a 50 % storage overhead since everything is mirrored between 2 nodes
Technically each VM is mirrored to two nodes, so in a 3+ node cluster the VMs, and their data copies, end up spread across the whole cluster.
All data, regardless, is compressed. Most customers see at least 1.5:1 compression (i.e. 50% more logical capacity), so mirroring is effectively 'free'. Sure, there will be workloads where data doesn't compress well (e.g. video archives), but any system that erasure codes will still be less efficient than a shared-storage array such as an E-Series or VNX.
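The back-of-envelope arithmetic behind that claim can be sketched as below. The 20TB raw figure and the compression ratios are illustrative assumptions, not vendor specs:

```python
# Effective usable capacity under 2x mirroring plus inline compression.
# Raw capacity and compression ratios are assumed for illustration.

MIRROR_COPIES = 2  # every write lands on two nodes

def effective_tb(raw_tb, compression_ratio, copies=MIRROR_COPIES):
    """Usable logical capacity after mirroring and compression."""
    return raw_tb / copies * compression_ratio

# At 1.5:1 the mirror costs a quarter of raw capacity; at 2:1 it is
# fully 'free' (effective == raw).
print(effective_tb(20.0, 1.5))  # 15.0
print(effective_tb(20.0, 2.0))  # 20.0
```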
> but Springpath could leverage Erasure coding in the future and bring the RF overhead way down, just like Nutanix can.
Absolutely. And SimpliVity could too (instead of replicating to two nodes, split data over multiple nodes). But 'in the future' everything will do everything, so doesn't really matter right now.
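For a sense of why erasure coding brings the overhead down, here's a quick sketch comparing 2-way replication against a hypothetical 4+2 erasure-coding layout (both tolerate two failures; the scheme parameters are assumptions for illustration):

```python
# Capacity overhead: N-way replication vs data+parity erasure coding.

def replication_overhead(copies):
    """Fraction of raw capacity consumed by redundant copies."""
    return 1 - 1 / copies

def ec_overhead(data_strips, parity_strips):
    """Fraction of raw capacity consumed by parity strips."""
    return parity_strips / (data_strips + parity_strips)

print(replication_overhead(2))  # 0.5   -> 50% of raw goes to the mirror
print(ec_overhead(4, 2))        # 0.333 -> 4+2 EC, same 2-failure tolerance
```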
> Then the overhead. Nice that Simplivity only uses 4vcpu's (due to its special dedup card) how about memory overhead?
Typically 2 vCPU; it only hits 4 vCPU under maximum storage load while also replicating, running garbage collection, etc. But regardless, I agree: CPU mostly matters less than memory/storage.
> Simplivity memory overhead is huge and memory is way more important to VM density than CPU
All systems are sized with memory overheads built into the configuration.
Large systems can take up to 1.5TB of RAM, and even on the largest system we're only reserving about 101GB of RAM (and 2 vCPU on average), so that still leaves nearly 1.4TB of RAM free for VMs. Should be enough for most workloads.
BTW - all in-line dedupe systems need RAM: you have to store the hash metadata table somewhere, and if it lived on SSD you'd incur a read I/O for every hash lookup on every (potential) write.
Example: https://wiki.freebsd.org/ZFSTuningGuide#Deduplication - for 20TB of disk (about the same as a large SimpliVity box), you'll need around 100GB of RAM.
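That ~100GB figure falls out of simple arithmetic. A minimal sketch, assuming the ZFS ballpark of roughly 320 bytes of dedup-table metadata per unique block and a 64KB average record size (both assumptions for illustration):

```python
# Rough dedup-table RAM estimate per the ZFS rule of thumb from the
# linked FreeBSD wiki. Entry size and block size are assumed values.

DDT_ENTRY_BYTES = 320        # approx. per-block dedup-table entry (ZFS)
AVG_BLOCK_BYTES = 64 * 1024  # assumed average record size

def ddt_ram_gb(capacity_tb, entry_bytes=DDT_ENTRY_BYTES,
               block_bytes=AVG_BLOCK_BYTES):
    """RAM (GB) needed to keep the dedup hash table fully in memory."""
    blocks = capacity_tb * 1e12 / block_bytes
    return blocks * entry_bytes / 1e9

print(round(ddt_ram_gb(20)))  # 98 -> i.e. 'around 100GB' for 20TB
```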
Other systems may reduce RAM requirements by only deduping a portion of the data. E.g. Nutanix (http://nutanixbible.com/) says 'As of 4.5 this has increased to 24GB due to higher metadata efficiencies.' That might be great for VDI, where you have lots of copies of the same Windows boot disk of 24GB or so, but it's probably less useful for other workloads.
If you can think of a way to eliminate the in-line dedupe metadata RAM requirement while still keeping hash table lookups fast, patent it quickly and start discussions with all the storage vendors, since it'll be quite valuable tech.