It's so costly when you're the little guy...
Over-provisioning in smaller organisations is not only possible, but a necessity. When working with virtualisation, you still need spare hardware in case a node goes down.
Let me run you through an example, using the infrastructure of the company I work for. We have two types of network: one at each of our production sites, and one at our head office. Head office contains all the usual centralisable things. The production sites contain servers and services that simply can’t be centralised. (No surprises here.)
Our head office manages to cram all of its services into X physical servers. (Somewhere around X*25 VMs on those X servers, but only X physical boxen are active at a time.) For this, we keep a “cold spare” copy of our fastest VM server sitting around on the rack. If something goes boom, we move the VMs over to that server. We also have over-provisioned space on the existing servers so that if a second server should fail, we could absorb the hit by spreading the VMs of the second failed server across the cluster.
Our production sites can fit everything they need onto Y physical servers. To take advantage of a little extra performance that we don’t strictly *require,* but is nice to have, we spread the load to Y*2 physical boxes. Like the head office, we keep a spare around. Again, the spare swaps in for the first failure, and in a pinch we can collapse our sites from Y*2 active physical servers into Y.
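For anyone who wants to sanity-check a similar setup, here is a rough sketch of the capacity sums behind "spare first, then squeeze up". Every number in it is made up for illustration (our real X and Y stay out of this), the function name is hypothetical, and it assumes every host, cold spare included, can hold the same number of VMs. Real sizing would look at RAM, CPU and disk separately.

```python
def can_absorb(active_hosts, vms_per_host, capacity_per_host,
               failures, cold_spares=1):
    """Return True if every VM still fits after `failures` hosts die.

    Assumption: all hosts (including cold spares) have the same
    VM capacity, and capacity is a simple VM count.
    """
    total_vms = active_hosts * vms_per_host
    # Each cold spare swaps in for one failed host; any failure beyond
    # the number of spares shrinks the cluster.
    replaced = min(failures, cold_spares)
    surviving = active_hosts - failures + replaced
    if surviving <= 0:
        return False
    return total_vms <= surviving * capacity_per_host


# Hypothetical head office: 4 hosts, 25 VMs each, room for 34 per host.
# Two failures with one cold spare leaves 3 hosts for 100 VMs.
print(can_absorb(4, 25, 34, failures=2))

# Hypothetical production site: load spread over 6 hosts (Y*2) at half
# capacity, collapsing onto 3 (Y) with no spare left to swap in.
print(can_absorb(6, 10, 20, failures=3, cold_spares=0))
```

If the second call comes back False for your own numbers, the "collapse from Y*2 to Y" fallback isn't really there, and the extra boxes are load you depend on rather than headroom.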
Is it ideal? No. But we can’t really afford to be hosting somewhere in the realm of 25 times the number of physical systems in our various micro-datacenters either. So virtualisation was the only option for us. We do not have “big boy” budget. Everything we do is whiteboxed, and we are running ESXi (for lack of funding to purchase VMware’s management tools). When we deployed our VM infrastructure, buying SANs of adequate speed (10Gb iSCSI or Fibre Channel) for head office and all production sites was simply not an option. (Thus we use local storage on the physical server nodes.) This means that yes, if a server goes boom we have to move everything by hand. As a small business this means up to 4 hours of downtime (worst case), with an average of 45 minutes, any time a physical server vomits up a stick of RAM or drops a disk.
Counting both spares and active systems, we have around $N worth of virtualisation server hardware. Rough math says that if we wanted to get all the VMware management software to run our gear, we’d be asked to give VMware north of ($N + 30K). (That doesn’t mean $30K in addition to the cost of the hardware. It means the VMware management software would be $30K more than the cost of the hardware it would be managing!)
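To make that parenthetical concrete, here is the arithmetic as a tiny sketch. The hardware figure is a placeholder, not our actual $N, and the function name is hypothetical. The only number taken from above is the $30K premium.

```python
def licence_vs_hardware(hardware_cost, premium=30_000):
    """Licence estimate when the software costs `premium` more than
    the hardware it manages (the "$N + 30K" figure above)."""
    licence = hardware_cost + premium
    ratio = licence / hardware_cost   # licence as a multiple of hardware
    total = hardware_cost + licence   # what you'd pay for both together
    return licence, ratio, total


# Placeholder hardware spend of $50,000 (illustrative only).
licence, ratio, total = licence_vs_hardware(50_000)
print(f"licence ${licence:,}, {ratio:.1f}x hardware, ${total:,} all in")
```

The smaller the hardware budget, the worse that ratio gets, which is the whole problem for the little guy.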
45 minutes of downtime every now and again, plus the cost of a few spare chunks of hardware and a little over-provisioning, are things we can live with. The cost of the software to “elegantly” solve virtualisation provisioning issues isn’t.