Re: Still need generators; PUE tricks
UPSs + backup generators + dual power feed/supply for everything is the standard approach if you want to have a battleship-style datacentre that can fight on through disasters - particularly if you are selling space to tenants and have minimal control over their behaviour/system architecture. You sell them a service level, and then you have to maintain it. Their resilience to disasters outside your usual security+power+environment+network obligations is not your problem.
On the other hand, if you are designing a scale-out system that will live across datacentres that you happen to control, you can make all the ducks line up in a different way:
- use multiple sites with diverse power and network feeds
- plan only to ride out short outages at any given site
- have non-redundant power into each rack, and into each server, but diversity in feeds to different racks
- integrate power/network topology + physical placement information into data placement/load balancing algorithms to maintain data redundancy and service availability in the face of failures.
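To make that last point concrete, here is a minimal sketch of placement that respects power topology. It is not any particular system's algorithm: the `Server` record, the `power_feed` label and the greedy `place_replicas` function are all hypothetical names invented for illustration, and it assumes a failure domain is simply a (site, power feed) pair.

```python
from collections import defaultdict
from typing import NamedTuple


class Server(NamedTuple):
    name: str
    site: str
    power_feed: str  # hypothetical label for the feed serving this server's rack


def place_replicas(servers, n_replicas):
    """Greedily pick servers so that no two replicas share a (site, power_feed) domain.

    Sites holding the fewest replicas so far are preferred, so copies spread
    across sites before doubling up on a second feed within one site.
    """
    chosen = []
    used_domains = set()
    site_counts = defaultdict(int)
    candidates = list(servers)

    while len(chosen) < n_replicas:
        eligible = [
            s for s in candidates
            if (s.site, s.power_feed) not in used_domains
        ]
        if not eligible:
            raise ValueError("not enough independent power domains for the replicas")
        # Least-loaded site first; ties broken by name so the result is deterministic.
        best = min(eligible, key=lambda s: (site_counts[s.site], s.name))
        chosen.append(best)
        used_domains.add((best.site, best.power_feed))
        site_counts[best.site] += 1
        candidates.remove(best)

    return chosen


servers = [
    Server("a1", "A", "A-feed1"),
    Server("a2", "A", "A-feed1"),
    Server("a3", "A", "A-feed2"),
    Server("b1", "B", "B-feed1"),
]
replicas = place_replicas(servers, 3)
print([s.name for s in replicas])
```

With that inventory, losing any single feed takes out at most one of the three copies. A real placement layer would also weigh network topology, capacity and load, which this sketch ignores.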
Vertically integrating the hardware, software and hosting of your service means you don't have to pay for double the UPS/generator/power distribution/PSU capacity to achieve service-level redundancy. In this model, most of your servers need maybe a couple of minutes of uptime to ride out small power blips and also let them write out dirty data from RAM. If you treat RAM as nonvolatile, and handle redundant storage at a higher level in your stack, you can use free RAM as write-behind cache and also remove the need for a lot of synchronous filesystem writes, so you get better throughput for write-heavy workloads.
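The "couple of minutes" figure ties directly to how much dirty data you let sit in RAM. A back-of-envelope sketch, with made-up numbers: the function names, the 200 MB/s write rate and the 50% safety margin below are assumptions for illustration, not measurements from any real deployment.

```python
def flush_seconds(dirty_bytes, write_bytes_per_sec):
    """Time needed to write all dirty data out to stable storage."""
    return dirty_bytes / write_bytes_per_sec


def max_dirty_bytes(holdup_seconds, write_bytes_per_sec, safety_factor=0.5):
    """Largest write-behind cache that can be flushed within the battery hold-up time.

    safety_factor is an assumed margin: only that fraction of the hold-up
    time is budgeted for flushing.
    """
    return holdup_seconds * safety_factor * write_bytes_per_sec


# Hypothetical server: 200 MB/s sustained writes, 120 s of battery hold-up.
write_rate = 200 * 10**6
print(flush_seconds(8 * 10**9, write_rate))   # seconds to flush 8 GB of dirty data
print(max_dirty_bytes(120, write_rate))       # bytes of dirty data you can safely allow
```

The point is only that the hold-up time and the sustained write rate together cap the size of the write-behind cache; plug in your own numbers.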
As for the objection that putting the UPS inside the server merely hides the UPS's contribution to the effective PUE: consider that instead of building a big, easily serviceable AC->DC->AC UPS that will keep the fussiest of servers running, you get to look at the PSU schematics and build the simplest AC->DC UPS that will suffice to keep that specific PSU's outputs within spec. That's got to help actual efficiency a bit.