Yahoo! Japan, the partnership between American search engine and internet media giant Yahoo! and Japanese telecom and internet service provider Softbank, have let server maker Fujitsu brag a little bit about some modified servers it has cooked up for the company. And we mean a little bit. The details are a bit sketchy and …
Still need generators; PUE tricks
Must be awfully power-dense batteries in those power supplies to run servers for several DAYS of power outage, which is what generators are usually good for. Batteries in power supplies allow you to avoid external UPSes, but you still need generators.
Essentially, moving the UPS inside the server allows you to improve PUE in a tricky way. The power loss through the external UPS goes away. A similar loss still exists in keeping the batteries inside each server charged, but it is hidden if you measure PUE at the server power cord, because it now counts as part of the IT load. Nice trick!
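To put rough numbers on that trick, here is a minimal sketch. PUE is total facility power divided by IT equipment power; every figure below (the IT load, the cooling overhead, the UPS loss) is an illustrative assumption, not anything from the article or from Fujitsu.

```python
def pue(facility_kw, it_kw):
    """Power usage effectiveness: total facility power / IT equipment power."""
    return facility_kw / it_kw

# All numbers are assumptions picked for illustration only.
it_load_kw = 1000.0   # power the servers actually use for computing
cooling_kw = 300.0    # cooling and other facility overhead
ups_loss_kw = 50.0    # power lost keeping batteries charged and converting

# Central UPS: its loss sits upstream of the server power cord,
# so it is counted as facility overhead.
central = pue(it_load_kw + cooling_kw + ups_loss_kw, it_load_kw)

# Battery inside the server: the same loss now sits behind the power cord,
# so it is counted as IT load, even though the wall draw is unchanged.
in_server = pue(it_load_kw + cooling_kw + ups_loss_kw, it_load_kw + ups_loss_kw)

print(f"central UPS PUE:   {central:.3f}")
print(f"in-server UPS PUE: {in_server:.3f}")
```

The total draw from the utility is identical in both cases; only the denominator moves. In practice the in-server loss would probably not be the same size as the central UPS loss, so treat this as showing the accounting effect only.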
Re: Still need generators; PUE tricks
Depending on what the facility is doing you could sometimes get away w/o generators and just shut down (data/apps would of course have to be transparently available at another facility, something I'd wager Yahoo has been doing for some time now). Or perhaps a compromise: some subset of systems operating in this mode, while other, more mission-critical stuff is backed by real batteries and generators.
One interesting stat I heard a few years ago at a Datacenter Dynamics conference: a speaker from the "Electric Power Research Institute" said something like 90% of power events last less than 2 seconds (it may have been higher than 90%; I didn't take notes at the time, though I did blog about it, so I'm going by that post as my memory in this case).
Obviously not suitable for more traditional shops.. since this level of application/data center availability is still very rare outside of the biggest global internet brands.
Re: Still need generators; PUE tricks
UPSs + backup generators + dual power feed/supply for everything is the standard approach if you want to have a battleship-style datacentre that can fight on through disasters - particularly if you are selling space to tenants and have minimal control over their behaviour/system architecture. You sell them a service level, and then you have to maintain it. Their resilience to disasters outside your usual security+power+environment+network obligations is not your problem.
On the other hand, if you are designing a scale-out system that will live across datacentres that you happen to control, you can make all the ducks line up in a different way:
- use multiple sites with diverse power and network feeds
- plan only to ride out short outages at any given site
- have non-redundant power into each rack, and into each server, but diversity in feeds to different racks
- integrate power/network topology + physical placement information into data placement/load balancing algorithms to maintain data redundancy and service availability in the face of failures.
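The last point in that list can be sketched roughly as follows. This is only a toy illustration of power-aware placement, assuming each server is tagged with the power feed it hangs off; the `Server` type, the `place_replicas` function, and the site and feed names are all hypothetical, not anything Yahoo! is known to run.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    site: str         # hypothetical site label
    power_feed: str   # hypothetical label for the feed supplying this server's rack

def place_replicas(servers, copies):
    """Greedily pick servers so that no two replicas share a power feed."""
    chosen = []
    used_feeds = set()
    for server in servers:
        if server.power_feed in used_feeds:
            continue  # losing this feed would take out two copies at once
        chosen.append(server)
        used_feeds.add(server.power_feed)
        if len(chosen) == copies:
            return chosen
    raise ValueError("not enough independent power feeds for the requested copies")

servers = [
    Server("a1", "tokyo", "feed-A"),
    Server("a2", "tokyo", "feed-A"),
    Server("b1", "tokyo", "feed-B"),
    Server("c1", "osaka", "feed-C"),
]
picked = place_replicas(servers, 2)
print([s.name for s in picked])  # ['a1', 'b1']
```

A real placement layer would also weigh site, network path and free capacity, but the idea is the same: a single non-redundant feed failing should never take out every copy of anything.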
Vertically integrating the hardware, software and hosting of your service means you don't have to pay for double the UPS/generator/power distribution/PSU capacity to achieve service-level redundancy. In this model, most of your servers need maybe a couple of minutes of uptime to ride out small power blips and also let them write out dirty data from RAM. If you treat RAM as nonvolatile, and handle redundant storage at a higher level in your stack, you can use free RAM as write-behind cache and also remove the need for a lot of synchronous filesystem writes, so you get better throughput for write-heavy workloads.
As for the claim that putting the UPS inside the server just hides the UPS's share of the PUE: consider that instead of building a big, easily serviceable AC->DC->AC UPS that will keep the fussiest of servers running, you get to look at the PSU schematics and build the simplest AC->DC UPS that will suffice to keep that specific PSU's outputs within spec. That's got to help a bit.
What this does mean is that instead of large battery rooms etc., the onboard battery only has to cover the power event until the generators kick in. Everyone knows that data centre space is at a premium, so if each server has a small UPS built in that lasts, say, 15 minutes, then the main battery room is only needed for switch gear and other things and can be greatly reduced, perhaps by 50%.
Just hope the batteries in these servers are hot swappable, and that there are 2 inside, so that if you have to swap a battery and the power goes out, there is still 1 there to sustain the server for a bit.
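For a feel of how small that onboard battery can be, here is a back-of-the-envelope sketch: energy is just load times duration. The 15 minutes comes from the comment above; the 300 W server draw and the function name are assumptions for illustration.

```python
def ride_through_wh(load_watts, minutes, discharge_efficiency=1.0):
    """Battery energy (watt-hours) needed to carry a load for a given time."""
    return load_watts * (minutes / 60.0) / discharge_efficiency

# Assumed server draw of 300 W, ridden through for 15 minutes.
print(ride_through_wh(300, 15))  # 75.0
```

Real packs would need headroom for ageing and conversion losses, which is what the `discharge_efficiency` parameter is a crude stand-in for.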
I'd guess that they handle this in other ways - make another copy of the data on the server, and/or take it out of the load balancing pool before swapping the battery. Paying for redundant hardware on every node to reduce a rare failure mode is the kind of thing the huge scale companies are trying to avoid where possible.
Re: data centre space is at a premium
If data centre space is at a premium, the space probably shouldn't be used as a data centre...
Data centres on the scale of those used by Yahoo are always limited by power - either by the amount of power you can deliver to a rack or the amount of cooling you can deliver. You can almost always use more power to deliver more cooling to a rack.
Regarding redundancy, I assume these servers are similar in function to those used by Google, Amazon and Facebook - if something goes bang, another unit takes over the work package and the unit is replaced. While they will be swappable, they probably don't even need to be hot swappable.
Unless I am wrong, the story didn't say there weren't generators in the equation, just that such a configuration would allow them to avoid them. Of course, with a geographically distributed load (as I am sure Yahoo! has) extended uptime during power outages is less relevant so you probably could do without them.
Even if generators were still used, surely there is a worthwhile saving anyway?
I mean, removing generators from the equation, you have AC > DC > AC > DC - with the UPS converting incoming AC to DC for the batteries, then outputting AC, which gets fed into the PSUs, which then convert it back to DC.
With a directly-attached battery solution, you just have AC > DC - the line power goes into the power supply, is converted to DC which feeds the batteries which in turn feed the servers. (Or in parallel - whatever.)
Over the scope of an entire, Yahoo!-sized data centre, that's likely to be a significant saving.
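A rough way to see the size of that saving is to multiply the efficiency of each conversion stage along the chain. The per-stage figures below are assumptions picked for illustration, not measurements of any real UPS or PSU.

```python
import math

def chain_efficiency(stage_efficiencies):
    """Overall efficiency of conversion stages connected in series."""
    return math.prod(stage_efficiencies)

# Assumed per-stage efficiencies, for illustration only.
ups_rectifier = 0.96   # AC > DC inside the UPS
ups_inverter = 0.96    # DC > AC out of the UPS
server_psu = 0.90      # AC > DC inside the server PSU

double_conversion = chain_efficiency([ups_rectifier, ups_inverter, server_psu])
direct = chain_efficiency([server_psu])

print(f"AC > DC > AC > DC: {double_conversion:.1%}")
print(f"AC > DC:           {direct:.1%}")
```

With these assumed figures the external UPS stages cost several percentage points end to end, which is the kind of difference that adds up across a whole data centre.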
RE: Yahoo dropping UPS systems
It seems that you, and Jason Ozolins above "get it".
By removing one set of AC -> DC conversions in the power chain, you can improve the overall efficiency; and in a large data center, that can add up. If you use server designs that can take 12 volts DC, and, ON THE BOARD create the necessary 5 volt and 3.3 (or lower) voltages for the ICs, you reduce PSU complexity (essentially an integrated PSU). All in one neat package.
Batteries are heavy and dangerous. They should NOT be near your server... let alone INSIDE it.
Also, batteries need checks, maintenance and replacement..
I'm all for a "compact" datacenter, but I don't think this is a good idea.. can anyone with more datacenter experience say something about these issues?
They are using NiMH batteries - these are pretty safe and long lasting - certainly compared to most li-ion cells (although there are lithium ion variants that are considered pretty safe - LiFePO4 for example) - but NiMH is tried and tested and cheap...