Google is developing some sort of back-end technology that automatically - and nearly instantly - redistributes live compute loads when a data center is in danger of overheating. Or maybe this is just talk. Google prefers to at least maintain the illusion of data-center nirvana. During a panel discussion last week at Structure …

COMMENTS

Tuesday 30th June 2009 23:58 GMT Anonymous Coward

Rise of the machines!

Data centres that automatically transfer processes across the globe to avoid local 'failures' - didn't Skynet do that? The end is nigh! :)

Wednesday 1st July 2009 02:48 GMT Mike007

hu?

is this a complex problem that would take a lot of time and money and effort to solve?

modify the new server install procedure to add an extra step of "enter BIOS Setup, configure overheat shutdown settings", then allow the automatic redundancies they have designed for server failures to handle the rest (perhaps with a modified BIOS that has a wake-on-cool setting as well?)

can i haz huge piles of cash now for being smart enough to solve this complex problem?

Wednesday 1st July 2009 08:22 GMT The Original Ash

@Mike007

No.

They want to shift workload away from the datacentre BEFORE it hangs, not after. This is supposed to be pre-emptive avoidance, not post-disaster clear-up.

Now, if you could get a reliable temperature monitor for a CPU linked to every box in the datacentre, link it up to the load balancing system, AND have MapReduce redirect tasks from an overheating (but not failed) datacentre to many others across the globe, and THEN when that failing centre is repaired / back to normal operating temperature redirect tasks back to the datacentre, in REAL TIME, *THEN* u can haz moneez.
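For what it's worth, the "drain before it fails, send work back once it cools" loop described above could be sketched roughly as below. This is only an illustration and not anything Google has described: the `Site` class, the two thresholds and the function names are all made up for the example. Using a lower restore threshold than drain threshold (hysteresis) is one way to stop work flapping back and forth while a site hovers near the limit.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    temp_c: float
    draining: bool = False


# Hypothetical thresholds, chosen only for illustration.
DRAIN_AT_C = 35.0    # start moving work away at or above this temperature
RESTORE_AT_C = 28.0  # only send work back once the site is this cool


def update_site(site: Site) -> None:
    """Flip the draining flag, with hysteresis between the two thresholds."""
    if not site.draining and site.temp_c >= DRAIN_AT_C:
        site.draining = True
    elif site.draining and site.temp_c <= RESTORE_AT_C:
        site.draining = False


def eligible_sites(sites: list[Site]) -> list[str]:
    """Return the names of sites that may currently receive new work."""
    for site in sites:
        update_site(site)
    return [site.name for site in sites if not site.draining]
```

A site that has started draining stays out of rotation while its temperature sits between the two thresholds, and rejoins only after it drops to the restore level.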

Wednesday 1st July 2009 09:39 GMT Anonymous Coward

Been there done that

Look, call me a dinosaur if you will, but the nicer VAXes with VMSclusters in multi-site setups have been doing this kind of multi-datacentre loadbalancing thing since the last ice age, for customers who wanted it. And for customers who want it today, they still can, so long as you're willing to buy Itanium and run VMS (which seems like a small price to pay for the functionality on offer).

I realise most Google architects and El Reg writers and readers weren't born back in that era, and that because VAXes predate the Interwebs they don't actually exist as far as Google architects and many others are concerned, but there's f*** all originality in what they're talking about.

Wednesday 1st July 2009 10:02 GMT Craig 2

@Mike007

That's exactly what I thought within 30 seconds of reading the article. They already have a technology for dealing with outright server failure, so why not have temp monitoring that gives a "virtual" fail and let the existing redundancy pick up the slack?

The devil is probably in the detail however, in providing a graceful fail rather than an actual catastrophic hardware failure that users notice as a glitch in the matrix (e.g. a search takes 5ms longer than normal :-p)

In any case, it doesn't seem such an amazing revolutionary idea, but that's easy for me to say after the fact, and probably why I don't work for Google. :-)

Wednesday 1st July 2009 10:02 GMT Trevor 3

@Mike007

Sorry, I have prior art on this here post-it note that I have post dated. I could post it to you but it'd get stuck on the inside of the envelope.

Or another way (for Windows, or possibly using SNMP traps): write a script that fires every 3 seconds, checks the CPU and mobo temperature, then if it is above a certain number, uses WMI or PowerShell or something to fail over the node.
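A rough sketch of that polling script, in Python rather than WMI or PowerShell. How the temperatures are read and how the node is failed over are deliberately left as caller-supplied functions (`read_temps` and `fail_over` are hypothetical names), since those depend entirely on the platform. The 80 C limit is an arbitrary placeholder; only the 3-second interval comes from the post.

```python
import time
from typing import Callable, Optional


def watch(
    read_temps: Callable[[], dict],    # hypothetical: returns e.g. {"cpu": 61.0, "mobo": 42.0}
    fail_over: Callable[[], None],     # hypothetical: takes this node out of service
    limit_c: float = 80.0,             # placeholder limit, not a recommendation
    interval_s: float = 3.0,           # "fires every 3 seconds"
    max_polls: Optional[int] = None,   # None means poll forever
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the sensors; fail over and return True on the first reading over the limit."""
    polls = 0
    while max_polls is None or polls < max_polls:
        temps = read_temps()
        if any(value > limit_c for value in temps.values()):
            fail_over()
            return True
        polls += 1
        sleep(interval_s)
    return False
```

Passing `sleep` and `max_polls` in as parameters just makes the loop easy to exercise without waiting on a real clock.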

The problem is to cool the server quickly enough if all the nodes are under strain. Not a bad idea though.

Because we'll always have Paris

Wednesday 1st July 2009 10:43 GMT Annihilator

Handy purchase

Could just put a fan on the CPU? Better yet, a temperature controlled fan! Think I have a spare one that I can let them have for a nominal fee.

I know, I know, I'm going.

On a serious note, wouldn't you consider rotating the load in sync with the Earth, always keeping it on the night side? Possibly taking into account seasons if you were feeling flash. Assuming you had a truly global presence of course.
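The "keep the load on the night side" idea can be approximated from longitude alone, as in the sketch below. It assumes local solar time is roughly UTC plus longitude/15 hours and ignores seasons, latitude and weather entirely. The function names are invented for the example, and a site is given simply as a name mapped to its longitude in degrees east.

```python
def local_solar_hour(utc_hour: float, longitude_deg: float) -> float:
    """Approximate local solar time: 15 degrees of longitude per hour."""
    return (utc_hour + longitude_deg / 15.0) % 24.0


def hours_from_midnight(hour: float) -> float:
    """Distance in hours from local midnight, in either direction."""
    return min(hour, 24.0 - hour)


def darkest_site(site_longitudes: dict, utc_hour: float) -> str:
    """Pick the site whose approximate local solar time is closest to midnight."""
    return min(
        site_longitudes,
        key=lambda name: hours_from_midnight(
            local_solar_hour(utc_hour, site_longitudes[name])
        ),
    )
```

With sites spread around the globe, the chosen site moves westward as the UTC hour advances, which is the rotation the post has in mind.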

Wednesday 1st July 2009 10:43 GMT Annihilator

re: Been there done that

Of course data centre load balancing has been done before - they're not talking about that. They're talking about widening the conditions of load balancing. Previously, load balancers tended either to be simple round robin balancers or to act based on the load already being handled by each data centre.

The ones we're talking about here are able to take into account the temperature/power requirements and balance accordingly. It's just enhancing the load balancing algorithms.
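As an illustration of what "enhancing the load balancing algorithms" might look like, here is a minimal sketch in which each data centre's share of new work is its spare capacity scaled by how much thermal headroom it has left. The formula, the temperature bounds and the names are assumptions made for the example, not a description of any real balancer.

```python
def site_weight(
    load: float,
    capacity: float,
    temp_c: float,
    min_temp_c: float = 20.0,  # assumed: at or below this, no thermal penalty
    max_temp_c: float = 35.0,  # assumed: at or above this, the site gets no new work
) -> float:
    """Spare capacity scaled by thermal headroom (1.0 when cool, 0.0 when hot)."""
    spare = max(capacity - load, 0.0)
    headroom = (max_temp_c - temp_c) / (max_temp_c - min_temp_c)
    headroom = min(max(headroom, 0.0), 1.0)
    return spare * headroom


def shares(sites: dict) -> dict:
    """Turn per-site weights into fractions of new work that sum to 1 (or all 0)."""
    weights = {name: site_weight(**stats) for name, stats in sites.items()}
    total = sum(weights.values())
    if total == 0:
        return {name: 0.0 for name in weights}
    return {name: weight / total for name, weight in weights.items()}
```

A plain load-based balancer is the special case where every site's headroom is 1.0, so temperature just becomes one more input to the weighting.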

Wednesday 1st July 2009 12:51 GMT Robert Forsyth

Clock slowdown with temperature rise

A few years back, didn't we have CPUs with clocks that slow as temperature rises?

If used, these processors would process tasks more slowly, and the load balancing would direct jobs away from them.

You just need global load balancing, or do you? Perhaps the load balancing would add a distance cost/weighting and redirect to a neighbour.

With a restartable process like building an index to dynamic web-pages, the requirements are less strict. For applications/services storing user data, then that is a different set of requirements.
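The distance cost/weighting suggested above might look something like the sketch below: each candidate site has an expected processing time (which goes up when its CPUs throttle), and a per-kilometre penalty is added on top so that work spills to a neighbour before it goes to the far side of the world. All names and the `ms_per_km` figure are invented for illustration.

```python
def pick_site(
    origin: str,
    processing_ms: dict,       # site name -> expected processing time in ms
    distances_km: dict,        # (origin, site name) -> distance in km
    ms_per_km: float = 0.01,   # assumed penalty per km, purely illustrative
) -> str:
    """Choose the site with the lowest processing time plus distance penalty."""
    def cost(name: str) -> float:
        return processing_ms[name] + distances_km[(origin, name)] * ms_per_km

    return min(processing_ms, key=cost)
```

With these inputs, a throttled local site loses out to a nearby one, and wins the work back once its processing time recovers.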

Wednesday 1st July 2009 12:53 GMT Dave Gomm

Trending ?

The big enabler for high flexibility is making the workloads very portable, that's accomplished by virtualisation (which these days is relatively easy).

That statement isn't meant to trivialise the scale of the supporting infrastructure, as in order to do this properly I guess you need to have lots of very standardised, very responsive, very high capacity and very integrated technology. This is also quite obvious and, although not easy, it's not 'magic'.

Perhaps the magic ingredient is trending and predictive, as opposed to reactive, balancing of the workloads. This isn't my specialist area, but isn't that the way the electricity companies work?
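To make the "predictive as opposed to reactive" point concrete, a very simple form of trending is to fit a straight line to recent temperature readings and estimate how many sampling intervals remain before a limit is crossed. The sketch below uses an ordinary least-squares slope. It assumes evenly spaced readings and a linear trend, both of which are simplifications, and the function names are made up for the example.

```python
from typing import Optional


def slope_per_sample(readings: list) -> float:
    """Least-squares slope of evenly spaced readings, in degrees per sample."""
    n = len(readings)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def samples_until_limit(readings: list, limit_c: float) -> Optional[float]:
    """Estimated samples until limit_c is reached, or None if not trending up."""
    if not readings:
        return None
    if readings[-1] >= limit_c:
        return 0.0
    slope = slope_per_sample(readings)
    if slope <= 0:
        return None
    return (limit_c - readings[-1]) / slope
```

A balancer could start shifting work once the estimate falls below however long a migration takes, rather than waiting for the limit itself to be hit.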

Friday 3rd July 2009 22:54 GMT Crazy Operations Guy

Data centers 101

They could, you know, not over-subscribe their cooling units?

No need to thank me, my consultancy fee is in the post

Monday 13th July 2009 09:16 GMT SpinMe

Yeah, ok

Am I the only one that smells bullshit?
