Hotter than anticipated
Microsoft has admitted a dodgy firmware upgrade cooked its servers and knocked its Hotmail and Outlook.com email services offline for 16 hours. In a postmortem examination of the disaster, the Windows 8 giant said a software upgrade for its data centre equipment - an update that had worked successfully in the past - failed …
But not if it's the actual cloud that's been turned to steam...
Yeah, why can't we have good old fashioned Internet services with no data centres and no temperature contro...WAIT A SECOND!
"There was a mix of infrastructure software and human intervention that was needed to bring the core infrastructure back online."
I always get suspicious when the word "infrastructure" is used twice in one sentence. Add the phrase "Cloud based services" and it triggers a jargon alert.
Doesn't that imply that they rolled out the same upgrade to multiple datacenters at the same time, cooking all the hosts in multiple DCs my mailbox is replicated on? At/near the same time?
Unless they're admitting my mailbox is only protected with inter-Datacentre replication.. which is hardly in the spirit of disaster-recovery. Whichever way it falls, both are a bit of a fail...
(Assuming they use replication for protection at all.. can't imagine someone is there changing tapes for all this constantly.....)
Just because there was no obvious service redundancy it does not mean that there are no disaster recovery measures in place.
And manual changes of tapes? Even if they did use tapes it wouldn't be manual.
"These safeguards prevented access to mailboxes housed on these servers and also prevented any other pieces of our infrastructure to automatically failover and allow continued access."
So the "safeguards" prevented the failover components from doing the very job they were built for? That sounds like an absolutely cracking piece of design...
One of the PSUs for a camera on the Hubble space telescope had a fuse.
The fuse was required by aerospace regs to safeguard the instrument and so prevent loss of service. It was pointed out at the design review that manual intervention to reset the fuse probably wasn't an optimal solution.
It's not a failover if, when something fails, the failover doesn't take over.
And what if the datacentre itself had gone boom because of the heating problem? Is that the only place with that data? Is that the only place running the necessary services?
Again, that's NOT failover.
Surely it's the very definition of the perfect failover.
The service failed.
Then the servers fell over.
A big pool of fail - all over the floor
On the other side of the country, fully replicated in as near real time as they can make it, with a site to site fail over procedure that should take place in no more than a few minutes?
If Microsoft wants to compete with Google, they will have to do better.
Nice headline BTW. I didn't think much of the sub heading though, so not exactly a nice pair.
They said there was a manual intervention required.
In this case that was sending somebody down to Best Buy to buy a new Packard Bell, install Windows and hope they could pull the drives out of the original servers.
"They said there was a manual intervention required."
Well, on a major site to site fail over there usually is. With all the re-pointing of various types of virtual IP and network routers, any automatic part would rarely go 100% well, hence the requirement for judicious manual intervention. I suspect Google are probably the best at achieving this level of resilience - they seem to think carrier grade while their competition has an enterprise mentality.
"Sweating like a Russian submariner."
I'll get my coat.
Microsoft tried hard to convince users that you'll get double-team Scroogled by the Gmail Man and the Googlighting Stranger.
Turns out Microsoft scored an own goal. It Microshafted itself.
Imagine Microsoft using all that money for negative ads bashing Google to improve its own products instead.
1 - yes
2 - yes
3 - Oh come on now, stop being so unrealistic!
Just found that the bastards have converted one of my accounts to Mickey-Mouse+ interface. Seems very reluctant to allow you to log off.
How's that "automated server farm" thing working for ya?
As soon as I saw this:
De Haan’s analysis continued in torturously overworked prose to explain that, in non-technical terms, some techies tried turning it off and back on again to fix it.
I knew that his post was nothing more than your typical corporate spoon-fed bullshit, run past the shysters in order to deflect blame and accountability.
I had not even noticed - although my Hotmail inbox is quiet 98% of the time anyway
Mine kept working just fine. I guess maybe this was a regional issue.
And my outlook-hotmail connector has been offline ever since.
I'm running an old version, so the suggested solution is to re-install it.
Or, I guess, to de-install it.
And ... if the same firmware worked before ... why not now ...
Who is the IDIOT that put that release together ...
by the summer. Hotmail is still the world's largest free email service.» According to a blog posted by Google SVP Sundar Pichai on 28 June 2012, «Gmail, which launched in 2004, has evolved from a simple email service to the primary mode of communication for more than 425 million active users globally». As far as I know, 425 million > 360 million; does the Reg disagree with my maths or does it dispute Mr Pichai's figures?