See what happens
When automatic updates are left enabled?
Four days have passed since a "procedural operations error" downed Azure SQL Reporting in Microsoft's East US data center, and Redmond is still trying to restore customer data. After saying on Thursday that full restoration from Monday's fail would have occurred by Friday, the recovery date has slipped again, according to …
Is that the (IMO: self-proclaimed) "experts" on El Reg's chat this afternoon (regarding Win Server 2012) were quite specific when it came to Azure and its development model. I quote:
"Alun Rogers: Maybe this is more of a joining up with Azure release schedules? They iterate that like crazy"
And also:
"MJF: I think this move helps put MSFT's "Cloud OS" campaign into more context. The idea is Win Server should be the best OS for building/supporting cloud services. If that's true, it needs to be evolving in lockstep with Azure, not lagging it feature-wise for years at a time"
The topic at hand during these quotes was "Blue"; the assumed new setup where Microsoft pushes out new software more or less continuously (and on a subscription basis, I might add) instead of releasing major versions.
It would appear that both guests had no clue whatsoever regarding the current dire state of Azure, something which IMO speaks for itself when reading the quotes. I shudder at the thought of a Windows Server being developed and released at as quick a pace as Azure, not merely because of the state Azure is currently in, but because the idea in general scares me, considering how Microsoft still firmly holds the reputation that a v1.0 release is usually filled with bugs and other nasty stuff.
Yet these two seemed to think that it would be the most ideal situation for customers to be in. Are you kidding me?
Now, this may be a little bit of a cheap shot on my part, sure, but taking the current state of Azure into account, combined with the fact that they haven't been able to fix this within a whole week, I'd say it's safe to conclude that hasty releases at Enterprise level aren't the brightest of ideas.
Of course I'm no expert :-)
The story of my life with Microsoft.
"No data was harmed in the incident, but getting it back is taking a long time."
Well actually the data was totally fucked up, and getting it back is taking even longer than a long time... but yes, the statement is mostly correct.
Linux = Reliability.
Linux does not = Reliability any more than Windows = Reliability. Any sys admin worth their salt will tell you that things are a lot more complex than that. The issue is that they can't recover quickly from an outage, and that usually has more to do with their operational procedures and safeguards than with the particular OS. However, it's hurting their customers and that's bad for business.
No system is perfect. There is this idea that the cloud is magic and infallible. It's still hardware and software and many meshed systems; it's just sitting in someone else's facility. There are very few cloud providers that haven't had some issue that impacted customers. This happens in the enterprise too, and unless it impacts external customers you don't get to hear about it. Enterprises don't air their dirty linen in public unless they have to.
There is a cultural thing where the new cloud web 2.0 kids seem to think that just because you stamp cloud on a service you can forget all of the lessons learned by operational teams over the last 30 years. The basic tenets of keeping things running remain the same. As much fun as it is to poke fun at MS, Amazon, Rackspace, IBM or HP, it underlines the point that no large environment is free from outages.
The best operational teams understand this and have good plans in place for recovery. They assume something will break and have factored this into their business and operational processes, so when something does go wrong they can recover quickly. The issue here is not that something went bang, it's that it's taken them over a week to fix it.
And they're still not done fixing it.
Now you can generalize all you want, saying that cloud structures are all at risk etc. etc. You're not inherently wrong, you're just forgetting that Amazon and Google are managing much larger volumes of data and are doing so much more efficiently than Azure is apparently capable of.
Microsoft is building itself a history of failing major products in an embarrassingly public way. Some failures can be explained by market conditions, but this Azure failure is a technical one, and that is a stain that will simply not go away.
We all know how this is going to end. The service will be restored, Microsoft will triumphantly tout the excellence of its platform that lost no data, and the weeks it took to get to that point will be smothered under a pile of pillows. For Microsoft, this will be a success story.
For everyone else, this will be the baseline for Azure reliability: when it fails (and it will), it takes X weeks to get back online. As said in previous comments, Azure is already at a reliability rating of one 9, and that is a failure in any administrator's book.
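For anyone who wants to check the "one 9" figure, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that a single one-week outage is the only downtime in a 365-day year; the function names are my own and not from any Microsoft tooling.

```python
import math


def availability(downtime_days: float, period_days: float = 365.0) -> float:
    """Fraction of the period during which the service was up."""
    if not 0 <= downtime_days <= period_days:
        raise ValueError("downtime must lie between 0 and the period length")
    return 1.0 - downtime_days / period_days


def nines(downtime_days: float, period_days: float = 365.0) -> int:
    """Count the leading nines of the availability (90% -> 1, 99% -> 2, ...)."""
    fraction_down = downtime_days / period_days
    if not 0 < fraction_down <= 1:
        raise ValueError("need a non-zero downtime no longer than the period")
    # -log10 of the downtime fraction, rounded down, gives the number of nines.
    return max(0, math.floor(-math.log10(fraction_down)))


if __name__ == "__main__":
    outage_days = 7  # assumption: one week of downtime in the year
    print(f"availability: {availability(outage_days):.2%}")
    print(f"nines: {nines(outage_days)}")
```

Under that assumption the availability comes out at roughly 98%, which is indeed a single 9. Getting to two 9s would mean keeping total downtime under about 3.65 days a year.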
I'm not defending MS; if it was my business being affected I'd be pretty pissed. I don't think even the MS PR department could spin this one into a success story. My point was that the recovery time is unacceptable and that adequate recovery procedures weren't in place, or it would be sorted by now. MS have again shot themselves in the foot.
If they do then Apple will take their place. People like commercial software of a good quality.
Free software of hugely unpredictable and varying quality isn't what the masses want. People buy things these days; the days of building your own stuff are long gone.
People generally don't want free software any more than people want to knit their own jumper or make their own clothes.
The reason Azure is flaky is rushed implementation, backup process failures and a general slapdash approach to cloud by Microsoft!
Build your Linux and Windows on VMware Cloud Director, accompanied by robust DR concepts and a more professional attitude to delivering Cloud properly, and the situation will be much more robust if any of the components fail.
The whole Azure, Hyper-V, Windows Server 2012, SCCM stack is a rushed, cobbled-together ecosystem that is going to burn all the MS shop "follow like sheep" fanbois! This will not be the last time customers suffer long downtimes in all types of (Azure, Hybrid, Public, Private) MS cloud mash-ups!
Pay the money and you get reliability. It's a no-brainer.
Hyper-V, Windows Server and SCCM are all very well proven and at least a match for the best of the competition. And don't forget SCOM, SCSM, Orchestration Manager, App Manager, etc...
Azure is relatively recent and is still introducing new features, but has a better reliability record than some of its largest rivals.