4 posts • joined 26 Dec 2009
Mega-Cloud Vendors Don't Get It
The pattern of outages for the mega-cloud vendors (Amazon, Google, Microsoft, etc.) is disturbing. First, there are the outages themselves. The June thunderstorm outage is an excellent case in point. No thunderstorm should take down a hardened data center. Why didn't they switch to diesel 30-45 minutes before the storms were obviously going to hit? A planned switch to diesel is far safer and more reliable than an automated emergency switch. If the planned switch fails, they have 30-45 minutes to take steps to get those diesels up and running or, at least, minimize the impact.
Second, a week on, they still have not published a cause for last week's outage. I find that particularly curious. There is a definite pattern of poor communication from the vendors during and after outages. In the June incident, I noticed hours between updates and skimping on details. Enterprise-class customers will not suffer this treatment for long.
Third, in many of these outages, the size of their environment seems to exacerbate the issues, causing such things as replication storms that back up and cause even more problems. They claim to have availability zones, although some of these outages have affected several zones at once. Maybe they should further subdivide their environment and do a better job of isolating the pieces. I believe they also share management functions across their environment. Maybe they should develop management zones aligned with their ideally smaller availability zones. At least offer this as an option, even if they charge a little more.
To me, this all points out the immaturity of the current cloud landscape and will drive enterprises more towards internal private clouds for the near- and mid-term until the vendors can get their acts together. It might also drive enterprises to the clouds being offered by the traditional outsourcing vendors who understand the enterprise market better.
Time for the cloud industry to do some soul-searching.
IT Financial Management
I consult with enterprise-class clients. My general conclusion is that most shops do NOT do a very good job of financial management. I personally believe this is due to technical people not liking financial matters and financial people not understanding technology.
When I perform an IT infrastructure financial assessment, I am often creating something from a bunch of disparate, out-of-date, disjointed sources, for which I always have to make a bunch of assumptions that place the accuracy of my conclusions at risk. I can't even count how many asset registers have large financial records just labeled "IT Equipment" or "Computers", or PCs being depreciated over 5 years. The good news is that I have been running IT infrastructures for 20+ years and have created my own financial management capabilities enough times that the degree of risk is mitigated. I'm getting pretty good at matching such records to installed dates on hardware inventories. Enough blatant self-promotion.
So, I read this article with interest. The concept is phenomenal. I want to see a demo RIGHT NOW!!! Of course, it is not even in BETA yet, so I will have to wait. When I do finally get to see it, I do NOT expect it to be perfect. I just hope that it can begin to help IT shops automate giving IT (and business) management a view towards the true costs of IT services.
BEST OF LUCK. Keep us posted.
The QUEEN goes down more than a mainframe....
Concurrent MicroCode Upgrades
On the zSeries, 98% of upgrades are concurrent. IBM can NEVER dictate when a client can do their upgrades. In the several large environments I have run, I would never do an upgrade (concurrent or not) during normal hours. In addition, most such upgrades need to be checked for compatibility with other upgrades AND, more importantly, software levels (very time-consuming in most mainframe environments with hundreds of third-party software products).
There MAY be extenuating circumstances here, but it definitely sounds just like the article says: HP canned the EDS staff that did the planning and change control for such changes, so it didn't get done, and KBOOM. There is one question I haven't seen asked: if the upgrade had been done, would the outage have been prevented?