Whoops, my cloud's just gone titsup. Now what?

“We apologise for the disruption. We have identified the cause and are working to restore the service as quickly as possible.” Attempting to log onto your cloud service and being faced with a message like that is guaranteed to strike fear into the heart of anybody that has trusted all or just part of their company's CRM, email …

Thumb Up

Not all bad news then

"The outage even prevented the interactive edition of the Daily Mail - Daily Mail Plus- from appearing."

You make it sound like a bad thing.

22
3
Bronze badge

Re: Not all bad news then

Every cloud has a silver lining.

8
1
Silver badge
Coat

Re: Not all bad news then

Shouldn't that be "every cloud outage has a silver lining"?

3
1

Now a year is not exactly 365 days, but if it were then that would be 525600 minutes. At four nines that allows for an outage of 5256 minutes or 87.6 hours. SLAs calculated on an annual basis are worthless. The same service level would allow for an outage of 7.44 hours before being triggered if worked on a monthly basis, which is more reasonable.

All of the above is of course meaningless if there's no (or trivial) compensation in the event that the service level is breached, which is the case with most SaaS offerings.

One must not, however, confuse SaaS with cloud. It's quite possible to get a robust infrastructure in the cloud by using two or more infrastructure providers and installing your own business software. That's why SugarCRM is infinitely preferable to SalesForce. You are in control, be it in the cloud or on your own infrastructure.

0
1
Silver badge

"then that would be 525600 minutes"

I think the author meant 99.99 percent, or 9,999 per 10,000. Agreed with the rest of your post, though.

I'd also like to call fellow commentards' attention to the fact that the cloud provider is not the only possible source of downtime for the customer. Screw-ups by the telcos will probably cause localized outages more often than the cloud provider does.

5
1
Silver badge

four nines that allows for an outage of 5256 minutes

No, 53 minutes, it's 99.99%. 5-nines is generally taken as no more than 5 minutes downtime per year, or more realistically 1 hour per 10 years, since few people install services for only a year.
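For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes a 365-day year and a 31-day month, as in the figures quoted in this thread; the function name is made up for illustration.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Minutes of total outage permitted in the period before the level is breached."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - availability_percent / 100)


# Compare the service levels discussed in this thread, per year and per 31-day month.
for label, pct in [("two nines", 99.0), ("four nines", 99.99), ("five nines", 99.999)]:
    yearly = allowed_downtime_minutes(pct, 365)
    monthly = allowed_downtime_minutes(pct, 31)
    print(f"{label} ({pct}%): {yearly:.2f} min/year, {monthly:.2f} min per 31-day month")
```

On these assumptions 99% works out to 5256 minutes a year (or 7.44 hours in a 31-day month), while 99.99% is about 53 minutes a year, so the 5256 figure belongs to two nines, not four.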

A bigger problem is that such a simple calculation only works for a total outage. What if your network is struggling due to, say, a DDoS on the cloud provider, but some traffic is getting through? Or some of your apps are running but some aren't? What number of nines does that give you, and how do you write an SLA for it?

5
1
Silver badge

Cloud

It's all in the name.

3
0
Silver badge
Coat

Re: Cloud

It rained last night. I guess that was the Daily Flail storage washing away then..

Mine's the one with a copy of the 'I' in the pocket.

0
3
Bronze badge
Thumb Down

you know you've hit the big time when

like sharks circling the scent of blood in the water, lawyers start circling a possible source of lucre.

0
0
Silver badge

Re: you know you've hit the big time when

Logical fallacy.

If lawyers start circling, you *may* have hit it big. It may also be that your destiny was to serve as a warning to others.

0
0
Unhappy

SaaS

A bit like Microsoft Azure crashing out this morning in Northern Europe then!

I just gave up in the end, I think it's back now though!

5
0
Anonymous Coward

Well if it wasn't for these scrounging immigrants and gays adopting children it would never have gone down in the first place and wouldn't be causing people to spontaneously get cancer.

For the daily mail reader this - this comment is meant to be humours and is by no means factual

5
2
Facepalm

"is meant to be humours"

or even meant to be humorous?

Guardian reader by any chance?

6
0
Anonymous Coward

can you even still buy the Guardian?

1
2
Anonymous Coward

Damn can't spell. Thanks.

And I am more of a Telegraph man, ta. The Mail gives people who sit at the right of centre a bad name, however.

0
0
Silver badge

Putin is getting it in both. American-style bi-partisanship?

Then Cameroon comes out like an idiot from hell and is telling tall tales about WWI being about "The Freedoms" but I digress.

0
1
Bronze badge

If it wasn't for...

You forgot to mention single mothers. And any story probably needs a Princess Di (or Kate) angle.

I once bought the Mail on Sunday for a free CD. I felt dirty afterwards though.

1
1
Silver badge

Comparing a cloud outage to a power cut ? Really ?

That's like comparing theft to copyright infringement.

If your problem is power, then what you need is a backup diesel generator (or however many are required to cover your needs). Insert it into the grid, fill it up, put it on standby and you're done, apart from the regular maintenance and trial runs. Frankly, apart from the cost, this is a no-brainer operation (and yet, some still manage to fudge it up anyway).

That is peanuts in price and hassle compared to a cloud outage. Even if you do go for a backup cloud operator (and we're talking big budget operations right there), there will be a boatload of problems to deal with on the spot when (not if) it happens.

There are internal procedures to devise, which will need to be amended after the first live-fire event (because there's always some difficulty that was not taken into account).

There is (company) user training, because said procedures need to be understood and implemented in an urgent situation. There is proper warning and communications, because the switch cannot be made before it can be, and (company) users switching manually on their own willy-nilly is going to create its own special brand of havoc.

There is monitoring that the switch has taken place and that operations are once again in a working state. What are the metrics ? How to measure them in a time of crisis ? How to ensure that all required functions have been taken into account ?

Finally, there is recovering from the outage, and the decisions that need to be taken - mainly do we switch back again, or do we only switch when this cloud fails ? After the first live-fire event, maybe previous policy decisions will be reviewed in light of performance before and after the switch.

Then there will be the accounting fallout, because all of this hoopla will be quantified and cost-assigned, and the next board meeting will be a live-fire event of its own.

No, comparing with a power cut doesn't even begin to do this kind of thing justice. It is a very poor comparison.

6
1

Re: Comparing a cloud outage to a power cut ? Really ?

And unlike a cloud outage, the power will eventually be restored. In the event of the power not being restored, the lack of any computer service will be the least of our worries.

0
0

Count on it . . .

Despite service providers pushing the reliability of their services, outages are a very likely reality for those using cloud services.

First, there is something called the law of large numbers. Massively parallel systems at state of the art computing centres run to hundreds of thousands to millions of microprocessor cores. Even more astronomical numbers are being discussed for data centers where the goal is capacity to do lots of jobs as opposed to raw throughput.
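The scaling argument can be made concrete with a rough sketch. It assumes every component fails independently with the same probability over some period; both the independence and the per-component probability used below are hypothetical, chosen only to show the shape of the curve.

```python
import math


def prob_any_failure(n_components, p_fail_each):
    """Probability that at least one of n independent components fails.

    Computes 1 - (1 - p)^n via log1p/expm1 so that tiny p and huge n
    do not lose precision.
    """
    return -math.expm1(n_components * math.log1p(-p_fail_each))


# Hypothetical figure: each core has a 1-in-100,000 chance of failing in the period.
p = 1e-5
for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9} cores: P(at least one failure) = {prob_any_failure(n, p):.4f}")
```

However small the individual failure probability, the chance that something in the building has failed climbs towards certainty as the core count grows.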

The presumption of solid state reliability can be seriously questioned.

The state of the art has changed dramatically since the term “solid state reliability” became common. Transistor feature sizes and component densities have all changed radically. New materials have introduced new failure mechanisms. These have been well-understood for years:

ITRS http://www.itrs.net/Links/2005itrs/Linked%20Files/2005Files/PIDS/4377atr.pdf

Critical Reliability Challenges for The International Technology Roadmap for Semiconductors (ITRS)

Since then, restrictions on hazardous substances have added a new failure mechanism. Among the unintended consequences of this initiative is the spontaneous formation of tin “whiskers”, crystals that eventually short to some other part of the circuit, causing failures.

Bottom line: state-of-the-art microprocessors run 24 x 7 are going to have a limited life. Credible speculation is that this could be as short as a few years. And nobody appears to be seriously thinking about the cost of end-of-life replacement.

The issue is not the probability that there will be a catastrophic meltdown of data centers. The problem is manageable with existing technology if cost to the customer is no object.

The critical issue is that a small handful of large companies are effectively moving to limit the average customer's options to reliance on large IT services companies for all their information management needs.

And then, there's bandwidth . . . a subject for another post.

1
0

Money is an object

Large data centers cost hundreds of millions to billions to construct. At the moment the Cloud has to compete with local alternatives . . . which include my ability to buy, for a few hundred dollars, a hard drive holding more terabytes of data than I can envision using.

This is going to make redundancy as a solution to reliability issues a tough challenge. I'm not at all sanguine that, at half a billion a pop, industry is going to build excess unused capacity.

Unless, of course, they can contrive to create a virtual monopoly and dependence where they can demand what the traffic will bear.

And then, there's the bandwidth . . .

2
0

About that bandwidth

There is no such thing as a free lunch. The notion of achieving reliability in a flexible cloud is all well and good. There are two problems . . . first, the use of a flexible cloud presumes the existence of redundant unused capacity. Second, it presumes the ability to transfer petabytes of data.

As a fellow commentard wisely noted, the telecom companies have a dog in this fight. Like the data centers, they are in business to make money. They cannot be expected to build large amounts of excess capacity. Unless, of course, they can charge for it.

Bottom Line: There are four major stakeholders in this issue: the folks building the large data centers; the telecom companies; the government (for whom the infrastructure is strategically vital); and the customer. All but the customer have a strong vested interest in forcing the customer to use and pay for Cloud Computing services.

Finally, about that bandwidth: Shannon's "law" is still alive and well. Many of us have had the experience of getting on the Wi-Fi connection at a hotel, only to watch the number of bars shrink as more guests arrive and log on until eventually, only the guests closest to the Wi-Fi transmitter have the signal-to-noise ratio to get any quality of service.

Now imagine that on a global scale.
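For reference, the Shannon-Hartley result puts the ceiling on a noisy channel at C = B log2(1 + S/N). The sketch below just evaluates that formula; the 20 MHz channel width and the signal-to-noise values are illustrative assumptions rather than measurements, and it ignores contention between users, which also hurts on a shared hotel network.

```python
import math


def db_to_linear(db):
    """Convert a ratio expressed in decibels to a plain linear ratio."""
    return 10 ** (db / 10)


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Upper bound on error-free bit rate for a channel: B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


# Assumed 20 MHz channel; watch the ceiling fall as the signal-to-noise ratio drops.
bandwidth = 20e6
for snr_db in (30, 20, 10, 0):
    capacity = shannon_capacity_bps(bandwidth, db_to_linear(snr_db))
    print(f"SNR {snr_db:>2} dB: at most {capacity / 1e6:.1f} Mbit/s")
```

No amount of clever coding gets past that ceiling; the only ways up are more bandwidth or a better signal-to-noise ratio, and both cost money.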

1
0
Bronze badge

wow. so you need DR. thanks for that.

i'll let the 1960s know you have caught up with their ideas.

0
1
Silver badge

No need to be sarky. :)

Some people think all the tech in the cloud is redundant, therefore you don't need a DR site.

They don't always know that it's rather like using RAID5 instead of a backup.

The problem is the cloud doesn't scale cheaply. When you push the limits of tech, things get expensive. When you add a third party, things get expensive. When you need serious uptime, things get expensive. When you put all your eggs in one basket, outages become expensive.

A third party has no interest in the value of your application uptime. Therefore, the (cost of) tech used is only really going to be vaguely appropriate.

0
0

Compensation...

Compensation is wonderful. Not only does it do exactly nothing to get your users working again and off your back, but also it means that your vendor is concentrating on minimising the payments rather than getting you operational again ASAP.

0
0
Bronze badge
Happy

That's not a Cloud it's just Smoke.

So if a Cloud provider goes down their systems were really just smoke and mirrors. You just got smoked. The mirrors are for the fake extra storage/business.

Put two mirrors face to face see space expand!!!

0
0

unrealistic expectations for tech reliability from commercial media firms

Many people around the world assume that large, recognizable corporations like Adobe will surely employ the best cyber security and reliability technologies available. This is certainly not so: Adobe has no history or experience whatsoever in Internet networking, computer security, high availability or reliability, and therefore probably gives less priority to such matters, which is then automatically reflected in whatever reliability and security solutions it engages.

Don't forget, Adobe is a retail graphics technology firm, nothing more, irrespective of their wealth. Examine their tens of dozens of Adobe Flash fixes just in the past two to three years.

0
0

Head in the clouds....

Anyone who puts critical data/performance needs into someone else's basket will get what they deserve eventually - nothing!

Bean-counters' latest version of Citrix et al.....

0
0

Cloudy Future? I don't think so, at least not in the next 2 years...

The only people predicting a rapid take-up of the Cloud over the next 2 years are the vendors who want to give you the impression that Cloud is taking over the world - the truth is quite the opposite. The only winners with, for instance, Microsoft's Cloud products remain the vendor and the partners / resellers receiving greater incentives. Most businesses (small, medium and large) face increased costs with Cloud over the contractual period (compared to perpetual volume licensing), and I would be amazed if Microsoft achieved even 20% Cloud revenue by 2016 given how slow Enterprise customers have been in taking up Office 365 and Azure to date. And you cannot blame businesses for being sceptical - outages and increased costs are just a couple of issues to grapple with - would you want the NSA spying on your company's data?

1
0

So I need an on-premise standby to cater for when the cloud, which saves me from an on-premise data centre, fails?

0
0
Bronze badge

Cloudy places have their uses.

"Cloud" services are good when they're things you don't need live 100% of the time. Like overnight backups. As long as it works 99% of the time, it's not a big deal.

But for anything you need instant access to at random times, the cloud is not it. Think about how many things can break between your keyboard and the cloud provider's hard disks. Add in the number of people who can cock up a config or damage equipment between you and the cloud provider, and the whole deal looks really stupid.
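To put rough numbers on that chain, here is a small sketch. It assumes the links fail independently and that you need every one of them up at once, in which case the availabilities simply multiply; the list of links and their figures are hypothetical.

```python
from math import prod


def end_to_end_availability(component_availabilities):
    """Availability of a chain in which every link must be up at the same time.

    Assumes the links fail independently, so the individual
    availabilities are multiplied together.
    """
    return prod(component_availabilities)


# Hypothetical path from keyboard to the provider's disks, each link at 99.9%.
chain = {
    "office LAN": 0.999,
    "ISP last mile": 0.999,
    "internet transit": 0.999,
    "provider front end": 0.999,
    "provider storage": 0.999,
}
total = end_to_end_availability(chain.values())
print(f"End-to-end availability: {total:.4%}")
print(f"Expected downtime: {(1 - total) * 365 * 24:.1f} hours/year")
```

Five links at three nines each already leave you noticeably short of three nines end to end, and that is before anyone cocks up a config.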

0
0
