Dragons and butterflies: The chaos of other people's clouds

Cloud computing was meant to solve the reliability problem, but in practice, it still has a long way to go. Is that an endemic problem with the complexity of cloud computing, or a problem with the way people use it? Cloud infrastructures are meant to be resilient, because they tend to use lots of cheap servers and scale out. …

  1. 2460 Something

    I don't understand why people are that surprised though. Everything still runs on hardware, just not stuff that you own, and you now have no ability to get anything working in the case it gets borked. You just have to wait (a lot of time with very little information as to what caused it to break and when it will be fixed).

    Yes, there are benefits to using somebody else's kit, but you need to understand the pitfalls of this approach as well. If your company is happy to sign on the dotted line fully understanding this, then go for it. I suspect that most have been lulled into a false sense of security by salesmen though, didn't listen to the in-house IT (well, why did we pay a consultant all that money if not to ignore everyone else) and spent a load of money on 'the cloud' as that's what everyone else is doing!

    So now we are in a situation where you have ditched all your local hardware and are entirely reliant on 'cloud' offerings. But that doesn't remove the need to have redundancy in your service providers as well. Netflix are a great example of planning and doing it right. But most companies don't do it to this level and so when it goes wrong it goes horribly wrong. There must be a point though when the cost model swings back the other way, even more for those who have been burnt already, and companies start setting up their own hardware estates again.

  2. Mage Silver badge
    Devil

    Elephants ...

    1) Increasing software mono-culture. A patch on ANYTHING, even an AV update, might bork several different providers completely.

    2) The local connection to the cloud.

    What if we go cashless, all plastic cards and/or phones, and banks, retailers' back offices and payment providers all outsource?

    There would be food riots in maybe 24 hrs (c.f. Hurricane Katrina).

    Most people will die.

    What if infrastructure providers outsource to cloud? If the billing system is inaccessible, you'll not be able to use your mobile even if the Mobile Network is operational.

    Electricity, gas, pumping water and sewerage.

    This Cloud business has no silver lining.

    It's only safe as a backup to your own systems, for anything you can live without for a week, or for collaboration on documents. If your business depends on it, some day you'll lose everything.

  3. Ken Moorhouse Silver badge

    This is one side of the coin

    This is the conventional view, which assumes that every part of the system is committed to "doing the best" for the resilience and well-being of the overall system. If that requires cutting resources out of the equation, then so be it.

    The other side of the coin however, is that there are people out there who will be trying to bring down the system. They will be looking at injecting stimuli into the system intended to cause an imbalance, bringing the whole system down.

    DoS is a brute-force means to do that, but here we are talking about an extra level of sophistication: tweaking things in a low-volume way that amplifies to make it look as if a failure is imminent, with the system recognising this as a cascade situation and taking what it thinks is a necessary action.

    A good analogy for this behaviour is systems used in the Financial Markets - I believe there have been instances where people have gamed the stock market to cause artificial price crashes. The counter to that is to bring random elements into the mix, so that there is no hard trigger-point for defensive action to be taken. I feel that there is no iron-clad way to cater for all eventualities.

    1. allthecoolshortnamesweretaken

      Re: This is one side of the coin

      Good point(s). I'd like to add one: in addition to the people out there who want to bring the system down, there will be all sorts of additional problems once all that IoT crap scales up. It's not just that your backup to the cloud will have to share bandwidth with the toaster oven in the cafeteria. At least some of the IoT devices will have their connectivity software and protocols implemented so poorly that they will bork up things big time.

      1. Destroy All Monsters Silver badge
        Terminator

        Re: This is one side of the coin

        There will have to be active countermeasures against rogue IoT devices. Dropping whole sections off the Net for open-ended intervals would just be the first step. One would have to dispatch legions of synsects to rip the little devils out of the walls and crevices into which they have been installed. It may come to the point that one would have to cauterize whole areas with neutron bombs, as these devices may well harvest energy and maybe replica-building materials from the environment and be straightforward pests controlled by Artificial Reflexigence - and damn the conapt-dwellers in the affected areas.

        Better brew some strong coffee, this future is going places.

  4. John Stoffel

    It's hard to make the case to management to spend the moolah...

    This is all obvious stuff, and while I love to point out Netflix and its Chaos Army of testing tools, it's just not valid for where I work. One, management doesn't want to spend money on anything that might (not will, might) save their bacon. Unless they've had their feet in the fire, they won't spend it.

    Two, for large Oracle installs, unless you get RAC or some other way to horizontally scale your ERP system, it's just a nightmare to make work. The vendors don't support it out of the box, or if they do, it's so much $$$ that management just says no. Until the day something craps out and they can't book orders, then the spigots open for a small interval.

    Three, getting management permission to *test* your DR, backups, etc. is all a fairy tale. They either want it tested without any impact to the business, or the test is so laughably basic and error-prone that it's not even worth it.

    Raise your hands if management considers the development copy of production to be their DR copy? I see a forest cropping up. And of course the problem is that things *usually* work just fine, and generally we keep things going. Even when we know we're on the knife edge of falling off the cliff.

    It's not an easy problem at all. And it costs money. Lots of it usually. And sometimes it's not worth it. But it usually is....

    1. Mage Silver badge

      Re: Netflix

      Unlike Mobile, payments at supermarkets etc, no-one dies if Netflix goes down. Unfortunately the way "bean counters" do stuff and how things work, Netflix is vulnerable to losing customers to boxed discs, Pay TV, YouTube etc, or people making their own amusements. Customers of banks, payment providers etc can't change so easily, thus bean counters don't care to spend extra to avoid a day or two of downtime. They don't appreciate what will happen if no-one can make phone calls and no-one can pay for food for two days ...

      1. Mephistro
        Facepalm

        Re: Netflix (@ Mage)

        Both this article and your comment fit nicely with the plans of several govts. to make plastic money compulsory. It proves that bean counters and politicians either are retarded or just don't give a shit.

        I've a bad feeling about this. It's like watching a train wreck in slo-mo!

        1. Anonymous Coward

          Re: Netflix (@ Mage)

          That's because removing cash is a prerequisite to "real" negative interest rates that suck money out of people's savings, pensions etc as part of the current progressive march back to feudalism - those in charge reckon they'll need as much control over the populace as they can get at that point.

          ( that cutting interest rates from around 5% to less than 1% has produced no lasting benefit to national economies is taken to mean that they need to be cut much further )

        2. Mage Silver badge

          Re: Netflix (@ Mage)

          It's very depressing.

          That's before anyone mentioned any motive of plastic vs negative interest. I think the motivation, though, is partly that all that cash (notes and coins) in our pockets is "doing nothing", but if it's in the system instead, then it can be lent out.

          1. Mephistro
            Unhappy

            Re: Netflix (@ Mage)

            That's probably the motivation. I wonder, what kind of inflation rates would this cause?

            We commoners like to own things? That's about to end!

          2. allthecoolshortnamesweretaken

            Re: Netflix (@ Mage)

            Plus "it takes money to make money" has a very real meaning when we're talking printing bills and minting coins and distributing them and so on.

        3. Destroy All Monsters Silver badge

          Re: Netflix (@ Mage)

          "I've a bad feeling about this. It's like watching a train wreck in slo-mo!"

          After the catastrophe of WWI, we let socialistic basket cases and totalitarian bureaucracies spring up all over Europe.

          They are now mushrooming again, armed with computers.

          This will be the end.

  5. Pascal Monett Silver badge

    "built in redundancy at the application layer because that's best practice"

    No, Netflix created their software and built in redundancy because upper management had made the decision to put the money in it, probably because they decided, after analysis, that it would cost them more in the long run if they had a service that fell over every other week.

    Best practice is hardly the reason for doing things when the accountants hold the purse strings.

  6. allthecoolshortnamesweretaken

    Oh, BTW: I like the term 'cloutage'.

    Might as well coin one now, as we will be needing one anyway. Quite often.

  7. Doctor Syntax Silver badge

    "Oh, BTW: I like the term 'cloutage'."

    For some reason I read that & think "clownage".

  8. Anonymous Coward

    Cloutage

    Already seen used before as a measure of how much clout or impact something or someone has!

    Clotage may be better - cloud driven by/adopted by unthinking Clots (in too many cases) who then have to live with unthought-of consequences when it all goes TITSUP!

  9. allthecoolshortnamesweretaken

    If you don't want to be at home for Mr. Cock Up you'll need a cunning plan.

    1. Anonymous Coward

      Just butter up and enter the house rearwards.
