AWS's S3 outage was so bad Amazon couldn't get into its own dashboard to warn the world

Tuesday's Amazon Web Services mega-outage knocked offline not only websites big and small, by yanking away their backend storage, but also knackered apps and Internet of Things gadgets relying on the technology. In fact, the five-hour breakdown was so bad, Amazon couldn't even update its own AWS status dashboard: its red …

  1. From the States
    FAIL

    Major irony alert

    "Ironically, outage monitoring sites DownDetector and isitdownrightnow.com were also offline, thanks to the issue."

    1. Anonymous Coward

      Re: Major irony alert

      They might need to change the domain name to "itisdownrightnow.com". ;) FYI, isup.me failed over to AWS UK infrastructure, so I used it to monitor the half-dozen other outage monitoring sites that relied on a single S3 bucket region. ;P

  2. Phil Kingston Silver badge

    Maybe I'm having a sense of humour failure, but those quoted Tweets can't be serious? Someone complaining they couldn't change their mouse sensitivity? Or turn off their oven?

    I fear for this generation.

    1. donjaxon
      Facepalm

      I'm also interested in the veracity of these claims. Razer's cloudy settings are an abomination. My familial Razer Deathadder mouse owner is out with his girlfriend, so I'm sitting here alone trying to disprove it.

    2. Mad Chaz

      The generation isn't as much the problem as the idiots who built those items to not work without internet connectivity.

      Yea, my ISP went tits up so I can't bake a cake. WTF?

    3. frank ly Silver badge

      @Phil Kingston

      My Wacom Intuos tablet had a driver update (on Windows) that totally changed the settings GUI and gave me a cloud based parameters storage and loading 'facility'. The idea is that I can more easily manage its settings and have them backed up in case of accidents and also, of course, easily migrate my tablet between different computers.

      I uninstalled the drivers and reinstalled from the CD then blocked Wacom software at the firewall.

      1. GerryMC

        Re: @Phil Kingston

        We used to have these things called ini files (or config files or whatever) that you could COPY from one machine to another. Later on, you could even put them onto cloudy storage if you needed them elsewhere and couldn't be arsed carrying a thumb drive.

      2. Sykowasp

        Re: @Phil Kingston

        Indeed this 'instead of' methodology that many IoT thingies have is a serious worry.

        The IoT aspects should be a layer on top of, not instead of, a working self-contained system.

        App settings - sure, back them up so you have them preserved or migratable to a new system. But don't have them as the sole storage for settings.
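
        One way to picture the "layer on top of, not instead of" idea is a settings store that always reads and writes a local file and treats the cloud copy as a best-effort backup. This is only a sketch: the class name, the file name and the cloud_upload callable are made up for illustration and do not describe any vendor's software.

```python
import json
import tempfile
from pathlib import Path


class LocalFirstSettings:
    """Settings live in a local file; the cloud copy is only a backup."""

    def __init__(self, path, cloud_upload=None):
        self.path = Path(path)
        # cloud_upload is a hypothetical callable(dict) that may raise OSError.
        self.cloud_upload = cloud_upload

    def load(self):
        # Reads never touch the network.
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {}

    def save(self, settings):
        # Write locally first so the device keeps working offline.
        self.path.write_text(json.dumps(settings))
        if self.cloud_upload is None:
            return False
        try:
            self.cloud_upload(settings)
            return True
        except OSError:
            # Backup failed; the local copy is still authoritative.
            return False


def broken_cloud(_settings):
    # Stand-in for a backend that is down (ConnectionError is an OSError).
    raise ConnectionError("backend unreachable")


store = LocalFirstSettings(Path(tempfile.mkdtemp()) / "mouse.json",
                           cloud_upload=broken_cloud)
backed_up = store.save({"dpi": 800})   # cloud backup fails...
restored = store.load()                # ...but the setting is still readable
```

        The save call reports whether the backup went through, so the application can nag about a missing backup without refusing to change the setting.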

    4. Rich 11 Silver badge

      AWS goes down. So does my TV remote, my light controller, even my front gate.

      Rightly or wrongly, all I can think when I see this is 'Serves you right.'

      1. DropBear Silver badge

        "Serves you right."

        I can't help but think you're being a bit harsh here - for instance, ever since the Dawn of Time, if you don't feel like using 156345346 different remotes the only universal ones that are sold anywhere are basically the Logitech Harmony series; and yes, they come with a cloud-only config tool, whether you like it or not. Yes, they do _work_ without being online*, but they cannot be reconfigured. Believe me, I would never willingly choose such a setup but what choice do I really have...?

        * I have no idea whether a similar issue is at hand in this particular case, or if this remote wasn't working at all...

        1. The Original Steve

          It's been a few years since I had / used my Logitech Harmony remote, but back then the tool to configure the remote was online, as you say. However, who is changing macros on their remote on a daily basis? Once it's set up you normally only need to change the config when adding a new device.

          Agree with the overall sentiment, but as the Harmony remotes rely on an enormous database of known devices it sort of makes sense for it to be online. (As the database gets updated daily)

    5. Doctor Syntax Silver badge

      " Someone complaining they couldn't change their mouse sensitivity?"

      From the Razer site:

      Razer Synapse is our unified configuration software that allows you to rebind controls or assign macros to any of your Razer peripherals and saves all your settings automatically to the cloud. No more tedious device configurations when you arrive at LAN parties or tourneys, as you can pull them from the cloud, and get owning right away.

    6. Wensleydale Cheese Silver badge

      All together now...

      Told You So!

  3. fidodogbreath Silver badge

    The last voice command ever

    "Alexa, turn off all the servers."

    1. gryphon

      Re: The last voice command ever

      Alexa had her own problems.

      Time, weather reports and radio were fine but couldn't actually play music.

      She'd read back to show she'd understood but then nothing.

      Silly Amazon.

      1. G0HJQ

        Re: The last voice command ever

        I had exactly the same, which seems odd if it's only one region in the US that failed. Do they not have local copies of Amazon Prime music in the UK?

        1. 2+2=5 Silver badge

          Re: The last voice command ever

          > Do they not have local copies of Amazon Prime music in the UK?

          Don't be silly. If they stored anything here they might have to pay tax. ;-)

      2. DropBear Silver badge
        Trollface

        Re: The last voice command ever

        "She'd read back to show she'd understood but then nothing."

        Awww, people today have no sense of humour. It should have started singing "Daisy, daisy..."

      3. Anonymous Coward

        Re: The last voice command ever

        Silly Amazon?

        No, Silly you.

        Why did you waste your money on it in the first place?

      4. Pliny the Whiner

        Re: The last voice command ever

        "If you want a thing done well, get a couple of old broads to do it."

        -- Bette Davis

        If neither of those broads is named Alexa. I just now asked my Echo if it was all right, and it responded, "Great! I'm ready to stroke your man-parts or whatever!" Um, okay.

        I think I speak for all of us when I say that my sympathies go to the poor soul who couldn't order a coffee. OH. MY. GOD. S/he's probably still shaken by the experience.

        There's a reason why this nonstop idiocy is called a "first-world problem." The reference isn't meant to be self-congratulatory or a compliment.

    2. Oh Homer
      Alien

      Big Bang 2.0

      Alexa, reboot the universe!

  4. Pascal Monett Silver badge
    Flame

    I love the future

    This is great : a cloud service falls down so hard it can't even notify customers it is down. And, way down the line, thermostats can no longer be changed, mouse settings are frozen and God knows what else.

    This is absolutely perfect and should happen a lot more often until people finally get fed up and demand things that work ALL THE DAMN TIME, like they used to before this happy-happy age of sharing everything with the NSA whether you want to or not.

    IoT ? Not while I still have a functioning brain, thank you very much. My light switch does not depend on the Internet and never will.

    1. sorry, what?

      Re: I love the future

      I am rather assuming that IoT is an abbreviation for "IdioT"... right?

      1. DropBear Silver badge
        Trollface

        Re: I love the future

        So, um, if I tell you there is a Python IoT home automation controller called "idiotic", will you be surprised...?

    2. 0765794e08
      Flame

      Re: I love the future

      “IoT ? Not while I still have a functioning brain, thank you very much”

      I agree with the sentiment 100%. However I fear IoT will be foisted onto the unsuspecting public, in various guises, whether they want it or not. ‘Smart’ meters being a prime example, which are presently being aggressively deployed by energy companies in the UK.

      1. Anonymous Coward

        Re: I love the future

        "... ‘Smart’ meters being a prime example, which are presently being aggressively deployed by energy companies in the UK."

        Not aggressively enough !!!

        I have been with 2 Electricity/Gas Suppliers that have 'shouted' from the rooftops their wonderful 'Smart' meter functionality ..... only to be told that they could not fit a 'Smart' meter due to the existence of Solar Panels (new install covered by FITS) or not available in your area !!!

        How can you design a Smart Meter that is unable to cope with self-generation of power when they are everywhere in the UK and not exactly leading edge Technology.

        The 'Not available in your area' was only discovered AFTER I had changed Supplier !!! :(

        [I was assured that I could get a Smart Meter when I queried about it and mentioned the Solar Panels !!!]

    3. fidodogbreath Silver badge

      Re: I love the future

      people finally get fed up and demand things that work ALL THE DAMN TIME

      My retro-style analog light switches, coffee maker, thermostat, range, fridge, garage door opener, laundry equipment, etc. all functioned perfectly throughout the Great Outage. They also had the additional benefit of being half the price of their (dis)connected brethren.

      I do wish that the cat had gone offline for a while, though. That actually would have been kind of nice.

    4. AndyD 8-)₹

      Re: I love the future

      *My light switch/ [smart meter] does not depend on the Internet and never will.*

      ... but I anticipate that I will soon be penalised (and eventually prosecuted) by British Gas for that intransigence!

  5. Paul 87

    It's terrifying that we've gone from a network designed to survive nuclear attack without loss of communication, to a situation whereby a single company's IT failures affect tens of millions of people.

    Whilst you can argue that the majority of the disruption is, in the scheme of things, minor, the IoT is pointing towards a lot more serious issues further down the line. Imagine what could happen if say, self driving trucks relied on AWS for back end updates of road closures, and due to a crash, couldn't be notified of temporary road closures, nor updated to signal that they should park up.

    1. ecofeco Silver badge

      Damn tourists. :)

    2. careful_Eddy

      Bad, but not that bad

      Any critical service like that should be built with multi-region availability. AWS has 14 regions to choose from and easy DNS features for latency and healthcheck based routing.

      Don't get me wrong, this outage was annoying and Amazon's multiple AZs per region are meant to prevent an entire region falling over. For us it gummed up a bunch of batch jobs we had running and we lost time rejiggering them to not lose data. But our frontend is multi region, clients got directed to west coast and didn't miss a beat.
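
      A rough sketch of the health-check based routing idea, done client-side rather than in DNS: walk a preference list of regions and use the first one whose probe succeeds. The preference order, the status table and the is_healthy callable are assumptions for illustration; a real deployment would more likely lean on the provider's DNS health checks.

```python
def pick_region(regions, is_healthy):
    """Return the first region in preference order that passes its health check."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except OSError:
            # A probe that errors out counts as unhealthy.
            continue
    raise RuntimeError("no healthy region available")


# Hypothetical preference order and health state, for illustration only.
PREFERRED = ["us-east-1", "us-west-2", "eu-west-1"]
status = {"us-east-1": False, "us-west-2": True, "eu-west-1": True}

chosen = pick_region(PREFERRED, lambda region: status[region])
print(chosen)
```

      With the first region marked unhealthy, the selection falls through to the next entry in the list, which is the "clients got directed to west coast" behaviour in miniature.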

      1. John Smith 19 Gold badge
        Unhappy

        "Any critical service like that should be built with multi-region availability. "

        Should have. But let's take a peek inside a dev's mind after it happened. Something like this....

        "But, but the time to market was tight and the protocols were complex and AWS hardly ever fails and besides it was going to cost extra and my friend told me no one else does it."

        I think that just about sums up most of the people who did this.

        BTW in real engineering there is the idea of a Licensed engineer. If you design a building and it's built as you specify (IE all materials and procedures followed) and it falls down below design loads it is your fault.

        1. Anonymous Coward

          Re: "Any critical service like that should be built with multi-region availability. "

          I think that was more the dev's boss's mind rather than the dev's - most devs would not have the luxury of being consulted on things that potentially involve big cash (choice of cloud, backups, hybrid solutions, failsafes), they just get told it's this solution, work with it. In very few places do devs have a decent degree of input into solutions; mainly they are just treated as coders for hire with their thoughts / opinions ignored

          AC - obv!

        2. Doctor Syntax Silver badge

          Re: "Any critical service like that should be built with multi-region availability. "

          "BTW in real engineering there is the idea of a Licensed engineer."

          Who needs an engineer when you've got an MBA?

        3. Wensleydale Cheese Silver badge

          Re: "Any critical service like that should be built with multi-region availability. "

          "Should have. But let's take a peek inside a dev's mind after it happened. Something like this...."

          Another common failure with various applications is the assumption that if an internet connection is available at the beginning of a session, then it's there forever.

          Broadband is a lot more reliable than dial-up was, but things still go wrong at the client side, and as more and more work moves to mobile devices, this is a problem which won't go away any time soon.
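
          A minimal sketch of the opposite assumption, that the link can vanish mid-session: queue outgoing messages locally and retry them in order once sending works again. The OutboxClient name and the send callable are hypothetical; any transport that raises OSError when offline would fit.

```python
from collections import deque


class OutboxClient:
    """Queues outgoing messages and retries them when the link comes back."""

    def __init__(self, send):
        # send is a hypothetical callable(message) that raises OSError when offline.
        self.send = send
        self.pending = deque()

    def submit(self, message):
        # Always queue first, then try to deliver everything queued so far.
        self.pending.append(message)
        return self.flush()

    def flush(self):
        """Send queued messages in order; stop at the first failure."""
        sent = 0
        while self.pending:
            try:
                self.send(self.pending[0])
            except OSError:
                break  # still offline; keep the message for later
            self.pending.popleft()
            sent += 1
        return sent
```

          The application calls submit() as if the network were always there; flush() can be called again from a timer or when the client notices connectivity has returned.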

      2. Doctor Syntax Silver badge

        Re: Bad, but not that bad

        "Any critical service like that should be built with multi-region availability."

        So if you want your light switch to work it has to be built with multi-region availability. Or is this an install-time option?

    3. LDS Silver badge

      "network designed to survive nuclear attack"

      Exactly because it was designed as a distributed system with different paths, and not as a single monolithic architecture putting all the eggs in one basket.... but it was designed by scientists to address an issue, not by MBAs trying to understand how to reinstate big monopolies and extract as much money as possible from users...

      1. Solmyr ibn Wali Barad
        Mushroom

        Re: "network designed to survive nuclear attack"

        And now we've got networks where some faults have a tendency to go nuclear. How quaint.

        Granted, it is rather hard to account for a possibility of getting unwanted positive feedback somewhere in the system that'll lead to catastrophic overamplification. Especially if you can control only a small part of the system.

        But just for fun, I'm going to snap into the old git mode and blame it on whippersnappers having no experience with op-amps these days.

        1. sitta_europea

          Re: "network designed to survive nuclear attack"

          Whippersnappers? Op-amps?

          One of my devs once said to me "I'm going to use an Operational Transconductance Amplifier".

          I asked him if he meant a 'valve'.

        2. swschrad

          eh, feller, nukes kill op-amps. get off my lawn!

          while tubes, on the other hand, will just keep on working.

          1. Doctor Syntax Silver badge

            Re: eh, feller, nukes kill op-amps. get off my lawn!

            "while tubes, on the other hand, will just keep on working."

            Travelled in London much?

      2. /dev/null

        Re: "network designed to survive nuclear attack"

        You don't still believe that old myth, do you? ARPANET was designed to allow researchers working on ARPA-funded projects to use each other's computers remotely, back in the day when computers were literally few and far between. That is all.

        1. John Smith 19 Gold badge
          IT Angle

          "You don't still believe that old myth, do you? "

          yes and no.

          Part of the ARPANET brief was to design a network that had no single point of failure, hence no command centre. Its "use case" was to allow remote access by other institutions to various specialized machines (DEC 10s, the ILLIAC IV supercomputer of the time).

          The Bell System Electronic Switching System (ESS) had RAM and ROM elements which were designed to be rad hard as well.

    4. Muscleguy Silver badge

      The Kaikoura earthquake in New Zealand knocked out communications. Trucks were stranded between slips, a train was too. In those cases the human driver realised the problem and applied the brakes.

      Kaikoura is still cut off from the North by massive slips on the road/rail line. Initially various mapping and route finding apps were not updating to the alternative inland route bypassing Kaikoura so motorists and trucks were being directed down the blocked coastal road. The police had to permanently man a checkpoint and turn vehicles around with new instructions on the alternative route.

      So we already know the sort of problems an internet reliant automatic vehicle would face.

      Add in that significant parts of NZ have no cell phone coverage, too remote, mountainous, unpopulated to make it economic. Woman caver near Nelson recently fell and injured herself. No cell phone coverage made getting a rescue a problem. Emergency services have radios so once in place it worked, but we are so reliant on cellphones now.

      Isn’t the Met moving from radios to a cellphone based system? . . .

  6. DougS Silver badge

    Shows the folly of IoT

    Can't even TURN OFF your oven? Talk about shitty design! If basic functionality like that is dependent on an internet connection, what happens if the manufacturer goes out of business, or simply decides that it is tired of supporting 10 year old products and takes down the cloud site it relies upon?

    Too bad the general public that is suckered into buying this useless crap doesn't see news like this. I guess we need something like that to cause a fire that kills children to make the national news before it reaches the public consciousness and the deserved backlash comes against non-tech companies putting "internet" and "IoT" into their products for marketing reasons without any understanding of the consequences.

    1. DropBear Silver badge

      Re: Shows the folly of IoT

      Who knows, maybe the HTML5 UI on the oven's integrated touchscreen linked to a cloud-based JQuery for its "OnClick" action for the "off" button so when that didn't load there was nothing to execute* **...

      *Yeah, I know ancient fossils sometimes tell tall tales of ridiculous "clickable links" that once were purportedly integral parts of webpages and didn't need code to be executed on a click, but those are obviously just invented stories right up there with those hilarious "frames" that clearly never really existed...

      ** Okay, in all actuality this is probably a case of "I went out for some milk knowing I can turn off the oven with the cookies remotely from the supermarket and then it all just fahahahaileeeeed.... *sob* *sob*"

  7. Anonymous Coward

    Should be the mother of all wakeup calls but I doubt it

    "Nest warned customers that its internet-connected security cameras and smartphone apps were not functioning properly – as in, weren't recording video footage – as a result of the AWS blunder."

    ....Where is the ability to cache for x hours / days in an offline mode???

    "Other IoT devices were also impacted and caused some rather surreal scenarios for their owners. We're told that cloud-connected lightbulbs, thermostats, ovens, and similar gear, stopped working properly as their backends fell over."

    ....Oven burned house down cos cloud backend failed. Insurance will pay?
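
    As an illustration of the "cache for x hours / days" idea, here is a small time-bounded local buffer: clips are kept on the device for a retention window and handed over for upload once the backend is reachable again. The class name, the retention value and the string "clips" are assumptions for the sketch, not how any real camera works.

```python
from collections import deque


class OfflineClipBuffer:
    """Keeps recent clips locally so an upstream outage does not lose footage."""

    def __init__(self, max_age_s):
        self.max_age_s = max_age_s
        self.clips = deque()  # (timestamp, clip) pairs, oldest first

    def record(self, timestamp, clip):
        self.clips.append((timestamp, clip))
        self._prune(timestamp)

    def _prune(self, now):
        # Drop clips older than the retention window.
        while self.clips and now - self.clips[0][0] > self.max_age_s:
            self.clips.popleft()

    def drain(self):
        """Hand back everything buffered, e.g. to upload once the backend returns."""
        out = [clip for _, clip in self.clips]
        self.clips.clear()
        return out
```

    A five-hour outage would fit comfortably inside even a modest retention window, at the cost of some local storage.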

    1. andyheat

      Re: Should be the mother of all wakeup calls but I doubt it

      I have an internet-connected DVR with 5 cameras. I can view live or recorded footage on my smartphone wherever I am in the world.

      Guess what... it records to a local 1TB hard drive, and depends only on my broadband line. I can't believe there are devices out there that cease to function without an internet connection. Surely an ISP or local phone provider exchange would be more common than an entire AWS DC failing, and manufacturers would have realised the flaw in their design by now?

      1. John Smith 19 Gold badge
        Unhappy

        "Surely an ISP or local phone provider.. more common than an entire AWS DC failing, "

        Well that's the whole point.

        AWS DC failures are rare enough that this bunch of companies thought they did not need to code migration into their "cloud" software.

        Result. "Cloud" reverts to 1 site server farm.

        Server farm fails.

        System is borked.

    2. Anonymous Coward

      'internet-connected DVR with 5 cameras. it records to a local 1TB hard drive'

      ~ When a leading IoT supplier like Nest has zero fault tolerance, regulation is badly needed. But the US has opted out because no one else will follow, so they claim. But this is a disaster.

      ~ I doubt they'll even add heartbeat safety to ovens or similar appliances etc, in the event of overheating when remote smartphones lose connection etc.

      ~ The demise of tech journalism is lamentable. It takes security specialists on unknown blogs to research / reveal weaknesses. Meanwhile all mainstream journalists do is sing IoT's praises.
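
      The "heartbeat safety" mentioned above could look something like the watchdog below: the appliance runs the check itself and switches off when the remote controller stops checking in. The names, the 60-second timeout in the usage note and the switch_off callable are hypothetical; this is a sketch of the pattern, not a safety-certified design.

```python
import time


class HeartbeatWatchdog:
    """Switches an appliance off if no heartbeat arrives within the timeout."""

    def __init__(self, timeout_s, switch_off, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.switch_off = switch_off  # hypothetical callable that cuts the heating element
        self.clock = clock
        self.last_beat = clock()
        self.tripped = False

    def beat(self):
        # Called whenever the remote controller checks in.
        self.last_beat = self.clock()

    def poll(self):
        # Called periodically by the appliance's own local loop.
        if not self.tripped and self.clock() - self.last_beat > self.timeout_s:
            self.tripped = True
            self.switch_off()
        return self.tripped
```

      Something like HeartbeatWatchdog(60, cut_power) polled once a second would fail safe when the phone or the backend disappears, because the decision is made on the device rather than in the cloud.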

      1. DougS Silver badge

        Ideally you want both

        You want local storage so you aren't dependent on the internet (i.e. thieves cut the fiber to your building before breaking in) but also cloud storage so taking the DVR with them doesn't help.

        Of course, they could do both, but I'm probably assuming too much intelligence from the average thief thinking they might come up with doing even one of those things...

      2. fidodogbreath Silver badge

        Re: 'internet-connected DVR with 5 cameras. it records to a local 1TB hard drive'

        Meanwhile all mainstream journalists do is sing IoT's praises.

        Present company excepted, of course.

        1. Anonymous Coward

          'Present company excepted, of course.'

          Of course, the Reg ain't really MSM anyway (Too much of a rebel)! Yeah I'm referring to BBC / RTE in western Europe and just about any US source you care to mention etc...

  8. Anonymous South African Coward Silver badge

    The Cloud...

    ...somebody else's purdy compootah...

    1. yoganmahew

      Re: The Cloud...

      Say, do you think this outage was a computer error?

      1. Yet Another Anonymous coward Silver badge

        Re: The Cloud...

        Ob xkcd

        ps. is there any way to automatically format a link without having to remember html ?

        1. Doctor Syntax Silver badge

          Re: The Cloud...

          "ps. is there any way to automatically format a link without having to remember html ?"

          OTOH if the link is made explicit everyone can see where it goes before they click. I rather like that idea.

    2. Mage Silver badge
      Unhappy

      Re: The Cloud...

      Obviously people should occasionally disconnect their internet and check everything electrical.

  9. localzuk

    So what we now know is...

    That a lot of major sites don't have redundancy built into their design, able to handle a single zone going down.

    1. Anonymous Coward

      Re: So what we now know is...

      Shouldn't "the cloud" have transparent redundancy built-in? Why should the applications care? If an S3 region becomes unavailable, S3 itself should route applications to another region that hosts a redundant automatic copy....

      1. Loud Speaker

        Re: So what we now know is...

        Granny always told me not to put my eggs in an Amazon shopping basket! And if I want my data, I should back it up on tape*. She also taught me Fortran 2, Squoze and IBJOB.

        * Although probably not the same 7 track at 556bpi that she used.

        1. Anonymous Coward

          Re: So what we now know is...

          Yeah, but we can't ALL be related to Ada Lovelace (sadly).

          :)

          1. DropBear Silver badge
            Joke

            Re: So what we now know is...

            "Yeah, but we can't ALL be related to Ada Lovelace (sadly)."

            Oh, tosh. "Children of Ada" sounds like as fine a name as any for a brand new cult...

            1. Korev Silver badge
              Coat

              Re: So what we now know is...

              Or some dodgy prog rock band

      2. Doctor Syntax Silver badge

        Re: So what we now know is...

        "If an S3 region becomes unavailable, S3 itself should route applications to another region that hosts a redundant automatic copy"

        That would make things simple for users. Chris Mellor's article points out that S3 stands for Simple Storage Service. So now we know that's simple for Amazon, not for the user.

      3. Missing Semicolon Silver badge
        WTF?

        Re: So what we now know is...

        Yes, I suspect that many bosses, and even devs, thought that this kind of resilience was what AWS was for.

        1. John Smith 19 Gold badge
          Unhappy

          Yes, I suspect that many bosses, ..was what AWS was for.

          TBH I thought so too. That was (it seemed) the USP of a cloud system.

          But apparently not.

          So for anyone who's not coded those features into their software AWS is just a remote sited server farm which you don't own.

          Maybe other cloud providers are better at this than AWS.

          But does anyone know?

    2. DougS Silver badge

      Re: So what we now know is...

      My understanding is that Amazon's cloud is supposed to be redundant. If you hosted on another cloud for redundancy, you made everything more complicated and your odds of an outage probably went up due to that extra complication unless you really know what you're doing.

  10. Anonymous Coward

    The Cloud...

    Other people's computers you have no control over.

    IoT devices - needs the Cloud to work - see above for how much control you actually have.

  11. ColonelDare

    Am I Getting Old?

    [ I retired a few years ago now I just can't keep up...]

    Why do we need internet connected light bulbs??

    1. Phil O'Sophical Silver badge

      Re: Am I Getting Old?

      We don't. The question is why do people want Internet-connected lightbulbs.

      1. Sgt_Oddball Silver badge

        Re: Am I Getting Old?

        They don't but marketers saw something shiny and demanded it.

        Pretty sure there's whole years of dilbert comics dedicated to this very issue.

        1. VinceH Silver badge

          Re: Am I Getting Old?

          "They don't but marketers saw something shiny and demanded it."

          s/demanded it/convinced the gullible they needed it/

          And could see the potential for gain for them and the companies they work for, with all the lovely data they'd get, and more potential control over those customers.

          To slightly misquote Gary Numan's (Dark) lyrics:

          "I need hostility connectivity to lead the faithful and the blind."

      2. Craig 2

        Re: "The question is why do people want Internet-connected lightbulbs."

        The real question is how do people relate to light bulbs and do they want them nasally-fitted?

        1. Jonathan Knight

          Re: "The question is why do people want Internet-connected lightbulbs."

          What about this wheel thing?

          1. ricardian

            Re: "The question is why do people want Internet-connected lightbulbs."

            Wheel thing? Just a passing phase. Mark my words, it'll never catch on

            1. DropBear Silver badge
              Flame

              Re: "The question is why do people want Internet-connected lightbulbs."

              "Wheel thing"

              I suggest you keep it down and I'll pretend I never heard you. Have you forgotten already what happened when we tried writing the OHS framework for the Fire Thing?!?

      3. cork.dom@gmail.com

        Re: Am I Getting Old?

        A lot of disinformation here regarding Philips Hue bulbs.

        The Hue lightbulbs work perfectly well without any IT whatsoever. Guess how?? Yep - that trusty old light switch turns them on and off. Just like a real bulb. So i can turn off my phone, my router and my PC and i can still turn them on and off - using a light switch. They perform just like normal light bulbs.

        If i want to change them to a different colour, (or turn them on and off remotely), my router needs to be on, because the app to change the colour is on my phone and therefore my phone needs to be able to talk to the bulbs (they are not Bluetooth bulbs).

        Following so far? ;)

        If my router is connected to the internet, the bulbs can then download patches and interface with services such as IFTTT. But they certainly do NOT require internet connectivity for basic operation.

        1. ParaHandy

          Re: Am I Getting Old?

          So you've turned your lightbulb off remotely via the internet. The switch on the wall is on. Internet goes down. How do you turn your lightbulb back on? In this scenario is basic operation not compromised? Genuine question.

    2. Anonymous Coward

      Re: Am I Getting Old?

      >Why do we need internet connected light bulbs??

      Most commonly to be 'in when you're out' - but a side benefit for disabled users is a Smart Home setup is now measured in £100/1000's instead of £10,000's. Hasn't killed suppliers of the latter and the related gravy trainers who somehow still claw staggering amounts from the NHS, but has changed a lot of lives......and before you cite this as an example of Cloud's unsuitability - specialist AT is not only hugely expensive but incredibly unreliable, buggy as hell and implemented using decades old tech which wouldn't find an application elsewhere.

      1. Muscleguy Silver badge
        FAIL

        Re: Am I Getting Old?

        Your points are good ones, but there is a related issue. IoT services for the elderly and the disabled are useful and good. BUT the economics of these applications are not good. Takeup amongst the elderly and confusion over how to make it all work allied to aged cussedness mean the market amongst the elderly is likely not large even though the benefits are manifest.

        So, to make the economics of this work they have to sell them to fit, healthy, able Joe Public and for us the utility beyond ‘Ooh! Shiny!’ is simply not there.

        You read things like the guy who couldn’t get his kettle to boil. When after 8 hours it finally worked he had to eat his tea in the dark as his lights were downloading an update and were offline. His lights had power, the bulbs were not blown but they could not be turned on when needed. This is a health and safety issue and means the products are not fit for the likes of the elderly. Do you want your Granny to fall and break her hip because her lights won’t turn on because they are offline?

        1. ColonelDare

          Re: Am I Getting Old?

          > Muscleguy & AC above that...

          Good points, yes the elderly need support, including my 96 yo mother living alone until recently. Struggling with failing eyesight and memory loss she couldn't read the time on her very large analogue clock. Enter the Raspberry Pi, a large TV screen, a bit of Python/Pygame code (with hi contrast colours) and we solved the problem. :-)

          With touch sensitive table lights and socket mounted timers (etc) we weren't hankering for internet light bulbs - and certainly no AWS in the loop.

          - See; I still don't get it. ;-)

  12. msage

    Nest....

    Errr I have a question... Why is Nest sitting on AWS? Google not prepared to eat their own dog food? That I think was the most interesting part of the whole story!

    1. Anonymous Coward
      Anonymous Coward

      Re: Nest....

      I'm not sure about Nest, but quite a few shiny IoT SOHO DVR systems sell the off-site cloud storage as their continuing revenue stream; the small print says something like "give us £30 per month or your system will not work".

      I found a nice Netatmo (French!) IoT biometric indoor motion-sensitive DVR that records to local microSD (with a free option to additionally dump to Dropbox). The biggest problem for me with the Netatmo is that the DVR will only boot up with their specific Netatmo 5V micro-USB PSU wall-wart; I had planned to supply it via a USB UPS.

  13. Anonymous Coward
    Anonymous Coward

    Heroku

    Interesting for me is Heroku. Now we all know that Heroku lives on AWS, no issue there, but they reacted by taking their whole management API down. Which meant that nobody could scale to meet the change in demand or reconfigure to route elsewhere. And they were down for a lot longer than AWS were.

    Tested and found wanting, in my view.

  14. Roj Blake Silver badge

    Rule 1

    Rule 1 for cloud providers: make sure your status page is on someone else's infrastructure.

    Poor show AWS, poor show.

    1. BagOfSpanners

      Re: Rule 1

      Most of the status pages I've seen seem to be run by the marketing department rather than directly linked to the service they claim to be monitoring. They generally don't admit there's a problem until several hours after it started, and use weasel words to minimise the apparent size of the problem. I don't trust them.

    2. Anonymous Coward
      Anonymous Coward

      Re: Rule 1

      Or, at the very least a separate set of isolated infrastructure inside your cloud property. This outage is a bit more than egg on the face to be blamed on power. It is obvious they use the same infrastructure for their dashboard and that is plain stoopid.. They bought this one through design flaws and I wouldn't blame anyone for getting off of AWS as soon as they can. I don't precisely know what happened here - could be a simple DNS issue, or something more severe - but it shows the AWS infrastructure design is very flawed. Hate to say it, but there should be a watchdog org that verifies public cloud provider infrastructure. So much rests on that infrastructure that something needs to be done. I once heard a budding cloud entrepreneur tell about how he started his cloud storage business in his garage. Oh, right, I bet he never told his customers where his secure and standards-conforming data center was.

  15. Artaxerxes

    Good job we got rid of all that annoying kit in our datacenters in favour of the ever reliable Cloud isn't it?

    Oh. Oh shit.

    1. Solmyr ibn Wali Barad

      Sometimes it happens for quite anecdotal reasons. At one company it was because the MD did not like the humming noise coming from their server room just over the corridor.

      So the IT department got an order to scrap the servers and move everything to the cloud. They weren't happy. In a true BOFH spirit they contemplated splashing a few grand on soundproofing and setting up a company à la "Cloudy McCloudface Ltd." to issue invoices. But eventually they chose to be good sports and went for a reliable colocation company. Which happened to have "Certified Cloud Solutions Provider" prominently written on their marketing brochure.

  16. MrXavia

    Hopefully a wake up call for connected device makers, don't make your tools reliant on the internet!

    Even if you rely on the cloud, make sure they can run by talking directly to a local app!

    1. kain preacher Silver badge

      Hopefully a wake up call for connected device makers, don't make your tools reliant on the internet!

      Wake up call, ha. More like they hit the snooze button. The wake up call should have been when video game makers made single player mode require an always-on internet connection and made the game save to the cloud. I believe Gears of War players ran into issues where the servers crashed for days and you could not play.

      One of the congress critters actually bought a clue and the DMCA was updated for games that require an always-on internet connection for single player mode.

      As of October 2015, always-online games with single player modes that now have had dead servers for six months and longer are now exempt from DMCA prohibitions on circumventing copyright protection

  17. Stuart 22

    "Why do we need internet connected light bulbs?

    Most commonly to be 'in when you're out'"

    Yea, well, I use time controllers. Do the job perfectly barring power failure when you won't have a light anyway. Their only issue is re-setting the time (and finding the unintuitive instructions), which I find more intellectually challenging than setting up a new cloud instance ... but then I'm a born masochist.

    1. Lotaresco

      "Their only issue is re-setting the time (and finding the unintuitive instructions), which I find more intellectually challenging than setting up a new cloud instance"

      IKEA mechanical timers. Cheap, effective, no need for a manual. Other purveyors of 1970s technology exist.

      1. Peter Gathercole Silver badge

        @Lotaresco

        Chances are the clock in a mechanical timer is an electric one. When the power goes out, the clock stops. When it comes back on, unless you are exceedingly lucky and have had a multiple of 12 hour (or 24 hour if you have a 24 hour clock) outage, the clock will be wrong and you will need to set it.

        But it's usually a matter of turning it until it's correct again.
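        The arithmetic behind that can be put as a tiny sketch, assuming the timer simply stops for the length of the outage and resumes where it left off (the function name is made up for illustration):

```python
def clock_error_hours(outage_hours: float, dial_hours: int = 24) -> float:
    """Hours the dial lags real time once power returns (0 means still correct).

    Assumes the clock stops dead during the outage and restarts from the
    same position, so only the remainder modulo the dial length matters.
    """
    return outage_hours % dial_hours


# A 30-hour outage on a 24-hour dial leaves the timer 6 hours behind.
print(clock_error_hours(30))
# A 36-hour outage on a 12-hour dial happens to land back on the right time.
print(clock_error_hours(36, dial_hours=12))
```

        Anything other than an exact multiple of the dial length leaves a non-zero error, which is why you nearly always end up turning it by hand.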

        1. DropBear Silver badge

          Re: @Lotaresco

          "In when you're out" - I saw some funny coloured LED lamps on sale just the other day; I was wondering why they seem to have that strange reflector-like shape until it dawned on me these are supposed to project a randomly flickering muted RGB lightshow onto your wall - not entirely unlike the one made by a turned-on TV set in a room - for the benefit of "legally-challenged uninvited guests"...

        2. Doctor Syntax Silver badge

          Re: @Lotaresco

          "Chances are the clock in a mechanical timer is an electric one."

          Ha. A few weeks ago the clock on the CH boiler had a little problem. It would run until it came to the start of an on period. It's an electric clock so I'd expect it to work unless the mains went off but with mains behind it why should it fail like this? Replaced under warranty and I took the old one apart. It's a battery operated quartz clock with the battery charged, as far as I could see, by a diode & dropper from the mains. Presumably when the battery goes on the blink it doesn't have quite enough voltage to trip the switch.

          As a price for the clock not stopping when the mains goes off - I can live with that as it would be just another clock to reset - I have a timer that fails to work at all after a few years service.

  18. tentimes

    Happens so often

    I deliberately didn't choose AWS because they have a long history of catastrophic outages. Nice to see it is still going. I made the right choice with the right redundancy.

  19. phuzz Silver badge
    Facepalm

    One of our customers does use AWS, but only as a third backup behind their two physical datacenters.

    The problem is, their main website uses a whole bunch of javascript widgets, some of which, yup, you've guessed it, were hosted in the failed region of AWS. So, their home page would only load up to a certain point, and then would sit, waiting for the third party javascript that would never load.

    So, even if, as the sysadmins, you've done all you can to make the site redundant and fault tolerant, never underestimate the opportunity for the developers to fsck it right up.

    1. Doctor Syntax Silver badge

      "a whole bunch of javascript widgets, some of which... were hosted in the failed region of AWS"

      A whole extra area of cloud-based fail waiting to happen.

      1. Missing Semicolon Silver badge

        Yes, why do web authors insist on loading JS from the 4 corners of the internet instead of just copying the files to the site's server and loading it all from the same domain?
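        For anyone wanting to see how exposed their own pages are, here is a rough sketch using only the Python standard library. It lists the hosts a page pulls scripts from that are not the site's own domain. The function and class names are invented for this example, and it only looks at plain script tags in the static HTML, not at anything injected later by other scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ScriptHostFinder(HTMLParser):
    """Collects hosts of <script src=...> tags that differ from the page's own host."""

    def __init__(self, own_host: str):
        super().__init__()
        self.own_host = own_host
        self.external_hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if not src:
            return  # inline script, nothing is fetched
        host = urlparse(src).netloc
        # Relative URLs have an empty netloc and are served from our own domain.
        if host and host != self.own_host:
            self.external_hosts.add(host)


def third_party_script_hosts(html: str, own_host: str) -> set:
    """Return the set of foreign hosts this HTML loads scripts from."""
    finder = ScriptHostFinder(own_host)
    finder.feed(html)
    return finder.external_hosts
```

        Every host it reports is one more outage that can stall the page; copying those files to the site's own server removes the dependency.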

        1. Doctor Syntax Silver badge

          "why do web authors insist on loading JS from the 4 corners of the internet"

          In fact, why do they insist on using so much of it?

  20. pauleverett

    the trouble with relying on cloud services is...

    Could go down...

    Probably will go down...

    Guaranteed to go down...

    All this is known, but people choose to ignore the realities of cloud storage.

    I know it is all super convenient. But if you use cloud storage as the foundation of your livelihood, it's just a matter of time before you get burned alive.

  21. lukewarmdog

    Question re the IoT devices.. do they fail open or closed?

    If you were cooking a delicious roast dinner for the in-laws in an attempt to impress and had set it all up from work so that it would be ready when you got home.. would you get home to a cooked dinner or just a freezing cold flat?

    Same with lightbulbs, if they were on would they remain on?

    Wonder if any internet connected burglars noticed and figured it would be a great time to go on a bit of a spree knowing a whole bunch of security systems were suddenly not working. And whether the insurance would still pay up in that situation.
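    What an explicit answer to the fail open/closed question might look like, as a minimal sketch. The device names and the policy table are purely hypothetical, not taken from any real product; the point is only that the behaviour on losing the backend is a deliberate design decision rather than an accident.

```python
from enum import Enum


class OnCloudLoss(Enum):
    HOLD_LAST_STATE = "hold"  # keep doing whatever it was doing
    FORCE_OFF = "off"         # drop to a safe 'off'
    FORCE_ON = "on"           # drop to 'on'


# Hypothetical per-device policy; a real product would have to choose these deliberately.
POLICY = {
    "light": OnCloudLoss.HOLD_LAST_STATE,
    "oven": OnCloudLoss.FORCE_OFF,
    "thermostat": OnCloudLoss.HOLD_LAST_STATE,
}


def state_after_cloud_loss(device: str, last_state: bool) -> bool:
    """Return whether the device should be on once the backend is unreachable."""
    policy = POLICY.get(device, OnCloudLoss.FORCE_OFF)  # unknown devices fail off
    if policy is OnCloudLoss.HOLD_LAST_STATE:
        return last_state
    return policy is OnCloudLoss.FORCE_ON
```

    Under this made-up table the lights stay as they were and the oven switches off, so you would come home to a lit flat and a cold roast.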

  22. Archtech Silver badge

    Excelsior!

    As the old saying goes, "To err is human; but to really foul things up you need a computer".

    To that we can now add, "And if you want to really foul things up for millions of people worldwide, you need the cloud".

    Please, never tell me that the days of progress are over.

    1. Doctor Syntax Silver badge

      Re: Excelsior!

      "never tell me that the days of progress are over."

      They're not. The direction of progress, however...

  23. Archtech Silver badge

    World's tallest house of cards?

    "Other IoT devices were also impacted and caused some rather surreal scenarios for their owners. We're told that cloud-connected lightbulbs, thermostats, and similar gear, stopped working properly as their backends fell over".

    This gives a whole new meaning to "building on sand". Doesn't anyone take Engineering 101 any more? What do all those fools in "risk management" departments do all day long? Oh yes, that's right - bend over when the Big Cheese says, "We're doing it because it's cheaper and all the other bosses are doing it so I don't want to look old-fashioned and out of touch".

    "I can't change my mouse sensitivity because @razer @razersynapse servers are down..."

    "Joys of the @internetofshit - AWS goes down. So does my TV remote, my light controller, even my front gate. Yay for 2017".

    Funniest things I've heard since 1993, when Grady Booch told a conference about how some dinner guests of his spent an uncomfortable few minutes being ignored on the front porch. Seems it's not good design to wire up your front door bell through a server that sometimes goes down unnoticed...

    1. Doctor Syntax Silver badge

      Re: World's tallest house of cards?

      Doesn't anyone take Engineering 101 any more? What do all those fools in "risk management" departments do all day long?

      What's he talking about? Engineering? Risk management? Why does he think we need those? We've got MBAs.

  24. BagOfSpanners

    I thought S3 was a worry-free storage option

    Having recently emerged from an AWS exam, I thought that one of the selling points of S3 was that data is automatically replicated across multiple availability zones within a region without the customer needing to worry about the details. I also thought that the availability zones within a region were highly isolated from each other (e.g. separate data centres in different cities). I guess I'm wrong about at least one of those things.

    At least the problem was largely fixed the same day. When problems occur within my employer's on-premises infrastructure, it usually takes several days to get it fixed, including a phase during which even the existence of the problem is denied.

    1. Anonymous Coward
      Anonymous Coward

      Re: I thought S3 was a worry-free storage option

      This is one of the reasons I chose IBM Cloud Object Storage to ensure that my data is Geo dispersed using erasure coding techniques. The S3 blunder would not happen. IBM Bluemix and Cloud Object Storage all the way for me and my clients!

  25. Anonymous Coward
    Anonymous Coward

    Cant change mouse sensitivity?

    Lol what?

    Since when does a peripheral need to contact a server to configure itself?

    1. Down not across Silver badge

      Re: Cant change mouse sensitivity?

      Can't speak for mice, but Logitech Harmony remotes are apparently only configurable via cloudy stuff. Shame really as I would've purchased some. Ah well I'll stick with my old Marantz RC 5000.

  26. Allan George Dyer Silver badge

    If only AWS read xkcd...

    https://xkcd.com/908/

    1. Anonymous Coward
      Anonymous Coward

      Re: If only AWS read xkcd...

      Wrong XKCD reference. To reference a prior commentator, sure AWS will replicate data across availability zones in a region. In this case, the entire region had issues, so it was up to the customer to span multiple regions. In other words, this one applies:

      https://xkcd.com/1737/
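      Spanning regions from the customer side could look roughly like this sketch. The `fetch` callable is a stand-in supplied by the caller (an assumption for illustration, not any real SDK call), and it presumes the same object has already been copied to every region listed.

```python
from typing import Callable, Sequence


def read_with_region_fallback(
    key: str,
    regions: Sequence[str],
    fetch: Callable[[str, str], bytes],
) -> bytes:
    """Try each region in order and return the first successful read.

    `fetch(region, key)` is a caller-supplied function that returns the
    object's bytes or raises on failure.
    """
    errors = []
    for region in regions:
        try:
            return fetch(region, key)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{region}: {exc}")
    raise RuntimeError("all regions failed: " + "; ".join(errors))
```

      The read path is the easy half; keeping the copies in step across regions is the part the customer has to pay for and maintain.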

  27. Patched Out
    Mushroom

    New acronym

    CLOUD - Control Lost On User Devices

  28. Mage Silver badge
    Coat

    But Cloud is better than In House!

    Yes, loads of companies do in house stuff badly or don't bother with resilience or disaster planning etc. So the argument is that the Cloud is better.

    Maybe from the point of view of the users in one company cloud is better than In house, perhaps you can't then order from one supplier when their in house IT falls over if they don't use cloud.

    But the "cloud" could mean that no-one can order from anyone. Instead of just RBS or HSBC being down, it's all banks, mobile billing (so no mobile calls due to no credit, PAYG or Bill Pay), no ATMs, no POS, no card payments ...

    Maybe fantasy today, but not as more companies outsource to cloud EVEN if it's done better than in house. Not as we head toward various mono cultures. It won't be a cyber war, but a Friday afternoon patch to Edge Routers, or load balancing, or DNS servers, or database etc.

    The famines in the 19th Century (not just in Ireland) were due to mono culture.

    The very concept and "savings involved" of Cloud Computing is heading all of the first World to a cyber potato event horizon.

  29. M7S

    Wry smile

    There was a lecture at Gresham College the other day (free, open to public on a wide variety of topics, worth a look around the website as transcripts etc are put online) titled "Living Without Electricity" about our dependency on power for normal life, communications etc

    https://www.gresham.ac.uk/lectures-and-events/living-without-electricity

    The transcripts in this case didn't go up for a couple of days, probably as the address is

    https://s3-eu-west-1.amazonaws.com/content.gresham.ac.uk/data/binary/2413/2017-02-28_RogerKemp_Electricity_.docx

    I assume the Prof might have found this ironic.

  30. voadenrsg

    Too big to fail ?

    When something is too big to fail maybe it's too big ?

    Haven't we been here before... think Banks ?

  31. Balefire

    Lucky Me

    Thankfully I was told by the SSE meter engineer who came round to fit a "Smart" meter (one I hadn't asked for) that there wasn't enough room where the old meter was. He was surprised when I told him that I didn't want one anyway.

  32. oneeye

    HAL 9000 says " I'm sorry Alexa,.....but I can't do that"

  33. IoTedge
    FAIL

    IoT Needs Edge Computing

    This is exactly why IoT infrastructure cannot be cloud-only. You need cloud combined with on-premise edge computing for security, speed and guaranteed uptime.

  34. Richard Pennington 1
    Pirate

    Your first lesson on Single Points of Failure.

    If you must insist on loading up your life with Insecurely Designed Internet of Things (IDIoT) devices, don't be surprised when a single failure in the Cloud wipes out your entire existence.

