AWS's S3 outage was so bad Amazon couldn't get into its own dashboard to warn the world

Tuesday's Amazon Web Services mega-outage knocked offline not only websites big and small, by yanking away their backend storage, but also knackered apps and Internet of Things gadgets relying on the technology. In fact, the five-hour breakdown was so bad, Amazon couldn't even update its own AWS status dashboard: its red …

FAIL

Major irony alert

"Ironically, outage monitoring sites DownDetector and isitdownrightnow.com were also offline, thanks to the issue."

30
1
Anonymous Coward

Re: Major irony alert

They might need to change the domain name to "itisdownrightnow.com". ;) FYI, isup.me failed over to AWS UK infrastructure, so I used it to monitor the half-dozen other outage monitoring sites that relied on a single S3 bucket region. ;P

5
0
Silver badge

Maybe I'm having a sense of humour failure, but those quoted Tweets can't be serious? Someone complaining they couldn't change their mouse sensitivity? Or turn off their oven?

I fear for this generation.

33
3
Facepalm

I'm also interested in the veracity of these claims. Razer's cloudy settings are an abomination. My familial Razer Deathadder mouse owner is out with his girlfriend, so I'm sitting here alone trying to disprove it.

3
0

The generation isn't as much the problem as the idiots who built those items to not work without internet connectivity.

Yea, my ISP went tits up so I can't bake a cake. WTF?

48
1
Silver badge

@Phil Kingston

My Wacom Intuos tablet had a driver update (on Windows) that totally changed the settings GUI and gave me a cloud based parameters storage and loading 'facility'. The idea is that I can more easily manage its settings and have them backed up in case of accidents and also, of course, easily migrate my tablet between different computers.

I uninstalled the drivers and reinstalled from the CD then blocked Wacom software at the firewall.

27
0

Re: @Phil Kingston

We used to have these things called ini files (or config files or whatever) that you could COPY from one machine to another. Later on, you could even put them onto cloudy storage if you needed them elsewhere and couldn't be arsed carrying a thumb drive.

31
0
Silver badge

AWS goes down. So does my TV remote, my light controller, even my front gate.

Rightly or wrongly, all I can think when I see this is 'Serves you right.'

45
1
Silver badge

" Someone complaining they couldn't change their mouse sensitivity?"

From the Razer site:

Razer Synapse is our unified configuration software that allows you to rebind controls or assign macros to any of your Razer peripherals and saves all your settings automatically to the cloud. No more tedious device configurations when you arrive at LAN parties or tourneys, as you can pull them from the cloud, and get owning right away.

6
0
Silver badge

"Serves you right."

I can't help but think you're being a bit harsh here - for instance, ever since the Dawn of Time, if you don't feel like using 156345346 different remotes the only universal ones that are sold anywhere are basically the Logitech Harmony series; and yes, they come with a cloud-only config tool, whether you like it or not. Yes, they do _work_ without being online*, but they cannot be reconfigured. Believe me, I would never willingly choose such a setup but what choice do I really have...?

* I have no idea whether a similar issue is at hand in this particular case, or if this remote wasn't working at all...

1
0
Silver badge

All together now...

Told You So!

12
0

It's been a few years since I had / used my Logitech Harmony remote, but back then the tool to configure the remote was online, as you say. However, who is changing macros on their remote on a daily basis? Once it's set up you normally only need to change the config when adding a new device.

Agree with the overall sentiment, but as the Harmony remotes rely on an enormous database of known devices it sort of makes sense for it to be online. (As the database gets updated daily)

3
0

Re: @Phil Kingston

Indeed this 'instead of' methodology that many IoT thingies have is a serious worry.

The IoT aspects should be a layer on top of, not instead of, a working self-contained system.

App settings - sure, back them up so you have them preserved or migratable to a new system. But don't have them as the sole storage for settings.
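A rough sketch of that "layer on top" idea, for illustration only: settings are written to a local file first, and the cloud copy is a best-effort backup that is allowed to fail. The class name `LocalFirstSettings`, the `backup` callback and the `failing_backup` stand-in are all hypothetical; no vendor's software is known to work this way.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


class LocalFirstSettings:
    """Settings live in a local file; the cloud copy is only a best-effort backup."""

    def __init__(self, path: Path, backup: Optional[Callable[[dict], None]] = None):
        self.path = path
        self.backup = backup  # hypothetical uploader; may raise OSError when offline
        self.data = json.loads(path.read_text()) if path.exists() else {}

    def set(self, key: str, value) -> bool:
        """Save locally first; return True only if the cloud backup also succeeded."""
        self.data[key] = value
        self.path.write_text(json.dumps(self.data))
        if self.backup is None:
            return False
        try:
            self.backup(dict(self.data))
            return True
        except OSError:
            # Cloud is down: the device keeps working from the local file.
            return False

    def get(self, key: str, default=None):
        return self.data.get(key, default)


def failing_backup(_settings: dict) -> None:
    """Stand-in for a cloud upload attempted during an outage."""
    raise ConnectionError("backend unreachable")


with tempfile.TemporaryDirectory() as tmp:
    settings = LocalFirstSettings(Path(tmp) / "mouse.json", backup=failing_backup)
    synced = settings.set("sensitivity", 1800)          # backup fails, local save works
    reloaded = LocalFirstSettings(Path(tmp) / "mouse.json")
    restored = reloaded.get("sensitivity")              # read back from disk, no network
```

The point of the design is the order of operations: the local write happens before, and independently of, anything that touches the network.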

4
0
Silver badge

The last voice command ever

"Alexa, turn off all the servers."

48
0

Re: The last voice command ever

Alexa had her own problems.

Time, weather reports and radio were fine but couldn't actually play music.

She'd read back to show she'd understood but then nothing.

Silly Amazon.

8
0

Re: The last voice command ever

I had exactly the same, which seems odd if it's only one region in the US that failed. Do they not have local copies of Amazon Prime music in the UK?

1
0
Silver badge

Re: The last voice command ever

> Do they not have local copies of Amazon Prime music in the UK?

Don't be silly. If they stored anything here they might have to pay tax. ;-)

5
0
Silver badge
Trollface

Re: The last voice command ever

"She'd read back to show she'd understood but then nothing."

Awww, people today have no sense of humour. It should have started singing "Daisy, daisy..."

21
0
Anonymous Coward

Re: The last voice command ever

Silly Amazon?

No, Silly you.

Why did you waste your money on it in the first place?

7
2
Silver badge
Alien

Big Bang 2.0

Alexa, reboot the universe!

1
0

Re: The last voice command ever

"If you want a thing done well, get a couple of old broads to do it."

-- Bette Davis

If neither of those broads is named Alexa. I just now asked my Echo if it was all right, and it responded, "Great! I'm ready to stroke your man-parts or whatever!" Um, okay.

I think I speak for all of us when I say that my sympathies go to the poor soul who couldn't order a coffee. OH. MY. GOD. S/he's probably still shaken by the experience.

There's a reason why this nonstop idiocy is called a "first-world problem." The reference isn't meant to be self-congratulatory or a compliment.

3
0
Silver badge
Flame

I love the future

This is great : a cloud service falls down so hard it can't even notify customers it is down. And, way down the line, thermostats can no longer be changed, mouse settings are frozen and God knows what else.

This is absolutely perfect and should happen a lot more often until people finally get fed up and demand things that work ALL THE DAMN TIME, like they used to before this happy-happy age of sharing everything with the NSA whether you want to or not.

IoT ? Not while I still have a functioning brain, thank you very much. My light switch does not depend on the Internet and never will.

53
3

Re: I love the future

I am rather assuming that IoT is an abbreviation for "IdioT"... right?

10
1
Silver badge
Trollface

Re: I love the future

So, um, if I tell you there is a Python IoT home automation controller called "idiotic", will you be surprised...?

1
0
Flame

Re: I love the future

“IoT ? Not while I still have a functioning brain, thank you very much”

I agree with the sentiment 100%. However I fear IoT will be foisted onto the unsuspecting public, in various guises, whether they want it or not. ‘Smart’ meters being a prime example, which are presently being aggressively deployed by energy companies in the UK.

9
0
Silver badge

Re: I love the future

people finally get fed up and demand things that work ALL THE DAMN TIME

My retro-style analog light switches, coffee maker, thermostat, range, fridge, garage door opener, laundry equipment, etc. all functioned perfectly throughout the Great Outage. They also had the additional benefit of being half the price of their (dis)connected brethren.

I do wish that the cat had gone offline for a while, though. That actually would have been kind of nice.

3
1

Re: I love the future

*My light switch/ [smart meter] does not depend on the Internet and never will.*

... but I anticipate that I will soon be penalised (and eventually prosecuted) by British Gas for that intransigence!

0
0
Anonymous Coward

Re: I love the future

"... ‘Smart’ meters being a prime example, which are presently being aggressively deployed by energy companies in the UK."

Not aggressively enough !!!

I have been with 2 Electricity/Gas Suppliers that have 'shouted' from the rooftops their wonderful 'Smart' meter functionality ..... only to be told that they could not fit a 'Smart' meter due to the existence of Solar Panels (new install covered by FITS) or not available in your area !!!

How can you design a Smart Meter that is unable to cope with self-generation of power, when solar panels are everywhere in the UK and not exactly leading-edge technology?

The 'Not available in your area' was only discovered AFTER I had changed Supplier !!! :(

[I was assured that I could get a Smart Meter when I queried about it and mentioned the Solar Panels !!!]

0
0

It's terrifying that we've gone from a network designed to survive nuclear attack without loss of communication, to a situation whereby a single company's IT failures affect tens of millions of people.

Whilst you can argue that the majority of the disruption is, in the scheme of things, minor, the IoT is pointing towards a lot more serious issues further down the line. Imagine what could happen if say, self driving trucks relied on AWS for back end updates of road closures, and due to a crash, couldn't be notified of temporary road closures, nor updated to signal that they should park up.

43
0
Silver badge

Damn tourists. :)

1
0

Bad, but not that bad

Any critical service like that should be built with multi-region availability. AWS has 14 regions to choose from and easy DNS features for latency and healthcheck based routing.

Don't get me wrong, this outage was annoying and Amazon's multiple AZs per region are meant to prevent an entire region falling over. For us it gummed up a bunch of batch jobs we had running and we lost time rejiggering them to not lose data. But our frontend is multi region, clients got directed to west coast and didn't miss a beat.
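For illustration, the health-check-based routing described above can be sketched on the client side as "try regions in preference order and use the first healthy one". This is not AWS's DNS routing feature itself, just the same idea in miniature; `pick_region`, the preference order and the simulated health check are assumptions made up for the example.

```python
from typing import Callable, Iterable, Optional


def pick_region(regions: Iterable[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first region, in preference order, whose health check passes."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except OSError:
            # A health check that errors out counts as unhealthy.
            continue
    return None


# Preference order is an assumption for this sketch; the names are just labels here.
REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]


def simulated_health(region: str) -> bool:
    """Simulated outage: the preferred region is unreachable, the rest respond."""
    if region == "us-east-1":
        raise ConnectionError("region unreachable")
    return True


chosen = pick_region(REGIONS, simulated_health)
```

A real deployment would also need the data to be present in the fallback region, which is the expensive part the sketch leaves out.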

11
0
Gold badge
Unhappy

"Any critical service like that should be built with multi-region availability. "

Should have. But let's take a peek inside a dev's mind after it happened. Something like this....

"But, but the time to market was tight and the protocols were complex and AWS hardly ever fails and besides it was going to cost extra and my friend told me no one else does it."

I think that just about sums up most of the people who did this.

BTW in real engineering there is the idea of a Licensed engineer. If you design a building and it's built as you specify (IE all materials and procedures followed) and it falls down below design loads it is your fault.

22
0
LDS
Silver badge

"network designed to survive nuclear attack"

Exactly because it was designed as a distributed system with different paths, and not as a single monolithic architecture putting all the eggs in one basket.... but it was designed by scientists to address an issue, not by MBAs trying to understand how to reinstate big monopolies and extract as much money as possible from users...

18
0
Silver badge
Mushroom

Re: "network designed to survive nuclear attack"

And now we've got networks where some faults have a tendency to go nuclear. How quaint.

Granted, it is rather hard to account for a possibility of getting unwanted positive feedback somewhere in the system that'll lead to catastrophic overamplification. Especially if you can control only a small part of the system.

But just for fun, I'm going to snap into the old git mode and blame it on whippersnappers having no experience with op-amps these days.

6
0
Anonymous Coward

Re: "Any critical service like that should be built with multi-region availability. "

I think that was more the dev's boss's mind rather than the dev's - most devs would not have the luxury of being consulted on things that potentially involve big cash (choice of cloud, backups, hybrid solutions, failsafes); they just get told it's this solution, work with it. In very few places do devs have a decent degree of input into solutions; mainly they are just treated as coders for hire, with their thoughts / opinions ignored.

AC - obv!

10
0
Silver badge

The Kaikoura earthquake in New Zealand knocked out communications. Trucks were stranded between slips, a train was too. In those cases the human driver realised the problem and applied the brakes.

Kaikoura is still cut off from the North by massive slips on the road/rail line. Initially various mapping and route finding apps were not updating to the alternative inland route bypassing Kaikoura so motorists and trucks were being directed down the blocked coastal road. The police had to permanently man a checkpoint and turn vehicles around with new instructions on the alternative route.

So we already know the sort of problems an internet reliant automatic vehicle would face.

Add in that significant parts of NZ have no cell phone coverage, too remote, mountainous, unpopulated to make it economic. Woman caver near Nelson recently fell and injured herself. No cell phone coverage made getting a rescue a problem. Emergency services have radios so once in place it worked, but we are so reliant on cellphones now.

Isn’t the Met moving from radios to a cellphone based system? . . .

11
0
Silver badge

Re: "Any critical service like that should be built with multi-region availability. "

"BTW in real engineering there is the idea of a Licensed engineer."

Who needs an engineer when you've got an MBA?

10
0

Re: "network designed to survive nuclear attack"

Whippersnappers? Op-amps?

One of my devs once said to me "I'm going to use an Operational Transconductance Amplifier".

I asked him if he meant a 'valve'.

2
0

eh, feller, nukes kill op-amps. get off my lawn!

while tubes, on the other hand, will just keep on working.

4
0
Silver badge

Re: "Any critical service like that should be built with multi-region availability. "

"Should have. But let's take a peek inside a dev's mind after it happened. Something like this...."

Another common failure with various applications is the assumption that if an internet connection is available at the beginning of a session, then it's there forever.

Broadband is a lot more reliable than dial-up was, but things still go wrong at the client side, and as more and more work moves to mobile devices, this is a problem which won't go away any time soon.
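One way to avoid the "connected at start means connected forever" assumption, sketched here under made-up names: queue each piece of work locally and treat every send as something that may fail mid-session. `Outbox`, `flaky_send` and the `link` flag are hypothetical and stand in for whatever network call an application really makes.

```python
from collections import deque
from typing import Callable, Deque


class Outbox:
    """Queue work locally and flush it whenever the connection happens to be up."""

    def __init__(self, send: Callable[[str], None]):
        self.send = send  # hypothetical network call; raises OSError when offline
        self.pending: Deque[str] = deque()

    def submit(self, message: str) -> None:
        self.pending.append(message)
        self.flush()

    def flush(self) -> int:
        """Send queued messages in order; stop at the first failure. Returns count sent."""
        sent = 0
        while self.pending:
            try:
                self.send(self.pending[0])
            except OSError:
                break  # still offline; keep the message for the next attempt
            self.pending.popleft()
            sent += 1
        return sent


link = {"up": False}   # simulated connection state
delivered = []


def flaky_send(message: str) -> None:
    if not link["up"]:
        raise ConnectionError("link dropped mid-session")
    delivered.append(message)


outbox = Outbox(flaky_send)
outbox.submit("save draft 1")
outbox.submit("save draft 2")
queued_while_offline = len(outbox.pending)
link["up"] = True
sent_after_reconnect = outbox.flush()
```

Nothing is lost while the link is down; the messages simply wait and go out in their original order once a flush succeeds.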

6
0
Silver badge

Re: Bad, but not that bad

"Any critical service like that should be built with multi-region availability."

So if you want your light switch to work it has to be built with multi-region availability. Or is this an install-time option?

1
0
Silver badge

Re: eh, feller, nukes kill op-amps. get off my lawn!

"while tubes, on the other hand, will just keep on working."

Travelled in London much?

3
0

Re: "network designed to survive nuclear attack"

You don't still believe that old myth, do you? ARPANET was designed to allow researchers working on ARPA-funded projects to use each other's computers remotely, back in the day when computers were literally few and far between. That is all.

1
0
Gold badge
IT Angle

"You don't still believe that old myth, do you? "

yes and no.

Part of the ARPANET brief was to design a network that had no single point of failure, hence no command centre. Its "use case" was to allow remote access by other institutions to various specialized machines (DEC 10s, the ILLIAC IV supercomputer of the time).

The Bell System Electronic Switching System (ESS) had RAM and ROM elements which were designed to be rad hard as well.

1
0
Silver badge

Shows the folly of IoT

Can't even TURN OFF your oven? Talk about shitty design! If basic functionality like that is dependent on an internet connection, what happens if the manufacturer goes out of business, or simply decides that it is tired of supporting 10 year old products and takes down the cloud site it relies upon?

Too bad the general public that is suckered into buying this useless crap doesn't see news like this. I guess we need something like that to cause a fire that kills children to make the national news before it reaches the public consciousness and the deserved backlash comes against non-tech companies putting "internet" and "IoT" into their products for marketing reasons without any understanding of the consequences.

23
0
Silver badge

Re: Shows the folly of IoT

Who knows, maybe the HTML5 UI on the oven's integrated touchscreen linked to a cloud-based JQuery for its "OnClick" action for the "off" button so when that didn't load there was nothing to execute* **...

*Yeah, I know ancient fossils sometimes tell tall tales of ridiculous "clickable links" that once were purportedly integral parts of webpages and didn't need code to be executed on a click, but those are obviously just invented stories right up there with those hilarious "frames" that clearly never really existed...

** Okay, in all actuality this is probably a case of "I went out for some milk knowing I can turn off the oven with the cookies remotely from the supermarket and then it all just fahahahaileeeeed.... *sob* *sob*"

4
0
Anonymous Coward

Should be the mother of all wakeup calls but I doubt it

"Nest warned customers that its internet-connected security cameras and smartphone apps were not functioning properly – as in, weren't recording video footage – as a result of the AWS blunder."

....Where is the ability to cache for x hours / days in an offline mode???

"Other IoT devices were also impacted and caused some rather surreal scenarios for their owners. We're told that cloud-connected lightbulbs, thermostats, ovens, and similar gear, stopped working properly as their backends fell over."

....Oven burned house down cos cloud backend failed. Insurance will pay?
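The offline caching asked about above ("cache for x hours / days") could, as a minimal sketch, be a time-bounded buffer on the device itself. This is not how Nest's cameras are known to work; `LocalClipBuffer`, the 24-hour retention window and the fake clips are assumptions chosen for the example.

```python
from collections import deque
from typing import Deque, List, Tuple


class LocalClipBuffer:
    """Keep recent clips on the device, dropping only those older than the retention window."""

    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self.clips: Deque[Tuple[float, bytes]] = deque()

    def record(self, timestamp: float, clip: bytes) -> None:
        """Store a clip, then evict anything that has aged out of the window."""
        self.clips.append((timestamp, clip))
        cutoff = timestamp - self.retention_seconds
        while self.clips and self.clips[0][0] < cutoff:
            self.clips.popleft()

    def since(self, timestamp: float) -> List[bytes]:
        """Clips recorded at or after `timestamp`, e.g. to upload once the backend returns."""
        return [clip for ts, clip in self.clips if ts >= timestamp]


# Simulate 30 hourly clips with a 24-hour retention window (both figures are assumptions).
buffer = LocalClipBuffer(retention_seconds=24 * 3600)
for hour in range(30):
    buffer.record(hour * 3600.0, f"clip-{hour}".encode())
kept = len(buffer.clips)
backlog = buffer.since(25 * 3600.0)
```

With a buffer like this, a five-hour backend outage costs nothing: the footage is still on the device and can be uploaded afterwards.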

13
0

Re: Should be the mother of all wakeup calls but I doubt it

I have an internet-connected DVR with 5 cameras. I can view live or recorded footage on my smartphone wherever I am in the world.

Guess what... it records to a local 1TB hard drive, and depends only on my broadband line. I can't believe there are devices out there that cease to function without an internet connection. Surely an ISP or local phone exchange failing would be more common than an entire AWS DC failing, and manufacturers would have realised the flaw in their design by now?

11
0
Gold badge
Unhappy

"Surely an ISP or local phone provider.. more common than an entire AWS DC failing, "

Well that's the whole point.

AWS DC failures are rare enough that this bunch of companies thought they did not need to code migration into their "cloud" software.

Result. "Cloud" reverts to 1 site server farm.

Server farm fails.

System is borked.

11
0
Anonymous Coward

'internet-connected DVR with 5 cameras. it records to a local 1TB hard drive'

~ When a leading IoT supplier like Nest has zero fault tolerance, regulation is badly needed. But the US has opted out because no one else will follow, so they claim. But this is a disaster.

~ I doubt they'll even add heartbeat safety to ovens or similar appliances etc, in the event of overheating when remote smartphones lose connection etc.

~ The demise of tech journalism is lamentable. It takes security specialists on unknown blogs to research / reveal weaknesses. Meanwhile all mainstream journalists do is sing IoT's praises.
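The "heartbeat safety" mentioned above is essentially a watchdog timer: if the appliance stops hearing from its controller for too long, it fails safe on its own. A minimal sketch, assuming a hypothetical `HeartbeatWatchdog` class and a made-up 300-second timeout; clock values are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatWatchdog:
    """Fail safe: switch the appliance off if no heartbeat arrives within the timeout."""

    def __init__(self, timeout_seconds: float, now: float = 0.0):
        self.timeout_seconds = timeout_seconds
        self.last_beat = now
        self.heating = True

    def beat(self, now: float) -> None:
        """Record a heartbeat received from the remote controller."""
        self.last_beat = now

    def tick(self, now: float) -> bool:
        """Called periodically by the device; returns whether heating is still on."""
        if now - self.last_beat > self.timeout_seconds:
            self.heating = False  # stays off until someone turns it back on locally
        return self.heating


oven = HeartbeatWatchdog(timeout_seconds=300)
oven.beat(now=120)
still_on = oven.tick(now=400)    # 280 s since the last heartbeat: within the timeout
after_loss = oven.tick(now=421)  # 301 s since the last heartbeat: watchdog trips
```

The check runs entirely on the device, so it keeps working when the phone, the ISP or the cloud backend does not.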

7
0


Biting the hand that feeds IT © 1998–2018