Dublin-based energy supplier ESB Networks has dismissed Amazon's claim that a bolt of lightning caused the chain of events that led to its web outage last weekend. The US giant today entered the sixth day of trying to resolve issues created by the crash at its Dublin data centre. First it blamed a lightning strike on a …
I was hit by this - one of my two EC2 instances stopped responding. I had good enough backups to have a server up and running in one of the other two availability zones pretty quickly, and switched the IP over, but the dead volume still isn't fully recovered now - the 'recovery' snapshot came online yesterday, but when I attached it to the new instance it couldn't be read.
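For what it's worth, the "switched the IP over" step can be scripted rather than done by hand in the console. Here's a minimal sketch of re-pointing an Elastic IP at a standby instance in another availability zone; it assumes you pass in a boto3 EC2 client, and the allocation/instance IDs are made-up placeholders, not real resources:

```python
def fail_over_elastic_ip(ec2, allocation_id, standby_instance_id):
    """Re-point an Elastic IP at a standby instance.

    `ec2` is assumed to be a boto3 EC2 client (boto3.client("ec2")).
    AssociateAddress with AllowReassociation=True detaches the address
    from the dead instance (if it's still associated) and attaches it
    to the standby in one call.
    """
    return ec2.associate_address(
        AllocationId=allocation_id,        # e.g. "eipalloc-0abc1234" (placeholder)
        InstanceId=standby_instance_id,    # e.g. "i-0standby5678" (placeholder)
        AllowReassociation=True,
    )
```

DNS-based failover would also work, but moving the Elastic IP avoids waiting out client-side TTLs, which matters when you're trying to get back up "pretty quickly".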
To be fair, the AWS team have been responding very quickly on the support forum - but they had a power failure which shouldn't have happened, their systems didn't handle it as well as they should have, and then, to cap it all, a rogue garbage-collector ate chunks of our snapshots as well. Trying to access the mangled wreckage will sometimes lock up your virtual machine, for no apparent reason, and ... oh well. At least it's cheap. Maybe it needs to be, so you can cluster half a dozen expendable machines to have something that actually works most of the time...
Most likely they had never actually done a real power-fail test on the installation.
I remember years ago watching a test of the backup generators at a shopping centre in Oxford. The switchover was quite spectacular, since there had been a wiring error resulting in the 440V phases being connected incorrectly. Quite a decent bang and lots of smoke - none of which would have shown up in a simulated power failure.
Perhaps this is Amazon's new invention…
…The Interruptible UPS.
(Okay okay, clearly not everyone there is that daft, but *someone* in Dublin was dimwitted enough to install such a thing.)