Let's hope these were duplicate IPv4 addresses. Duplicate IPv6 would be unforgivable.
Google gave some of its cloud customers a rotten weekend by breaking a bunch of virtual machines. As detailed in this incident report, the company first noticed problems at nearly beer o'clock on Friday afternoon, June 15th, Pacific Time – just after midnight on Saturday for European users and early Saturday morning in Asia. The …
The call-out rate for cloud-based services is determined by a risk multiplier I call the "Cloud Fuctor".
The more services one has "in the Cloud", the higher the "Cloud Fuctor", which will always be ≥ 1.0.
The risk can be mitigated by distributing across multiple providers.
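A minimal sketch of that arithmetic, purely to illustrate the idea: the linear growth of the multiplier, the baseline call-out rate, and the assumption that separate providers fail independently are all invented for this example, not taken from the comment above.

```python
# Hypothetical "Cloud Fuctor" model: a risk multiplier, always >= 1.0,
# that grows with the number of services hosted in the cloud.
# The 0.1 growth rate and all figures below are made up for illustration.

def cloud_fuctor(num_cloud_services: int, growth: float = 0.1) -> float:
    """Risk multiplier; >= 1.0 and rising with each cloud-hosted service."""
    return 1.0 + growth * num_cloud_services

def callout_rate(baseline_per_year: float, num_cloud_services: int) -> float:
    """Expected out-of-hours call-outs per year under the model."""
    return baseline_per_year * cloud_fuctor(num_cloud_services)

def total_outage_probability(per_provider: float, num_providers: int) -> float:
    """Distributing across providers: a total outage needs them all to fail
    at once, so (assuming independence) the probability multiplies down."""
    return per_provider ** num_providers

print(callout_rate(baseline_per_year=4, num_cloud_services=10))      # 8.0
print(total_outage_probability(per_provider=0.01, num_providers=2))  # 0.0001
```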
The risk can be mitigated by building and managing your own sh*t. Letting others manage your stuff just makes no sense.
Well, that depends on how many bits of kit you would have to manage, and how many people you have who are available and competent to manage them.
The risks involved in hosting your own are that you end up coming in on Monday to find everything offline because even your monitoring system has gone down. Then you have to figure out what has gone wrong where and drive round the country fixing bits...
There are risks with cloud deployments, but those risks can be measured and compared with 'in house' risks... and the balance, for some/many may well be in favour of the cloud.
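As a rough sketch of how that measurement might look, with availability figures invented for the example (real SLA numbers and your own in-house failure history would go here):

```python
# Compare expected annual downtime for a cloud SLA vs a self-hosted estimate.
# Both availability figures are hypothetical, chosen only for illustration.

HOURS_PER_YEAR = 24 * 365

def downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR

cloud_sla = 0.999   # a "three nines" SLA
in_house  = 0.995   # a guess for a small self-managed estate

print(f"cloud:    {downtime_hours(cloud_sla):.1f} h/year")  # ~8.8
print(f"in-house: {downtime_hours(in_house):.1f} h/year")   # ~43.8
```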
Cloud failures will slowly increase in number and will take longer to be solved. So far everything is under control because those maintaining and managing the cloud infrastructure are the brilliant minds who designed it and set it up. Come the next generation, and the one after, things will be in a dire situation when those called on to support this massive infrastructure have to rely on documentation, good or bad. Yes, there will be automation with scripts running the show, but when things go wrong it will all become tricky.
"will slowly increase in number and will take longer to be solved."
Interesting point. There is presumably a "Fault Surface", similar or maybe identical to the malware "attack surface", that expands as interfaces become more "flexible" and complex. The problem is that the intelligence of those managing the interfaces doesn't expand to match the increasing size of the Fault Surface.
Back in the 1960s, as we discovered that implementing simple ideas on computers was anything but simple, we used to say that FLEX was a four-letter word. Brace yourselves, cloud people: we are probably going to be flexed repeatedly in the coming years.
Once a month: An engineer makes a mistake and everything's dead on a Friday afternoon.
A law needs to be passed that anything critical must not be done on a Friday afternoon, only on a Moanday morning (or, even better, a Tuesday morning).
This would allow world+dog techies to collect their beer at pub o'clock and not have to waste their Friday afternoons wrestling with issues some daft ******* caused in the first place.
With servers as pets, you could only fuck them up one at a time, and you might be notified of the problems you'd caused before you fucked up many of them.
With the automation tools for cloud servers you can fuck up the entire lot in one fell swoop. This is a fantastic productivity gain.