Streaming video provider Netflix has released Chaos Monkey, its homegrown tool that's designed to boost the resilience of cloud-based applications in the bluntest way possible: by knocking them down. "Do you think your applications can handle a troop of mischievous monkeys loose in your infrastructure?" asks Netflix's Cory …
Is this by way of apology
for the state of the Netflix experience last night?
I was browsing through a few titles last night after the Olympics stopped and the performance was a shocker. If Netflix was a cyclist they would have been relegated. But maybe they were just having a monkey party?
Bootnote: At least Netflix got the chance to perform. LoveFilm doesn't even make it to qualifying.
I clicked on this story thinking it was about a new build of Ubuntu...
Looks like a really useful set of tools, but...
a) how long before they get used maliciously?
b) being free to acquire, it's a lot cheaper than the current incumbent product, Fuckwit Monkey, which has a dependency on the non-FOSS "Salary" feature. HOWEVER, if used in conjunction with said incumbent, it is potentially much more expensive from a TCO point of view than either product used in isolation
Really, this is a smart move. Any programmer or system designer knows that their code must be tested. But when it comes to distributed systems, all too often the "plan" is to write some failover code and hope it works. Even the likes of Amazon clearly get this wrong (they've already had a time or two where a localized failure caused a wider-scale cascading failure). This tool allows for a failure at a known time, while someone is watching the logs closely to make sure not only that failover works, but that it works the way it's intended to, isn't driving load up dangerously on the remaining systems (before a spare instance can be spun up), and so on.
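To put a number on the "driving load up dangerously" worry, here is a rough back-of-the-envelope sketch in Python. It assumes load is spread evenly across identical instances, which real systems rarely manage, and every name in it (`load_after_failure`, `drill_is_safe`, `headroom`) is made up for illustration rather than taken from Netflix's tool.

```python
def load_after_failure(total_load, instance_count, failed=1):
    """Per-instance load once `failed` instances are gone.

    Assumes the load balancer spreads traffic evenly (an assumption,
    not something the tool guarantees).
    """
    survivors = instance_count - failed
    if survivors <= 0:
        raise ValueError("no instances left to carry the load")
    return total_load / survivors


def drill_is_safe(total_load, instance_count, per_instance_capacity,
                  failed=1, headroom=0.8):
    """True if survivors stay under `headroom` of their capacity.

    `headroom` is a hypothetical safety margin: 0.8 means nobody should
    run hotter than 80% while the spare instance is still spinning up.
    """
    per_instance = load_after_failure(total_load, instance_count, failed)
    return per_instance <= per_instance_capacity * headroom
```

With 300 units of load and a capacity of 150 per instance, four instances survive losing one (100 each, under the 120 limit), but three do not (150 each). That is the kind of thing you would rather learn during a planned drill than at 3am.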
It doesn't bypass security, it just randomly shuts down virtual machines you already have complete control over. The worst that could happen is that a programming error allows the user of the tool to shut down too many servers.
Well, I suppose the very worst thing would be a security flaw that would allow a third party to hijack the tool while it's in use.
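On the "shuts down too many servers" scenario: one simple guard is to cap how many instances a single run may pick. The following is only a sketch of that idea, not how Chaos Monkey actually does it; `pick_victims` and `max_fraction` are hypothetical names, and the instance IDs are plain strings rather than anything from a real cloud API.

```python
import random


def pick_victims(instances, max_fraction=0.25, rng=None):
    """Randomly choose instances to terminate, with a hard cap.

    max_fraction is a hypothetical safety limit on the share of the
    group that one run may take down. At least one instance is always
    left alive, so a lone instance is never chosen.
    """
    rng = rng or random.Random()
    pool = list(instances)
    # Cap at max_fraction of the group, but allow at least one victim.
    limit = max(1, int(len(pool) * max_fraction))
    # Never take the whole group down.
    limit = min(limit, len(pool) - 1)
    if limit <= 0:
        return []
    count = rng.randint(1, limit)
    return rng.sample(pool, count)


if __name__ == "__main__":
    group = ["i-%02d" % n for n in range(8)]  # made-up instance IDs
    print(pick_victims(group))
```

Even if the random part misbehaves, the two `limit` lines bound the damage: with eight instances and the default fraction, no run can select more than two.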