Reply to post: 5 Nines

The biggest British Airways IT meltdown WTF: 200 systems in the critical path?

Anonymous Coward

5 Nines

Coming up with a 5 Nines design isn't that difficult.

The problem is that once these things are installed and running, someone usually buggers about with them. My good lady used to be the liaison engineer between a telecoms company and an IT supplier. The telecoms company had loads of HA applications running across a large number of clusters. When she took over the account she got worried about some of their admin practices and persuaded the company they really needed to have an audit done to see what their chances of surviving a failure were. To start with the company wasn't convinced: all the clusters and all the applications had been tested and signed off when they went live*. Anyway, they agreed when she said the IT supplier would pay for the audit. The result? There wasn't a single application left which would switch over automatically in the event of a failure. Every one of them had been buggered up by people making quick online changes without understanding the HA implications of what they were doing. HA is not something you can buy off the shelf. It's mostly a mindset.

(*) Most HA projects I've been involved with haven't been tested properly.

Most projects overrun.

The testing comes at the end.

So when things are late, it's the time for testing that gets squeezed. Management would rather see the project brought in on time than have everything tested properly. So once the application seems to work, they want to go live NOW. The managers then hope they'll get promoted (as a reward for being on time) out of the way before anything breaks.
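
For a sense of scale, here is a rough back-of-envelope sketch of what "5 Nines" allows, and what happens if you string 200 systems together as in the article's headline. The assumptions are mine, not anything BA has published: every system independently hits five nines, and all of them sit in series so any one failure takes the service down. The function names are just illustrative.

```python
# Back-of-envelope availability arithmetic (illustrative assumptions only).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Yearly downtime budget implied by an availability fraction."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(per_system: float, count: int) -> float:
    """Availability of `count` independent systems that must all be up."""
    return per_system ** count


five_nines = 0.99999

# One five-nines system: a budget of roughly five minutes a year.
budget = downtime_minutes_per_year(five_nines)

# Assumed: 200 independent five-nines systems, all in the critical path.
chain = serial_availability(five_nines, 200)
chain_hours = downtime_minutes_per_year(chain) / 60

print(f"single system: {budget:.2f} minutes/year")
print(f"200 in series: {chain:.5f} availability, {chain_hours:.1f} hours/year")
```

Under those assumptions a single system gets about 5.26 minutes of downtime a year, while the chain of 200 drops to roughly 99.8% and about 17.5 hours a year. And that is before anyone makes a quick online change.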
