fail...
Does anyone else agree that systems like these should be designed with high availability in mind? I'm sure they make enough profit to invest in systems like these.
Stansted Airport is getting back to normal this morning after an IT failure caused the closure of internet and automatic check-in systems earlier today. Register reader Paul - our spies are everywhere - said systems at Stansted had been down since his arrival at 4am. He said he was at the back of a very long queue as staff …
Most systems are designed to meet an SLA. If their downtimes total less than that allowed per unit time in the SLA, nobody is going to spend more money. Even if the SLA is exceeded, it might mean some penalties for the month concerned - this will earn the Service Manager a good kicking but nobody will put any money into improving anything.
If this is the only unplanned outage this year they'll still be at roughly three nines. Even if it's not, those last fractions of a percentage point become progressively more expensive, so it might not be worth it. How much extra are you prepared to pay for a ticket to avoid a 0.001% risk of standing in line for a few hours more, esp. considering that this is not the only factor that can lead to excessive queueing in Stansted? I thought so...
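For anyone who wants the arithmetic behind the "nines", here is a minimal sketch. It assumes a 365-day year and that availability is measured as simple uptime over the whole year; the function names and the four-hour outage are made up for illustration, since the article does not give an exact duration.

```python
# Rough downtime budget for N nines of availability over a 365-day year.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(nines: int) -> float:
    """Hours of downtime per year permitted at the given number of nines."""
    unavailability = 10 ** -nines  # e.g. 3 nines -> 0.001 (99.9% available)
    return HOURS_PER_YEAR * unavailability


def nines_achieved(outage_hours: float) -> int:
    """Largest whole number of nines still met after outage_hours of downtime in a year."""
    n = 0
    # Capped at 9 so the loop also terminates for a zero-length outage.
    while n < 9 and allowed_downtime_hours(n + 1) >= outage_hours:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(2, 6):
        minutes = allowed_downtime_hours(n) * 60
        print(f"{n} nines: {minutes:.1f} minutes of downtime per year")
    # Hypothetical four-hour outage, the only one in the year.
    print("nines after a 4-hour outage:", nines_achieved(4.0))
```

On these assumptions five nines leaves only about five minutes of downtime a year, so a single outage lasting a few hours puts a system at around three nines, and each extra nine cuts the allowance by a factor of ten.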
No, I am sure nobody in the entire history of the airline industry ever thought that high availability systems might be a good idea. It's a good job there are people like you around to point us in the right direction.
In fact, if the facts are accurately reported and both internet and airport check-in were down, then it is likely to be a failure in the back-end passenger services system of one specific airline. In general these are airline-specific rather than utilities provided by airports (although there are exceptions). These back-end systems typically have four nines availability or better, although there are variations between the mainframe systems used by traditional airlines and the .NET-based New Skies used by Ryanair and others. Life gets more complicated when check-in is being managed by a ground handler rather than the airline itself, although that isn't likely to be the case here, as a ground handler's system failing would not have affected web check-in.
It is also extremely unlikely that anyone at the airport would have been in a position to "reboot" the system, even if they had wanted to, as check-in systems usually live in vendors' data centres that are often not even in the same country.