Virgin Blue has fingered Texas Memory Systems as the cause of its 21-hour airline reservation system crash in Australia. The airline's reservation system crashed at 8am on Monday in Australia. The cause was a hardware failure in the computer set-up running the New Skies Navitaire software, which was hosted by Navitaire, an …
Navitaire has a solid reputation ...
in the smaller airline business and its popularity is borne out by the number of carriers using it,
Surprisingly, some of the larger res systems use banks of relatively small computers to deliver what they call service. Mind you, travel agents still have to wait weeks for some back office services. One of the three larger res systems still diddles carrier charges and forgets to credit agencies with their full commissions.
The problem is most likely the "Accenture" bit
Seeing as Accenture was involved, that's going to be the real "root cause" rather than any specific implementation of tech.
Seen them screw too many pooches, then finger point elsewhere. Which they're doing again here.
That is all.
Could someone explain to him what enterprise escalation means?
The most obvious thing to wonder about is who manages the outsourced system and how: do they have staff of sufficient calibre?
But more fundamentally, from our experience: did anyone actually test/simulate hardware failures on the system before deployment, to find out (a) whether the system detected and handled the errors properly, and (b) whether the procedures to recover the system both exist and actually work?
The fact a vendor tells you they are fault-tolerant, can fail-over to a backup/cluster member, blah-blah-blah, counts for SFA - test each step yourself!
Navitaire isolated the failure to the device in question quite quickly, but the decision to repair the device proved "less than fruitful and [it] also contributed to the delay in initiating a cutover to a contingency hardware platform." A failover process that should take around 90 minutes took the best part of a day.
There's your problem right there. Okay, hardware failures happen, we know this; that's why you build a failover / warm standby / whatever environment, so that WHEN the completely unexpected, unplannable-for failure happens it's not a complete disaster.
Given that you've already got the failover environment, when the dreaded happens, USE IT! No point having it if you don't use it. Muppets!
(speaking from experience when our live storage decided to wipe all its config... 2 hours in we decided that a repair would take too long and switched all services to the DR site).
Run by a subsidiary of Accenture
In the commercial world this is likely to lose them the contract.
Good thing governments are *so* forgiving of these little glitches.