A botched Oracle upgrade last Sunday has denied access to an online trading system used to sell space on the UK gas pipeline network. Fax machines have been running hot all week as every transaction has had to be entered manually. The Gemini system at the centre of the affair doesn't control movements of gas, nor is it used to …
Guessing they would rather pee off customers and hack through the faults than do a rollback, or did someone forget to load the tapes?
Lucky we don't deal in gas. Must be utter chaos. 5 days and they haven't recovered yet? That's just madness. I hope they aren't in charge of electricity trading.
business continuity plan?
Wonder if this xoserve mob bothered to think about putting one together? Seemingly not, otherwise it would have identified being able to roll back as a fallback when Oracle's, M$'s or whoever's 'upgrade' poos all over their heretofore stable system.
Thankfully, in this case, it seems the business is forward capacity planning rather than must-be-done-Now transactions.
Dev, Test, UAT and Production???
Whilst not all systems warrant it, most systems that are revenue-generating or client-facing have (or should have) an environment that mimics the production environment so that software patches or upgrades can be tested.
The creation of such environments does, however, tend to increase costs significantly, so you'll probably find that either there was no such system in place because of cost-cutting measures or budget holders, or the business decided that such outages were operationally acceptable when it weighed what an outage might cost against what the test (or failover) systems would cost.
At least they had some BCP in place, even if it relies on ancient technology. (Who said that fax was dead?)
"Unbreakable"...? Can we have an Ellison icon too, please?
This is a production system that they were upgrading in place.
While I agree with your point that there should have been a parallel system in place, Dev, Test, UAT and Production would have been a bit of overkill.
What is scary is that they don't have a redundant system running in parallel, so that when they upgraded one machine they could have failed over to the backup and either fixed the migration or reversed it.
But then again, they're running Oracle and not IBM's IDS (Informix). ;-)
I used to work somewhere that did the pumping of gas and they had a lot to do with the trading as they needed to know how much to pump and to where.
They must be having a hell of a time...
On the downside, whenever things like this happen the price of gas goes up.