What is this Huawei of which you speak?
(AC because I work for an 'oligopolist')
O2 has fixed its poorly mobile network, so now everyone can start asking what went wrong and what the company is going to do about it. O2's press office isn't responding to queries. However, our understanding from various sources is that the day-long outage was caused by the transition of subscribers' details to Ericsson's …
Second largest network provider after Ericsson? What do you do there, coffee?
World secret services and armies are at alert because of their gigantic size and fsck ups with their only rival, Ericsson.
It's what gets chanted at the start of "The Lion Sleeps Tonight", isn't it?
@ AC - Well played sir. Sand bucket hats all round.
"What is this Huawei of which you speak?"
It's the 50th State, just to the north of Polynesia isn't it?
The problem with outages like this is that while they might cost the operator a little cash and reputational damage, almost all the pain is felt by the end users.
I'd just love Ofcom to insist on DR via roaming on to other UK Networks. I can't see a big issue, just have other operators flip the switch on roaming when a network is down for a pre-defined time or pre-defined scale and oblige the faulty network to carry the roaming costs while they're down.
Seems like the perfect lever to drive up network resiliency to me - pass the cost of poor redundancy and network management onto the networks rather than users!
Which doesn't work when the network that you roam onto cannot validate your phone with your home network 'cos their directory is down.
There are two failures here:
- lack of redundancy in the CUDB system
- lack of a back out plan from a failed update
I fully agree with you - it is long overdue.
However, it is not as easy as it seems. There is a list of forbidden operators in most SIMMs which is used to reduce signalling load from rejections of "pesky competitor customers" roaming onto your network. This is amended further by the more "positive thinking" preferred operator list present in more modern devices as part of operator customization.
Updating these is a major undertaking and may require re-issuing SIMMs for customers who do not have a phone where the operator can manipulate the lists remotely. So it definitely will not be a "flip a switch". More like "painful 12 months of network transition". It is doable though - EE showed it as possible.
It is well worth it for a completely different reason. It removes the last excuse for keeping copper voice as a "universal service obligation". This will do more for "Broadband Britain" than any government money because it will allow "data only" connectivity at a regulatory level. It is also one reason why it is not being done. Crow does not poke a crow in the eye - so do not expect Ofcom to chop the head off one of the last BT cash cows even if it is very good for the country. That is not the way things are done in the UK :(
Just to say, it's 'SIM' (Subscriber Identity Module), not 'SIMM' (Single Inline Memory Module) in this case. The former is what you have in a mobile, the latter is what you had in a PC 20 years ago.
Your idea is not to be sniffed at, but wouldn't have helped in this case as it was a Core element that was affected and would not have been able to authenticate the users to roam.
It'd be nice, but alas in this outage, the trouble wasn't connecting to the network, it was doing anything useful once that connection had been established. O2 would have had to take the entire mast network down in order for phones to roam to any other network (should Ofcom bend their ill-advised competition rules to allow it in such cases, as suggested by the OP) and that would have affected 100% of customers, not just a third... As a footnote, I'm fairly sure the nature of this issue will have meant emergency calls would not have routed via roaming as is usually the case. I hope no-one died.
If only things were that simple, National Roaming at the flick of a switch... Trust me it's a little bit more complicated than that
In the UK, emergency calls on the other operators' networks will not work.
You'd cause a cascade failure.
No operator is going to carry enough spare capacity to handle the totality of another network failing because it would pretty much double the cost of the network and so double the price of service to the customer.
Without that spare capacity, a failover would trigger failure on the network being used for DR - taking that network's customers down too. Both sets of customers fail over to network three, which again can't cope and then you're left with no networks functioning at all in a pretty short space of time.
It's far better to leave a set of users broken than to arse about with ill-thought out DR solutions that leave everyone with degraded or zero service.
When you say "I can't see a big issue" what you mean is "I've not really thought this through".
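The cascade argument above can be put into numbers. This is only a toy sketch with made-up figures: the network names, capacities and loads are hypothetical, displaced users are assumed to spread evenly over the surviving networks, and an overloaded network is assumed to fail outright rather than degrade gracefully.

```python
def simulate_failover(capacity, load, failed_first):
    """Toy model of national roaming used as DR.

    capacity / load: dicts of network name -> arbitrary traffic units
    (all figures hypothetical). Users of a failed network are spread
    evenly across the networks still standing; any network pushed over
    its capacity is assumed to fail completely.
    Returns (order of failure, sorted list of surviving networks).
    """
    alive = {name for name in capacity if name != failed_first}
    load = dict(load)                      # do not mutate the caller's dict
    order = [failed_first]
    displaced = load.pop(failed_first)

    while alive and displaced > 0:
        share = displaced / len(alive)
        for name in alive:
            load[name] += share
        displaced = 0
        # Collect this round's casualties before removing them.
        overloaded = sorted(n for n in alive if load[n] > capacity[n])
        for name in overloaded:
            alive.remove(name)
            displaced += load.pop(name)
            order.append(name)

    return order, sorted(alive)


networks = ["A", "B", "C", "D"]
normal_load = {n: 100 for n in networks}

# 20% headroom everywhere: one failure takes the rest down with it.
tight = simulate_failover({n: 120 for n in networks}, normal_load, "A")

# 40% headroom everywhere: the other three absorb the failed network.
roomy = simulate_failover({n: 140 for n in networks}, normal_load, "A")
```

With four equal networks, each survivor has to absorb a third of the failed network's traffic, so anything under roughly 33% spare capacity tips the lot over in this model. Real networks shed load and throttle rather than falling over wholesale, so treat this as an illustration of the argument, not a prediction.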
That isn't what my sources have said who work for OSS suppliers and major operators involved... Quoting:
"Straight from the O2 horses mouth, it looks like Huawei cocked up an upgrade to the CUDB which is the centralised clustered HLR that Ericsson sold to O2 about 2 years ago. Quote "talked to some guys in Guildford and they told him that Huawei tried to do an update on the CUDB and they didnt configure on the cluster which is the master and which is the slave.""
Now, we know that this can always go wrong, look at RBS' problems recently, but a cluster (certainly all the ones I've worked on) will work out itself which nodes are the active and inactive, it's kind of the point of a cluster. I think they may have separated sites without ejecting the DR site's nodes from a geographically spanned cluster resulting in a split brain scenario.
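For anyone who hasn't met split brain before, here is a minimal sketch of the general idea. It is a generic majority-quorum rule, not a description of how the CUDB or any Ericsson or Huawei product actually works, and the function name and partition sizes are made up for illustration.

```python
def masters_after_partition(partition_sizes, require_quorum):
    """Count how many partitions would promote themselves to master.

    partition_sizes: number of nodes on each side of a network split
    (hypothetical figures). With require_quorum=True a side may only
    take the master role if it holds a strict majority of all nodes.
    """
    cluster_size = sum(partition_sizes)
    masters = 0
    for size in partition_sizes:
        has_majority = size > cluster_size / 2
        if has_majority or not require_quorum:
            masters += 1
    return masters


# Two sites with two nodes each lose sight of one another.
no_quorum_rule = masters_after_partition([2, 2], require_quorum=False)
with_quorum_rule = masters_after_partition([2, 2], require_quorum=True)
uneven_split = masters_after_partition([3, 2], require_quorum=True)
```

Without the majority rule both halves of an even split promote a master and start diverging, which is the split brain case. With it, an even split leaves nobody in charge (safe but down) and an uneven split leaves exactly one master, which is why spanned clusters usually want an odd node count or a tie-breaker.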
My sources said it was some guys in India who failed to backout an update properly...
If the HLR or its equivalent function is seriously stuck up a creek looking for a paddle, the chance of roaming is likely to be hit hard. I doubt that there would be an option to roam to another network without a good HLR function. If you cannot prove a SIM is active and that the handset is good, how can you manage to confirm the billing authority and allow a billing event?
I guess that's a risk the offending network would have to be obliged to take - like when the banks had to take the hit for Cheque Card misuse...
Is the SIM O2? Check
Are O2 on the hook for costs? Check
Off you go lads use some data, make some calls, O2 are picking up the tab!
Maybe it needs a bit of front-end infrastructure though, to automatically bypass the HLR and route numbers to the other network's VLR?
There's a filthy joke in there, somewhere...
Idiots ... shall be taking my business elsewhere.
I'm expecting a torrent of up votes on this one but it does need saying.
These two outages really do show the false economy of tech headcount cuts and it feels as though the Rubicon has been crossed.
Quite how any large business can get more and more reliant on IS and at the same time cut and cut at support and investment astounds me.
But then I'm not a CFO of a large multinational business so what do I know?
Is it a false economy though? You'd have to compare the revenue lost from the service outage with the savings made that led to the failure - if it was down to a cost saving exercise.
Businesses don't cut costs for a laugh, or even for management bonuses, they do it because the market drives prices ever lower and if you can't cut your costs to reach the market rate you lose all your customers. All the time people buy mobile services based purely on price, this will be the result.
I don't expect compensation for consequential loss - however, they are charging me a monthly fee and for one day of the month I did not receive an acceptable service. I expect 1/30th of my monthly fee to be taken off my next statement. Is this not reasonable ?
Yes, if they're also charging you extra when the month has 31 days.
Not only is it reasonable, it's the minimum you can expect.
Agreements for sale of goods and services require that said things are actually provided! This particular parrot, i.e. the 18 hours+ of parrot which didn't arrive, isn't so much an ex- as a non-parrot!
What got me is the article mis-stating "ordinary customers will demand compensation, but O2 has no obligation to provide any."
Indeed no obligation exists by definition, because individual consumers of large companies' products tend not to pre-emptively negotiate damages clauses into their own contracts!
Now, that an obligation could materialise as a result of a successful CLAIM for damages against O2, that's far more likely - albeit something determinable only on the merits of any individual claim. The power of O2's disclaimers of liability in the contract is debatable: it's possible these would be void clauses due to falling foul of the rules in the Unfair Terms in Consumer Contracts Regulations and the Unfair Contract Terms Act, although it depends on the merits of any individual claim.
Well it's all doom and gloom on here, eh? Why is everyone so quick to point the finger at cost savings/off-shoring/out sourcing when the reality is the cause of the outage is not yet public knowledge?
Do people actually understand what happens when the network management is outsourced? O2's network management may be done by Huawei, but that just means the engineer that used to work for O2 now works for Huawei. Same bloke, different employer. He's not become some incompetent, johnny-foreigner overnight.
And while everyone's moaning about "single-point-of-failure" and "not enough redundancy" etc, maybe they should consider the fact that mobile networks are incredibly large and complex beasts. While I'm in no way excusing what's happened, the fact that it's so rare is a testament to the way these networks are built and managed.
"a testament to the way these networks are built and managed."
I'd be tempted to change that to "were built and managed", and reserve judgement on the present and future competence for which evidence would seem to be lacking.
I remember the "single point of failure" complaints made when Vodafone had a similar outage last year.
I will repeat what one person said....
if you are that concerned about single point of failure, why aren't you carrying around an O2 and an Orange phone then?
network down for a day = a day of getting work done.
Dual-SIM mobile phone. They are readily available and can work on entirely separate phone networks if the best possible resilience is required. Having dual SIM cards also has the advantage that if one network is providing poor coverage in the area you are in, the second network might possibly give a usable signal to allow for making/receiving calls.
Good point but the twin-sim phones are mostly all non-smart (except for the Chinese knockoffs and they're just scary).
Or simpler, an unlocked phone...pretty much all can be unlocked for <£10, and payg sim on another network costs nothing.
Been on O2 pre-pay for about 6 years, never really had a major problem, even this last week.
I have always considered mobile communications a privilege, not a right.
It's crazy what a few people do without a mobile for a day, I wonder what they would do if T.V. broadcasting went down for a week? (Not owned a TV for 6 years now)
If my leccy/Internet went down for more than an hour? I have a few months of unread books to keep me satisfied.
After all, you pay for those services to be delivered, so if they don't provide them you should be getting a refund.
You also clearly consider them to have value, as otherwise you wouldn't be dropping ~£100 a month on them.
@Richard 12 - you don't pay for them to be delivered 24x7x365 with zero downtime. Even 99.999% guaranteed service would cost you a lot more than the mobile networks charge.
Do you demand a refund from your standing charges on lecky when you get a power cut? Even for something as critical as that, it needs to be out for 18 hours before you get anything.
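To put the "five nines" point into figures, here is a quick back-of-envelope sketch. The 24-hour outage length is an assumption standing in for "day-long", and a non-leap 365-day year is assumed; the function names are just for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # non-leap year assumed


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR * 60


def yearly_availability_percent(outage_hours):
    """Availability over a year in which the only downtime is one outage."""
    return 100 * (1 - outage_hours / HOURS_PER_YEAR)


five_nines_budget = allowed_downtime_minutes(99.999)   # minutes per year
one_bad_day = yearly_availability_percent(24)           # percent
```

Five nines leaves a budget of about five and a quarter minutes a year, whereas a single day-long outage in an otherwise perfect year works out at roughly 99.73%. That gap is the difference in what you'd have to pay for.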
Do I demand a refund on my leccy in the event of a blackout? Yes, yes I do, especially if I lose a freezer full of food because of the sh*tty infrastructure my leccy provider has in place :/
On a slight tangent: PlayStation only offered what they did because if a mass lawsuit was brought against them it would ruin them. Therefore O2 should technically refund its users for the time the network was unavailable. They refunded me the cost of some unsent SMS messages a few months ago when the SMS portion of the network failed. I didn't even make a fuss because I was unaware it had happened :/ What's different about this situation?
You're only eligible for recompense if the contract you freely and willingly entered into states that such remedies are available. Does it?
@takuhii - that's what insurance is for. Is a tree falling on your power line "sh*tty infrastructure"? Do you really expect fully redundant paths of leccy into your house? Can you claim the same refund if you happen to be running a cryogenics bay in your house?
As for "what's different", all your examples were voluntary gestures of good will from the suppliers. They are obliged to do next to jack, other than a refund of the money you spent (but usually after a fixed amount - broadband is 5 days before contractual refunds kick in for example).
If you want to add in requirements for damages to a contract, you have to add it in up front, largely in order for the supplier to cost the contract accordingly. That's contract law and it exists for a reason.
This outage coming just a few days after the 10 hours almost complete black-out of Orange in France looked a bit suspicious already. But now they provide the same explanation: customer DB failure with no apparent usable online backup of the DB. Weird, really.
My company has just spent millions kitting out most of our 8,000 frontline staff with new iPhones on the O2 network, only to find they couldn't communicate with any of them. Many of the staff who got these new phones, which allow limited personal use, binned their own personal contracts. Cue dozens of techs acting like their left nut had just been cut off.
I shouldn't have gloated to them, but my Orange handset was working fine.
You're right, from my personal experience you should never gloat about being on Orange... ;)
@Andre Carneiro - indeed, although I do love the inevitable schadenfreude that seems to inhabit some people as if they made a reasoned decision to avoid O2 and choose Orange/Vodafone based on some glorious insight. You know, like they researched all the major telcos' internal audit procedures and predicted who would be the most resilient provider.
oh no not again. last month i got job with bank and had a go at schedule admin. it not be going very well so i got new job at db admin. i dont think this computer is for me at all.
If your life is that dependent on being connected while mobile, have a backup mobile contract. Whinge. Whinge. Whinge. Shit happens, be ready for it.