Visa fingers 'very rare' data centre switch glitch for payment meltdown

Visa has said a “very rare” partial network switch failure in one of its two data centres led to the fiasco earlier this month that caused millions of transactions in Europe to be declined. The outage, which lasted for about ten hours on Friday, June 1, sent panic among European pub-goers, as apparently about 10 per cent of 51 …

  1. Anonymous Coward

    Failure rate

    The numbers they quote are nonsense, since a number of large chain retailers gave up and simply put signs up saying "cash only". The true failure rate would be much higher because it would need to include the "unattempted but originally intended" transactions. Like my Friday night shopping.

    1. Annihilator

      Re: Failure rate

      By the same token they won’t include the successful transactions that would have been in the “unattempted but originally intended”, so it is probably safe to assume similar failure rates.

  2. Anonymous Coward

    Unreserved Apology

    Does anyone else remember the good old days when the only place one found an unreserved apology was in a resignation letter?

    1. Anonymous Coward

      Re: Unreserved Apology

      "Does anyone else remember the good old days when the only place one found an unreserved apology was in a resignation letter?"

      Not me, no. Can you put a specific date on those "good old days" and show actual real-life examples, rather than vague, nostalgic misremembrances?

  3. Anonymous Coward

    How's that Cashless-society looking now Big-Tech?

    ~~~

    Sales Pitch:

    ~~~

    https://www.rte.ie/news/business/2018/0612/970015-moneyconf-dublin/

    ~~~

    Versus Reality:

    ~~~

    https://www.bbc.co.uk/news/business-43645676

    https://www.bloomberg.com/news/articles/2018-02-18/-no-cash-signs-everywhere-has-sweden-worried-it-s-gone-too-far

    https://www.bloomberg.com/view/articles/2018-06-11/maybe-dollars-should-be-digital

    https://www.irishtimes.com/opinion/letters/why-cashless-society-is-a-dangerous-idea-1.2869212

  4. Doctor Syntax Silver badge

    If it was the backup switch then presumably the primary has already failed unless the backup was firing out packets that interfered with the rest of the network.

    1. diodesign (Written by Reg staff) Silver badge

      "If it was the backup switch then presumably the primary has already failed unless the backup was firing out packets that interfered with the rest of the network."

      It was a backup switch within the primary center that failed to activate due to a component fault in another switch.

      C.

      1. Brewster's Angle Grinder Silver badge

        "It was a backup switch within the primary center that failed to activate due to a component fault in another switch."

        So I guess it was the backup switch's psychic circuit that "failed". It should have ignored the status signals telling it the main switch was okay and deduced it was having a bit of a turn.

      2. earlyjester

        I remember, from days gone by, a certain company's switch that had a memory issue: if it failed, it was left in a state where it could still answer a poll to say it was functioning when it wasn't. This stopped the backup from taking over and the traffic from being serviced.

        The actual details fail me. It would be good to know the manufacturer and age of the switch.

    2. Nick Stallman

      Partial failures like that typically mean the connection can no longer reliably carry traffic, but it still thinks the link is online so it never enacts the fail over procedure.

      So no prior failure is required, just the monitoring being told that something is up when it's actually down.

      These extremely rare failures actually happen all the time. Earlier this year servers I manage were also knocked offline by a partial failure which prevented automatic fail over.
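
A minimal sketch of the point made in the comment above: a failover decision based only on the device's self-reported link state misses a partial failure, whereas one that also looks at end-to-end probe results can catch it. The `SwitchStatus` fields, the `should_fail_over` function and the 95 per cent threshold are all hypothetical, chosen for illustration; nothing here describes how Visa's monitoring actually works.

```python
from dataclasses import dataclass


@dataclass
class SwitchStatus:
    link_up: bool         # what the device reports when polled
    probes_sent: int      # end-to-end test messages pushed through it
    probes_answered: int  # how many of those actually came back


def should_fail_over(status: SwitchStatus, min_success_ratio: float = 0.95) -> bool:
    """Decide on failover from observed traffic, not just the self-reported state."""
    if not status.link_up:
        return True  # total failure: the easy case
    if status.probes_sent == 0:
        return False  # no evidence either way, so stay on the current path
    # Partial failure: the link claims to be up but is dropping real traffic.
    return status.probes_answered / status.probes_sent < min_success_ratio
```

With these assumptions, a switch that says "up" while answering only 40 of 100 probes triggers failover, and one answering 99 of 100 does not.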

      1. John Riddoch

        Yup, partial failures suck. I've seen a fibre path fail just enough to bugger up service but not quite enough for the OS to figure it needed to fail over to the 2nd path. Once we'd figured that out, it was just a matter of disabling the primary path and everything started working normally.

  5. Anonymous Coward

    VISA Crimes

    "Visa is migrating its European processing onto its global system, VisaNet - a process that is due to complete by the end of 2018."

    Ouch, will be traveling then... Better have a viable backup plan. Anyone notice Visa credit card charges going up in the past 2-3 years? Foreign exchange spreads etc... Or the difference between what we have to pay and what the mid-market commercial FX rate is (as found on sites like XE / Bloomberg etc). The spread seems to have doubled / trebled. Anyone know why that is? Withdrawing cash overseas got a lot more expensive!

    1. TheVogon

      Re: VISA Crimes

      "Anyone notice Visa credit card charges going up in the past 2-3 years? Foreign exchange spreads etc..."

      If you mean interest rates as well as FX rates then those are controlled by the issuing institution, not VISA. And yes with historically low interest rates, those have presumably risen to compensate.

    2. katrinab Silver badge

      Re: VISA Crimes

      The actual interchange fee that Visa charge your retailer's bank is the same as Mastercard, and has gone down in recent years. What your bank charges you, that has nothing to do with Visa.

      The foreign exchange spread is about 0.1% higher than Mastercard, and Mastercard charge very close to interbank rate. Your bank might charge another 3% or so on top of that, but that is down to them, not Visa. If you are paying in a foreign currency, use a card that doesn't add a margin on top of the network rate, and yes, Mastercard is better than Visa in this particular case.

    3. Anonymous Coward

      How do you get a menu or breakdown of all the actual charges?

      The thing is, my bank blames Visa. Visa blames intermediaries and so on. It's a vicious cycle. All I know is, I've been keeping a list of charges for about a decade. You have to, if you travel a lot, because with currency fluctuations it's impossible to figure all of this out when you're back home.

      However, in the past 2-3 years especially I see a definite increase, an added 2-3% extra hit. But how do I find out who's got their hand in the cookie jar? It's like SWIFT banking... Do a few of those with reversals and try to actually figure out who got what. There's no documentation!

      1. katrinab Silver badge

        Re: How do you get a menu or breakdown of all the actual charges?

        Here is an example:

        €100 converted into £ on 15th June 2018 using the following exchange rates:

        Bank of England - £87.39

        Mastercard - £87.55

        Visa - £88.22

        Your bank will take either the Mastercard or Visa rate depending on what your card is, and they might add additional charges on top of that. A 3% margin + a non-sterling transaction fee + if relevant a cash withdrawal fee is quite common.

        Visa are taking 83p (Mastercard take 16p). Your bank may well take another £5 or so, but that money does not go to Visa.
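
The arithmetic in the example above can be checked with a short sketch. It uses only the three figures quoted for 15th June 2018; the `network_margin` helper is hypothetical, and treating the Bank of England figure as the reference rate follows the comment rather than any official methodology.

```python
from decimal import Decimal

# Cost in pounds of a 100 euro purchase, as quoted in the comment above.
BANK_OF_ENGLAND = Decimal("87.39")
NETWORK_COST = {"Mastercard": Decimal("87.55"), "Visa": Decimal("88.22")}


def network_margin(cost: Decimal, reference: Decimal = BANK_OF_ENGLAND):
    """Return (margin in pounds, margin as a percentage of the reference cost)."""
    margin = cost - reference
    percent = (margin / reference * 100).quantize(Decimal("0.01"))
    return margin, percent


for name, cost in NETWORK_COST.items():
    margin, percent = network_margin(cost)
    print(f"{name}: £{margin} ({percent}%)")
```

On these figures the Mastercard margin comes to 16p and the Visa margin to 83p, matching the comment; anything the issuing bank adds on top is separate.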

      2. Uberior

        Re: How do you get a menu or breakdown of all the actual charges?

        Surely if you do travel a lot you'd have an fx free Mastercard or an Amex International Currency Card?

        I certainly wouldn't be using a Visa, unless either Mastercard or Amex had limited acceptance.

    4. Stuart Moore

      Re: VISA Crimes

      I recently got a metrobank debit card for a trip abroad, and it made life a lot easier. No fee for transactions abroad, and they're a mastercard debit. I like having one each of visa and mastercard, with different banks, just in case this happens.

    5. monty75

      Re: VISA Crimes

      Try Revolut - mid market exchange rates and no fees for most casual users.

  6. John Robson Silver badge

    Global next

    "The firm has launched a number of reviews and is also in the process of migrating its European systems to a more resilient global processing system, VisaNet."

    Great - now we can stop processing transactions all over the world instead of just over here...

  7. Anonymous Coward

    VisaNet

    The machines rose from the ashes of the economic fire. Their war to exterminate cash has raged for decades, but the final battle would not be fought in the future. It would be fought here, in our present. Tonight.

    1. StuntMisanthrope

      Re: VisaNet

      Sounds exciting, bah dum bum dum! However, got this strange feeling that they've blown it and cash is king after all. #medicischool

    2. Sgt_Oddball

      Re: VisaNet

      It can't be reasoned with, it can't be bargained with... it doesn't feel pity or remorse or fear... and it absolutely will not stop. Ever....

      Unless it has a wonky switch as it turns out....

      (Still waiting for my Phased plasma rifle in the 40 Watt range.... wonder if they'll take cheques?)

  8. David Neil

    Oh they are asking EY to have a look at the root cause

    The same place that is dumping all its own IT off to TATA as fast as it humanly can

  9. cantankerous swineherd

    ISTR a Charlotte Hogg leaving the Bank of England under a cloud? Nice to see she's managed to get another gig if so.

  10. Anonymous Coward

    So if I've read correctly the fix was to turn it off and on again?

    1. phuzz Silver badge

      Yes, but the important part was knowing exactly which component to switch off.

      1. Ken 16 Silver badge

        and having the balls to approve doing it - that probably took hours of buck passing

  11. Keith Oborn

    Another case where regulators should require a detailed public report

    As per TSB and the BA failures. Where is the regulatory requirement that a detailed analysis and report be made available to all relevant bodies (all equipment and component suppliers, all their customers, all end users of the service and relevant regulators)?

    Contrast with a major aviation accident. The entire industry gets told the full details, is required to make recommended changes, and the details are available for scrutiny by any interested party.

    Until the finance and IT/networking industries are held to these standards, we will continue to suffer this sort of failure.

    One positive mark to Visa though, for at least offering a superficial but reasonable explanation with little delay.

    1. Herring`

      Re: Another case where regulators should require a detailed public report

      Wouldn't that be a cool job? A sort of Quincy M.E. but for systems. Diagnosing what's wrong with other people's processes and practices would be much more fun than being trapped in your own.

  12. David Roberts

    Still not understanding

    Why it took so long to disable the failing switch once it was identified.

    Assuming that if the switch had completely crashed the backup would have taken over, then why not just turn the damn thing off?

    Unless assumptions were made about the maximum size of the backlog/queues which could build up during failover, and the system just wasn't sized to recover from a massive backlog due to an undetected partial failure.

    This does sound quite likely, as the report talks about clearing out queues before switching to the backup switch. Perhaps the system couldn't recover if transactions were more than a certain age? Although you would expect that old transactions could be assumed to have failed (as was the case here) and been automatically recorded then purged.
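
The behaviour this comment expects (old transactions assumed failed, recorded, then purged) could be sketched as below. This is purely illustrative: the queue layout, the `purge_stale` name and the age limit are assumptions, and Visa's report gives no such detail. The sketch also assumes entries were queued in timestamp order.

```python
from collections import deque


def purge_stale(queue: deque, now: float, max_age_seconds: float) -> list:
    """Remove and return queued (timestamp, txn_id) entries older than the limit.

    Entries are assumed to be in arrival order, so the scan can stop at the
    first entry that is still young enough to process.
    """
    expired = []
    while queue and now - queue[0][0] > max_age_seconds:
        expired.append(queue.popleft())  # caller would record these as failed
    return expired


backlog = deque([(0.0, "txn-a"), (50.0, "txn-b"), (95.0, "txn-c")])
failed = purge_stale(backlog, now=100.0, max_age_seconds=30.0)
```

Here the two oldest entries are returned for recording as failed, and only the most recent one stays in the backlog for normal processing.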

    1. Joe Harrison

      Re: Still not understanding

      Why not just turn the damn thing off? The guy who knew how it worked and would have turned it off and on again has been made redundant unfortunately. His function has been right-shored to another time zone and the change control procedure for such a drastic action takes many hours to escalate through 25 levels of management in four countries..

      1. Korev Silver badge

        Re: Still not understanding

        I've no idea why you were downvoted (apart from "rightshore"), that sounds depressingly plausible.

    2. SImon Hobson Bronze badge

      Re: Still not understanding

      Why it took so long to disable the failing switch once it was identified

      As already said, the guys that would have been able to diagnose this AND do something about it have all gone. The people running it now will probably be junior techs on a different continent with a) manglement imposed limits on authority and b) culture imposed limits.

      The latter is important. For many of us in northern Europe it's seen as a good trait to be able to sit down, look at the evidence, and formulate a theory as to what is wrong - and formulate a plan for how to fix it. So as already said further up the comments, a good ops team would probably have had it fixed before many people realised there was a problem.

      But AIUI, in many of the places such functions are offshored to, there is a different culture - where individualism is frowned upon, and the techs are supposed to "just follow the flowcharts". In such a culture, to get the offending switch powered off would require the problem passing up many manglement levels, endless meetings, and above all - discussion of who takes the blame.

      A secondary factor is the modern disease of not supporting people to make decisions. So even if a techie did realise that "all it needs is to power cycle this switch" - it's a very secure person who can take on that decision and expect his manglement chain to support him in doing so. More normally, the "safe" option is to do nothing - it's not your fault the system failed. But go and do something that should fix it, but for some reason doesn't - well your head is on the block for doing it.

      Go and read some of the "the day I ..." stories in ElReg - and in particular the comments. Some of the best ones involve the person "doing something" but being supported by their managers on the basis that "the only person who never made a mistake was the one who never did anything".

  13. 36bells

    Cisco Cisco Cisco

    This is the same rogue packet that has been travelling the world taking down Blackberry, Heathrow, and now Visa. Only appears on Cisco switches.

    1. nowster

      Re: Cisco Cisco Cisco

      "GNU Terry Pratchett"

  14. RobertsonCR7

    One in a million

    This sounds like a good scenario for a movie.
