'Inexperienced' RBS tech operative's blunder led to banking meltdown

A serious error committed by an "inexperienced operative" caused the IT meltdown which crippled the RBS banks last week, a source familiar with the matter has told The Register. Job adverts show that at least some of the team responsible for the blunder were recruited earlier this year in India following IT job cuts at RBS in …

COMMENTS

  1. Thomas 18
    FAIL

    You get what you pay for

    End of line.

    1. hplasm Silver badge
      Devil

      Re: You get what you pay for

      In this case, bigger yachts for the execs, smaller UK payroll for staff.

    2. Anonymous Coward
      Anonymous Coward

      Re: You get what you pay for

      But what are you paying for?

      You're maintaining (or not) the investment you've already made in 1000 man years of intimate systems knowledge!

      (I know how many staff ran RBS Batch Services prior to the outsourcing).

      Bean counters however don't see it that way - they simply see a figure on THIS YEAR'S books.

      ALL companies should learn from this, ESPECIALLY those in Financial Services. The EU is going to push to split up banks, and there's going to be a substantial number of new entrants on the market. They need to learn quickly that running a bank is not the same as running a supermarket. A bank survives by the knowledge in the computer, not by the stock on the shelf.

      1. Richard Wharram
        Meh

        Re: You get what you pay for

        Having worked in IT both for large banks and large supermarkets I must point out that neither is like 'running a supermarket' in the way you imply. For instance a small glitch in your invoice processing system could quickly lead to nothing being on the shelves to sell.

    3. Anonymous Coward
      Anonymous Coward

      Re: You get what you pay for

      Really?

      As a taxpayer, what I mostly seem to get is shafted.

      1. Destroy All Monsters Silver badge
        Thumb Up

        Re: You get what you pay for

        Not to worry, a fresh container of bills-o-the-queen has just left the print shop.

        This should lube you up for a fortnight.

    4. This post has been deleted by a moderator

    5. Anonymous Coward
      Anonymous Coward

      Re: You get what you pay for

      Not quite true: I don't think RBS paid for a comprehensive trashing of their banking platform, which is what they got.

      Serves them right for outsourcing IT though. It's always a disaster. Every time.

      AC because the company I work for outsources to India too, with the same level of incompetence resulting.

    6. Fatman Silver badge
      FAIL

      Re: You get what you pay for

      What else is new? Some damned exec needs a larger bonus.

      What would be `interesting` would be the fallout in the executive ranks for the lousy son of a b---- that decided to outsource in the first place. Serve his head on a platter.

  2. Anonymous Coward
    Anonymous Coward

    Major error?

    Was it sudo rm -rf /

    1. AndyS

      Re: Major error?

      The computer got a virus, which no anti-virus software could yet recognise. Luckily it was very easily removed. They simply had to delete System32.

    2. Anonymous Coward
      Anonymous Coward

      Re: Major error?

      I was under the impression it was an iSeries box, so I can't see your command doing much :)

      1. I Am Spartacus
        Thumb Down

        iSeries?

        You mean that NatWest runs on a glorified AS/400, sorry, make that S/38?

        No, I think it will be a Series Z, which is really a 370 in a pretty frock and new OS. Bet it's still running OS/270 with VM/CMS.

        As for CA-7, Computer Associates says, and I quote, "reliably manage enterprise-wide workload processing". This gives a whole new depth to "reliably".

        1. Rameses Niblick the Third (KKWWMT) Silver badge
          Stop

          Re: iSeries? @Spartacus

          Any system is only as reliable as the people who use and maintain it. This is the technological equivalent of giving your car engine a good going over with a sledgehammer, and then complaining it won't start so it must be unreliable.

          1. Nigel 11
            Flame

            Re: iSeries? @Spartacus

            More accurately the equivalent of having your car serviced by a work experience sociology student who's there only because his benefit will be cut if he isn't, rather than an engineer of twenty years' experience who loves cars (who isn't there because the garage "let him go" to save a few pennies in the short term).

            1. Rameses Niblick the Third (KKWWMT) Silver badge

              Re: iSeries? @Spartacus

              I was working on the basis that the owner knows nothing about cars, hence the sledgehammer, but your analogy works for me too.

        2. Anonymous Coward
          Anonymous Coward

          Re: iSeries?

          When RBS took over NatWest, they moved everything over to the RBS system. Everything worked. They didn't lose track of money.

          That does suggest that the underlying system is solid. But that is no protection against humans doing the wrong things.

        3. Destroy All Monsters Silver badge
          Windows

          Re: iSeries?

          There is nothing wrong with an AS/400 or VM/CMS.

          Damn youngsters.

          1. I Am Spartacus
            Mushroom

            Re: iSeries?

            I agree. I especially like IBM mainframes, though personally I cut my teeth on MTS and then MVS/TSO.

        4. Aaron Em

          Computer Associates

          Well, hell, there's your problem right there. CA: where lousy software goes to be murdered, resurrected as a zombie, and unleashed upon an unsuspecting world.

        5. Ilgaz

          Re: iSeries?

          I bet they are written on the assumption that the person knows what she/he is doing and is qualified for the job.

          Just like rm -rf won't ask you newbie questions like "are you sure?", you have root and it is assumed you know what you are doing.

    3. Simon 15
      Linux

      Re: Major error?

      Not quite, it seems to have been: sudo nohup rm -rf / > /dev/null &

    4. Roger Kynaston
      Happy

      Re: Major error?

      Nah. meant to type crontab -e but did crontab -r.

    5. RegGuy1
      Facepalm

      Re: Major error?

      Oh, that sounds an interesting command. Let me just try that.

      I'll be back in a minu... oh shit!!

  3. Jon Double Nice

    Have they tried turning it off and on again?

    Failing that, adding more RAM and an SSD can help to speed up a sluggish system.

    1. AndyS

      Re: Have they tried turning it off and on again?

      I suggest they visit http://www.downloadmoreram.com/ and get some decent RAM downloaded.

      Should get things up and running again in no time.

      1. A.A.Hamilton

        Re: Have they tried turning it off and on again?

        Well, I went to the downloadmoreram web-site - and find that I have to download 4GB of RAM via my browser. You would have thought that they could at least have made a torrent available, hmmm? Or at least they could have made RAM into rar, probably halving the download traffic.

        That's quite important with VirginMedia: the zeroes are often too round for the fibre-optic cable; you could rotate them 90 degrees around their vertical axis of symmetry, but then they would look too much like ones. I bet the ones would come down the wire quicker if they were also rotated 90 degrees, to look like hyphens.

        RBS should use advanced critical thinking skills like this. Oh, wait....

    2. Anonymous Coward
      Anonymous Coward

      Re: Have they tried turning it off and on again?

      You used to be an EDS manager didn't you?

  4. Anonymous Coward
    Thumb Up

    Good work, El Reg

    Keep digging!

    1. proto-robbie
      Thumb Up

      Re: Good work, El Reg

      Indeed! The Reg is now the technical authority in this affair.

    2. Wibble
      Pint

      Re: Good work, El Reg

      You're showing up the rest of the media as a bunch of useless press-release advertorial writers. Proof that the average journo writes bollocks as doing tech was too hard for them.

      Well done.

      Have a virtual beer token on me.

      1. tfewster Silver badge

        Re: Good work, El Reg

        +1; but with one correction

        "Proof that the average journo writes bollocks as doing [select * from useful_careers] was too hard for them"

  5. Anonymous Coward
    Anonymous Coward

    no backup of the schedule?

    So did they have no backup of the CA-7 schedule? It certainly sounds like that is the case.

    1. Nick 6

      Re: no backup of the schedule?

      More likely they did take some routine backup which included the database but had never exercised a full recovery back to service of the application given this failure mode, followed by successful completion of the batch schedule.

      1. fixit_f
        Go

        Re: no backup of the schedule?

        I work in a large investment bank and am very au fait with scheduling of overnight processes, although we use control-m.

        We did have an instance where a global upgrade to control-m failed and brought the whole tool down. The trouble is, best practice is now that all batches are scheduled by many disparate systems, and the batches need to be "granular" i.e. each part of the batch runs to completion and is sanity checked by automated checkers before the next stage can run. Because of the insistence that this is done by schedulers, many teams struggle to run their overnight processes without knowing the scripts that are run, or the order they are run in. In the event that the whole system is down, this information may not be readily available. There are some right numpties on particular teams in any large institution, and they're often trying to unravel things designed a decade ago.

        I bet something like this happened here - the scheduler fell over, overnight batches were interrupted (potentially mid process where that process isn't rerunnable, requiring a database restore) and the teams with the knowledge to run these overnight processes manually may not necessarily have been on call or available.
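        The "granular" batch idea above, where each stage runs to completion and is sanity checked before the next may start, can be sketched roughly as follows. This is a minimal illustration in Python, not how control-m or CA-7 actually work; the names `Stage` and `run_batch` and the `done` list are hypothetical. Recording which stages have finished is what lets an interrupted batch be resumed without rerunning stages that are not rerunnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stage:
    """One step of an overnight batch (hypothetical structure)."""
    name: str
    run: Callable[[Dict], None]     # does the work, mutating shared state
    check: Callable[[Dict], bool]   # automated sanity check on the result


def run_batch(stages: List[Stage], state: Dict, done: List[str]) -> List[str]:
    """Run stages in order, skipping any already recorded in `done`.

    A stage is only recorded as done after its sanity check passes,
    so a later rerun resumes at the first unfinished stage.
    """
    for stage in stages:
        if stage.name in done:
            continue  # completed in an earlier run, do not repeat it
        stage.run(state)
        if not stage.check(state):
            raise RuntimeError(f"sanity check failed after stage {stage.name!r}")
        done.append(stage.name)
    return done
```

        If the scheduler falls over mid-batch, the `done` list (which in practice would need to live somewhere durable) is the information the on-call team would need in order to pick up by hand.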

        1. Jonathan Larmour
          Stop

          Re: no backup of the schedule?

          "and the teams with the knowledge to run these overnight processes manually may not necessarily have been on call or available."

          That's because they'd been made redundant months before. So yes, very unavailable.

      2. J.G.Harston Silver badge

        Re: no backup of the schedule?

        I remember having a Bank of Scotland (i.e. not RBS) Visa card in the mid-1980s. I remember something in the fine print about pre-programmed unavailability at a weird time, like 3am on the 3rd of the month or something. Some of the promotional blurb (they were good at researching back then) explained that they deliberately took the Visa processing base in a bunker in Kincardine offline at that time to do a regular recovery test.

    2. Anonymous Coward
      Anonymous Coward

      Re: no backup of the schedule?

      Not knowing how CA7 works, I'm not sure, but even if you could restore the batch schedule from tape, you'd have to be careful to ensure you don't lose the current schedule state, i.e. which jobs are still in the queue pending to run, which have run successfully, which may have failed, etc, etc, etc. It's not just a case of reloading from tape, I'd have thought.

      Of course, there should be some method of recovering batch jobs (primarily due to manual error, I'd imagine), whether it has ever been designed to cope with such a massive loss/corruption of the schedule is another matter.
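      The point above, that restoring the schedule from tape is not enough because the current run state must survive too, can be illustrated with a small sketch. This is purely an assumption about how one might combine the two sources and says nothing about how CA-7 does it; `restore_schedule`, `jobs_to_run` and the status strings are hypothetical names.

```python
from typing import Dict, List


def restore_schedule(backup_jobs: List[str], journal: Dict[str, str]) -> Dict[str, str]:
    """Rebuild per-job status after restoring job definitions from backup.

    Every restored job starts as PENDING; statuses recorded in a separate
    journal since the backup (e.g. DONE, FAILED) then override that default.
    Journal entries for jobs missing from the backup are ignored.
    """
    status = {job: "PENDING" for job in backup_jobs}
    for job, state in journal.items():
        if job in status:
            status[job] = state
    return status


def jobs_to_run(status: Dict[str, str]) -> List[str]:
    """Jobs that still need attention: never run, or run and failed."""
    return [job for job, state in status.items() if state in ("PENDING", "FAILED")]
```

      Without something playing the role of the journal, a plain restore would either rerun jobs that had already completed or skip ones that had not, which is the danger described above.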

      1. Lee Dowling Silver badge

        Re: no backup of the schedule?

        I should think that such a piece of software was built solely to solve such problems and, if not, then you should be leaving obvious markers as you churn through jobs and also have proper transaction rollback so you can "undo" a broken / incomplete job before continuing with its replacement (and even just rollback to before any of the jobs were run or any of the schedules deleted!).

        We shouldn't apply common server management functions to such large jobs, I think, but we should hold them accountable to a HIGHER level of control over such things.

        How did someone inexperienced get on the team?

        How did they get access to the schedule controls?

        How did they manage to delete EVERYTHING on there?

        Why did the software allow such deletion without confirmation?

        Why is there not a rollback or even versioning function for the schedules?

        Why, precisely, does one mess-up by one employee in front of one computer put your ENTIRE BANKING SYSTEM out of action, nationwide?
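        The "obvious markers" plus "proper transaction rollback" idea from the start of this comment can be sketched with Python's standard `sqlite3` module. This is an illustrative assumption only, not a description of the bank's systems; the `accounts` and `job_log` tables and the `make_db` and `run_job` helpers are hypothetical. The job's work and its completion marker are committed together, so a failed job leaves neither behind.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one account and an empty job log."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("CREATE TABLE job_log (name TEXT)")
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn


def run_job(conn: sqlite3.Connection, name: str, work) -> bool:
    """Run `work(conn)` and record a completion marker in one transaction.

    Using the connection as a context manager commits if the block
    succeeds and rolls back if it raises, undoing any partial updates.
    """
    try:
        with conn:
            work(conn)
            conn.execute("INSERT INTO job_log (name) VALUES (?)", (name,))
        return True
    except Exception:
        return False
```

        A replacement job can then be run safely after a failure, because the data is back in the state it was in before the broken job started and the log shows exactly which jobs completed.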

        1. Anonymous Coward
          Anonymous Coward

          Re: no backup of the schedule?

          "Why did the software allow such deletion without confirmation?"

          You seem to be assuming it was without confirmation?

          Maybe there was a language issue and the operative simply didn't understand the confirmation?

          1. Anonymous Coward
            Anonymous Coward

            Re: no backup of the schedule?

            This is a mainframe - the language was probably so convoluted even a native English speaker would struggle to figure out what he was being asked.

            1. QuiteEvilGraham
              WTF?

              Re: no backup of the schedule?

              From the CA-7 sysprog manual.

              Because CA-7 is controlling a production environment, backup and recovery of its database becomes extremely important. Backups of the CA-7 database should be scheduled on a regular basis, at least once each day. If possible, CA-7 should be down or at least reasonably inactive during the backup, with no permanent updates being made to the database. All data sets in the database must be backed up at the same time. Additionally, the backup procedure should be as fast as possible especially if scheduling is to stop. Two other concerns for backups are to produce a single source for recovery and, where practical, to provide error checking of index and pointer elements.

              With the above items in mind, you may find that no single utility satisfies all your concerns. On the one hand, the SASSBK00 program provided with CA-7 creates a single source file for recovery and performs error checking of index and pointer elements; however, it is slow for a large database. (It is slow because it creates a logical as well as a physical backup for conversion purposes and therefore produces many more records than a utility such as IDCAMS or CA-ASM2.) On the other hand, utilities such as CA-ASM2, IDCAMS, and DFDSS are fast and can produce a single source for recovery, but they have no error checking of elements.

              Seems pretty simple to me.

          2. Lee Dowling Silver badge

            Re: no backup of the schedule?

            Then I refer you back to the first question - how did someone who couldn't understand the confirmation get into a position where they were presented with it in the first place? And/or, what were they doing confirming it if they didn't understand it rather than CHECKING with someone else? And/or, what is your entire banking system doing hinging on the wording of a confirmation?

            1. Steve Evans

              Re: no backup of the schedule?

              The same way call centres work... Crib sheet.

              Press this, click that, just say "Yes" to anything it asks you to confirm.

              Now off you go.

            2. Anonymous Coward
              Anonymous Coward

              Re: no backup of the schedule?

              "Then I refer you back to the first question - how did someone who couldn't understand the confirmation get into a position where they were presented with it in the first place?"

              You're making a lot of assumptions there, matey. You appear to be getting all steamed up over a hypothesis.

              "And/or, what were they doing confirming it if they didn't understand it rather than CHECKING with someone else?"

              People. Fuck. Up.

              There doesn't have to be a conspiracy. There doesn't have to be a failure in process. Sometimes people just screw up, and sometimes there really is just a lone gunman. And I very much doubt that you will get the answer in a public forum.

              "And/or, what is your entire banking system doing hinging on the wording of a confirmation?"

              What; you've never accidentally deleted an entire file system by rushing and hitting a wrong key?!

              Maybe there were four phones ringing with people screaming for an update and an ETA on the fix. Maybe three managers were over his shoulder 'helping'. That's pretty distracting when you're trying to fix a major system outage in a FTSE100 company, in my experience.

          3. Uncle Siggy
            Flame

            Re: no backup of the schedule?

            You Brits really do like to pick nits.

            "You seem to be assuming it was without confirmation?"

            Would you have been able to refrain from picking a nit if he had used the word "accountability"?

            "Why, precisely, does one mess-up by one employee in front of one computer put your ENTIRE BANKING SYSTEM out of action, nationwide?"

            The guy or gal attempting to pick nits with their contractors right now are probably feeling like they've been told to piss in the corner of a round room. The managers I have worked with never ever believe or see what in-house professionals tell them will happen, and never ever believe or understand when it has happened until it is too late.

        2. TeeCee Gold badge
          Facepalm

          Re: no backup of the schedule?

          1) Even experienced people drop a bollock sometimes. A colleague of mine had over 15 years experience of the systems concerned when he heroically deleted the entire environment for ${country}. Yup, a whole machine's application set and data down the crapper in one misplaced rm -rf *. Longest recorded ohnosecond in history.

          2) When you're in rolling back from upgrade mode, you tend to be playing with O/S commands and infrastructure utilities, not user software. There are many things around at that level in most environments which do stuff quite capable of ruining your day without asking. It's sort of assumed that the type of person allowed to play with them is allowed to use the sharp scissors. However, as we see in (1), even the best of us drop a clanger occasionally.

          1. Wensleydale Cheese
            Go

            Re: no backup of the schedule?

            @TeeCee

            "Even experienced people drop a bollock sometimes. A colleague of mine had over 15 years experience of the systems concerned when he heroically deleted the entire environment for ${country}."

            Of course experienced people make cockups. Backups and well tested recovery procedures aren't there just to recover from hardware failures, but human error too.

            I've never kept a tally, but my real life restores due to human error far outnumber those due to a hardware failure, probably by a factor of 100.

          2. Silverburn
            Thumb Up

            Re: no backup of the schedule?

            "heroically deleted"

            +1. My phrase for the week!

        3. Anonymous Coward
          Anonymous Coward

          Re: no backup of the schedule?

          To answer Lee Dowling's questions:

          Q1. How did someone inexperienced get on the team?

          A1. RBS got rid of all the experienced staff and outsourced their jobs

          Q2. How did they get access to the schedule controls?

          A2. RBS got rid of all the experienced staff and outsourced their jobs

          Q3. How did they manage to delete EVERYTHING on there?

          A3. RBS got rid of all the experienced staff and outsourced their jobs

          Q4. Why did the software allow such deletion without confirmation?

          A4. Computers do what you tell them, which more often than not is not what you want them to do; that's why you get experienced staff to do it. You can try this yourself: open a command prompt on your PC and enter "format c:" and answer "Y" to the question.

          Q5. Why is there not a rollback or even versioning function for the schedules?

          A5. There is but RBS got rid of all the experienced staff who knew how to do it and outsourced their jobs

          Q6. Why, precisely, does one mess-up by one employee in front of one computer put your ENTIRE BANKING SYSTEM out of action, nationwide?

          A6. RBS got rid of all the experienced staff and outsourced their jobs

          Simples

          1. Anonymous Coward
            Anonymous Coward

            Re: no backup of the schedule?

            "You can try this yourself, open a command prompt on you PC and enter "format c:" and answer "Y" to the question"

            Oh, really now? Quick Robin, to the CMD prompt!

            C:\Windows\system32>format c:

            The type of the file system is NTFS.

            WARNING, ALL DATA ON NON-REMOVABLE DISK

            DRIVE C: WILL BE LOST!

            Proceed with Format (Y/N)? y

            Formatting 286181M

            System Partition is not allowed to be formatted.

            Nope, turns out you were wrong.

            1. Lee Dowling Silver badge

              Re: no backup of the schedule?

              And in typical Microsoft programming, it does make you wonder why it doesn't check if it's a system partition FIRST before it even poses the question.

            2. Anonymous Coward
              Anonymous Coward

              Re: no backup of the schedule?

              From AC 26th June 14:57 to AC 26th June 18:49

              "Oh, really now? Quick Robin, to the CMD prompt!"

              Hook, line and sinker....

              Yes, that's right, Windows doesn't allow you to format the partition you booted from; in order to format the C: drive you need to boot from a different system partition. But then we all knew that, didn't we.

              You've made an assumption: you assumed that what I described won't work, and you also assumed that I hadn't left out any lines like "boot from an alternative drive".

              This is how system cock-ups happen, don't assume things, check you are in possession of all the facts before you act, you made an assumption, FAIL, yours, you don't happen to work on CA-7, do you?

          2. jackSparrow

            Re: no backup of the schedule?

            In that case, I will put the blame on UK workers, for failing to do proper handover procedures.

        4. Anonymous Coward
          Anonymous Coward

          Re: no backup of the schedule?

          I don't think the scheduler itself was at fault. Most likely some poor sod omitted vital step(s) from the EOD runs, or the scheduling table got corrupted during the upgrade.

          Have you ever worked in a complex environment where multiple systems feed data to each other? No I thought not! Even if you have recovered from the changes to the scheduler, you would have to restore the missing input feeds, massage header files etc.

          This has to be a concerted effort for all production systems up/down stream from the host the EOD batch runs from.

          Even under controlled DR environments with known input files it is usually 2 days effort with months of planning.

          To make the matters worse, if you realise you've missed something 2 nights ago, you have to do this concerted effort for every EOD run.

          So please, before criticising the people who are recovering the system / transactions, engage your brain.

          Recovering a stand alone box is so easy, I would not even shed a single bead of sweat for it.

        5. despairing citizen
          FAIL

          Re: no backup of the schedule?

          "Why, precisely, does one mess-up by one employee in front of one computer put your ENTIRE BANKING SYSTEM out of action, nationwide?"

          it's all about management controls.

          Whilst sometimes it is not possible to build in management controls to prevent a single operator nuking a system (though many times it is, and they just can't be arsed to build them in), in this situation you use a checklist: do not think, do not use initiative, follow the checklist to the letter.

          There is a reason pilots use them.

          They work just as well for operations and application support.

          and if they can't follow a check list, you shouldn't have hired them for that job.

          1. Anonymous Dutch Coward
            Mushroom

            Re: no backup of the schedule?

            Unless there is no checklist, or the checklist is out of date...

          2. fixit_f
            Thumb Down

            Re: no backup of the schedule?

            It's not about "Management Controls". In fact this kind of woolly headed high level thinking is the reason why many investment banks are filled with "service delivery" departments while lacking the sort of teams that can actually deal on a technical level with this sort of crisis. Where I work we have a "service delivery" department and their sole function appears to be to be obstructive whenever we have an overnight problem. The end result is people who know what they're doing hacking the sh!t out of stuff to make it work, while the "management" tier do their best to make it look like they had any contribution to this process - and then they take credit. Vendor or in house, at 3 in the morning you need people available (IT and BAs and Devs) who understand the systems and can just make it process the OVN.

            This RBS situation is the perfect example - technical people within RBS probably knew exactly what needed to be done, but a load of managers probably demanded "process" and spent at least 12 hours relying on scheduler ops teams. For this reason several thousand "payday" customers (people who need their payday money) were skint for several days. All because of crap managers who talk more about "service delivery" and "process". I hate these woolly c***s. What exactly do they do?

            As a senior member of a team who runs a front office system that shifts literally billions per day with base and precious metals I expect freedom where something needs fixing, and I've done nothing to demonstrate that I'm not capable of running that capacity. If you run with outdated systems, give the poor sods who keep it running absolute authority. I'm not Jerome Kerviel, nor are any of my hard-pressed colleagues. The bank I work for pay me to run these systems, and I will to the utmost of my ability - moreover if processing fails my bosses have my mobile number and at weekends I'll be the first to pitch in, on call or not.

            RBS didn't seem to have this escalation process from what I've seen.

            1. P. Lee

              Re: no backup of the schedule?

              It's service delivery's job to get in the way until everyone is satisfied that all the information is processed. They are basically a fallback because we all know the documentation is not up to date. It also turns out that "getting things working" is not worth risking other revenue/services for.

              The problem is worse with consolidated systems. I suspect more money is lost long term on management of consolidated systems than is gained by saving a few million on additional hardware.

        6. Fatman Silver badge
          WTF?

          Re: How did someone inexperienced get on the team?

          Simples: They met the (low) salary requirements.

          </sarcasm>

    3. AOD
      WTF?

      Re: no backup of the schedule?

      A more fundamental question to be asked here (speaking as someone who has provided Production support for several banks and seen most of the enterprise schedulers in action, excluding mainframes) is why on earth was the original CA-7 upgrade being performed during the working week?

      Changes of this nature should be performed at the weekend so there's some breathing space if things do go pear-shaped.

      Of course the fact that they went pear-shaped in the first place is likely down to inadequate testing/preparation for the upgrade itself combined with less than "expert" staff being used to actually perform the upgrade.

  6. Crisp Silver badge
    Flame

    RBS: "No evidence" this is connected to outsourcing

    Well of course you won't find evidence if you don't go out looking for it!

    This is worse than incompetence, this is 'bury your head in the sand' wilful ignorance. They made experienced staff redundant, and when they gave those responsibilities to someone with little experience of the system, they cocked it up.

    1. Anonymous Coward
      Anonymous Coward

      Re: RBS: "No evidence" this is connected to outsourcing

      Indeed. But if he won't accept that it is the fault of RBS' inept offshoring in pursuit of a few shekels, then presumably he still believes that exporting UK jobs is a good way of rewarding the country that bailed out him and his fellow charlatans and incompetents. Maybe the BBC could ask him this directly, although so far their coverage of this has been rather lightweight.

      Anyway, let's take it forward that offshoring is a grand thing, in which case the obvious thing is to give Hester and his fat cat cronies an Indian salary, since they're so keen to run an Indian bank.

      1. Anonymous Coward
        Anonymous Coward

        Re: RBS: "No evidence" this is connected to outsourcing

        The statement is logically correct: there is no evidence this is connected to outsourcing.

        However as with all management bullshit statements it is not a full and true picture of events. Had Uncle Fester Hester said "there is no evidence this is connected to outsourcing to a company that employs the cheapest inexperienced people available in a country where staff routinely change jobs for small pay rises at 1 day's notice", then I might have had a problem with what he said, not that he cares what I think anyway.

        To quote Rockhound (Steve Buscemi) in the film Armageddon: "You know we're sitting on four million pounds of fuel, one nuclear weapon and a thing that has 270,000 moving parts built by the lowest bidder. Makes you feel good, doesn't it? "

    2. Pete 2

      Re: RBS: "No evidence" this is connected to outsourcing

      Once is unlucky, twice is a coincidence. It's only after the third major crunch that someone would start asking questions. However, nobody in the bank would ever dare say anything in public - there's too much hysteria about racism to open that particular can of worms.

      Though the important question would be: Can Natwest's customers survive another 2 outages?

      1. Velv Silver badge

        Re: RBS: "No evidence" this is connected to outsourcing

        "too much hysteria about racism"

        This could/would have happened just as easily if it had been outsourced within the UK (EDS/HP, CSC, Atos, Logica, etc.). It is about the loss of experienced staff.

        TUPE would move the experienced staff initially, but history shows they have little loyalty to either employer and find themselves new jobs relatively quickly. The supplier is then left recruiting inexperienced "cheaper" labour to fulfil the contract. THAT IS HOW OUTSOURCING MAKES A PROFIT FOR THE SUPPLIER. Outsourcing NEVER improves service.

        Rockhound in Armageddon: "You realize we’re sitting on 45,000 pounds of fuel, one nuclear warhead and a thing that has 270,000 moving parts built by the lowest bidder? Makes you feel good doesn’t it?"

        I wonder if the senior executives and accountants at RBS Group do their banking through Barclays?

        1. Pete 2

          Re: RBS: "No evidence" this is connected to outsourcing

          > This could/would have happened just as easily if it had been outsourced within the UK .... It is about the loss of experienced staff.

          Yes, that's what I mean. And if it had happened with UK outsourced staff the topic of overall skill level would be openly discussed in the press. However, I get the impression that the silence (article in The Daily Mash notwithstanding) on the question is because people are too scared to broach the subject for fear of being labelled - even if they don't have a position on it, one way or the other.

        2. KnucklesTheDog

          Re: RBS: "No evidence" this is connected to outsourcing

          "I wonder if the senior executives and accountants at RBS Group do their banking through Barclays?"

          I worked for Natwest in the 90s for a brief period and they would only pay your salary to a Natwest account (you may have realised that this effectively means they're not paying you at all until you remove the money as it never leaves the bank!). No matter what the rules now, I expect a lot of their own staff were screwed over by this too. Probably not at board level, I guess they'd have access to their ample offshore accounts...

          (The HR woman at the time said there was a time when not only would they only pay into Natwest accounts but they wouldn't allow you to have any other account aside from a building society savings account - your line manager actually had access to your statements to check you weren't moving out money to a rival bank!)

    3. Anonymous Coward
      Anonymous Coward

      Re: RBS: "No evidence" this is connected to outsourcing

      Outsourcing is not the root cause of the problem. However, because outsourcing builds on the same flawed model already in place, it is often the last straw.

      The underlying problem is that most enterprise IT works only because computers have been replaced by humans working like an insect colony, essentially an old-school organic computer.

      In that sense you are correct, as the processes and knowledge exist only in aggregate at the level of the colony and only by continual crawling manually over the infrastructure and code base by the workers does the system keep working and the knowledge stay alive. Insert workers from another colony or hive off (pun intended) some of the work and the inevitable will follow.

      IT management know nothing about IT, which is why manual processes have been substituted for automated ones in previous decades. This process of reversion to Victorian practices is further compounded by the breaking of any meaningful academic foundation of enterprise IT in the 21st century.

      1. Titus Technophobe
        Stop

        Re: RBS: "No evidence" this is connected to outsourcing

        This looks like it must be true, just seen a comment on the Guardian web site by an RBS bod "The management and execution of the batch process is based in Edinburgh at Fettes Row, as is all the current work to resolve the problem.".

        All the work is now going on in 'Fettes row' in Scotland. That wouldn't be India at all then?

    4. Fatman Silver badge
      Flame

      Re: RBS: "No evidence" this is connected to outsourcing

      I have a one word answer for that:

      "BULLSHIT"!!!!!!!

      Hey Reg, can we get a bullshit icon, please!??

      Flames, 'cause that's what they did to their bank (crash & burn).

  7. Nick 6
    Mushroom

    Investment in the backbone?

    Backbone? Guess they don't mean "having the guts to admit to making mistakes and having not properly understood the risks involved with outsourcing".

    This situation really does sound like someone pretty high up in the executive chain responsible for operations needs to be fired. And I mean "fired" rather than helped into a taxi holding a massive payoff cheque.

    1. Anonymous Coward
      Anonymous Coward

      Re: Investment in the backbone?

      In the unlikely event that anybody is fired, there's every chance that they will be (in relative terms) quite a junior scapegoat, and the real villains will continue to scrape large pay packets and larger bonuses.

      To my mind the bank's head of risk needs to go, the head of IT needs to go, the head of the India operation needs to go (and quite a few people below him), and possibly somebody from operations. But what's the chances that they try and pin it on, say, the head of UK retail banking, who would probably have little or no say on how the back office was designed, or the decision to offshore it?

      1. Anonymous Coward
        Anonymous Coward

        Re: Investment in the backbone?

        "In the unlikely event that anybody is fired, there's every chance that they will be (in relative terms) quite a junior scapegoat"

        The guy who hit 'Return', maybe?

      2. Anonymous Coward
        Anonymous Coward

        Re: Investment in the backbone?

        @AC12:28 The head of risk needs to go? I've worked in large banks all my career in settlements, risk, and trading. The head of risk would have had fuck all say in when the scheduler got upgraded and how it was done. In a normal bank with all its experienced staff grand plans would have been made with layers of sign-off. But even then the head of risk would have little say - business sign-off is often a case of "Will it break? Nah. Are you certain? Sure. Ok then". When the key men get booted out and the outsourcers arrive things tend to get handled a little differently.

        This was about lack of experience and cost cutting. That something so integral to the running of the business could be fucked up so monumentally is a can that will need to be carried by the head of IT (minimum) and potentially someone above him (Head of operations? Hester?). You cannot bring a bank to its knees for a week and just walk around with an executive equivalent of "shit happens".

    2. Nigel 11
      Mushroom

      Re: Investment in the backbone?

      Fired is inadequate.

      When an engineer wilfully neglects to design to the accepted standards of his profession and people are killed by the collapse of the resulting structure, he's likely to find himself facing manslaughter charges.

      The manager responsible for this almighty F***-up ought to be personally liable for the losses. All of them. Bankruptcy is the least that should happen to him. Jail would be better.

      Many Roman bridges and even buildings are still standing after two millennia of use including one of total neglect. This probably has something to do with the Roman approach to quality control. The architect was required to stand under the arches as the scaffolding was removed.

      1. Fatman Silver badge

        Re: manager responsible for this almighty F***-up ought to be personally liable for the losses.

        You know better, manglement will always escape punishment.

        They will always find a new job.

        To send the responsible damager on his way to that new position with a new employer, I suggest use of one of these:

        http://en.wikipedia.org/wiki/Trebuchet

        Put the mangler in where the rock would have normally been located.

        PULL!!!!!!!!!!

    3. Bilby

      Re: Investment in the backbone?

      So the long-term solution to a problem caused by replacing experienced people with inexperienced people, is to replace the current, experienced executives - who are now as full of information on the perils of employing inexperienced staff as the dog that peed on the third rail - with new, shiny, fresh out of their MBAs executives who will repeat the same errors made by their forebears?

      Surely a better solution is to fire no-one, but to re-hire (at great expense) as many of the experienced staff they originally sacked as they can find, while slashing the remuneration of the responsible parties, but keeping them in-house now that they have learned this rather expensive lesson?

      1. Juillen 1

        Re: Investment in the backbone?

        Good idea.. Though perhaps re-examine their remuneration, as their evidenced value to the organisation isn't what it was thought to be..

        Also follow the chain: find out why the outsourcing decisions were made in the first place, who applied the pressure, and why they applied it to something so obviously risky (we hear the cries every time there is outsourcing that something like this will happen; well, here we are). And definitely evaluate their value to the company and their remuneration.

  8. Anonymous Coward
    Anonymous Coward

    Sounds like they need...

    more "Are you sure" dialog boxes.

    1. sugerbear

      Re: Sounds like they need...

      Or Sanjeet couldn't find the UNDO button...

      1. JetSetJim Silver badge
        Joke

        Re: Sounds like they need...Clippy

        "It looks like you are trying to f*** up the entire bank, would you like help with that?"

        1. Anonymous Coward
          Anonymous Coward

          Re: Sounds like they need...Clippy

          Fred Goodwin replied, "No thanks, I can do that by myself"

        2. Silverburn
          Coffee/keyboard

          Re: Sounds like they need...Clippy

          See icon! <--

    2. Anonymous Coward
      Anonymous Coward

      Re: Sounds like they need...

      "more "Are you sure" dialog boxes."

      That's been proven not to work.

      It makes it worse: People just gun through them even faster.

  9. jaycee331

    !=Outsourcing best practice

    No better example of failing to consider the golden rule "do not outsource a function that is critical to your core business". Like a bank's mainframe perhaps...

  10. Just_this_guy
    Facepalm

    Goodness gracious me.

    I wouldn't like to be in the shoes of the 'inexperienced employee' right now...

    (Still, at least he's got that £11k salary to console him...)

    1. Anonymous Coward
      Anonymous Coward

      Re: Goodness gracious me.

      It was thanks to his salary that consoled him.

  11. Anonymous Coward
    Anonymous Coward

    Hmm...

    Beware of qualified answers: The staff in India are RBS employees, not outsourced, so technically this isn't an outsourcing issue, it's an inexperienced cheap staff issue.

    1. Anonymous Coward
      Anonymous Coward

      Re: Hmm...

      I think many people are conflating two issues because they are so often interlinked, and that's offshoring and outsourcing. I can't speak from any experience of RBS back office, but although we know that RBS do do captive offshoring (ie own employees), they also make extensive use of outsourced deals, both on and offshore:

      http://articles.economictimes.indiatimes.com/2009-10-07/news/27646604_1_rbs-chief-executive-royal-bank-technology

      Where exactly the blame for this current fuck up truly lies I don't know, but the finger seems to point very strongly offshore. As soon as management start believing spotty faced expensive management consultants, they fall in love with the idea that every job (apart from their own) is a globally portable commodity, and we end up with these types of disaster.

      And in large part, the "low wage" is a big part of the problem. If the offshore staff are any good they'll move on to better paid jobs (ie no experienced staff in your offshore centre), or they demand regular and generous pay rises (ie wave goodbye to the savings that McKinsey/BCG/Accenture/KPMG or whoever promised you). If they're really good and motivated, then they'll build their skills and up sticks to another country that pays better, like the US or Western Europe (a perennial problem in Eastern Europe is retaining BPO or IT skills when people can just move across EU borders and quadruple their wages).

      There is a simple answer to all of this. Stop offshoring jobs in pursuit of modest and fast eroding savings. Keep them onshore, automate where practicable and sensible, and refine processes and systems to keep costs under strong control. When it comes to outsourcing, that can make sense if you honestly believe that your supplier has some magic sauce that enables them to do the job cheaper and better than you can - in which case you need to ask why they are better than you are at employing people and telling them how to do what is often a transactional or near manual job.

      1. Anonymous Coward
        Anonymous Coward

        Re: Hmm...

        It's not just about "offshore". I worked in an IT department that had a team in India, employed by the company and it worked great.

        What consultancies have always done, right back to the early 90s (and maybe before), is to nickel-and-dime customers, even with staff in the UK. They'll start with good people during the implementation process, then once everything's settled in and running smoothly, and it's a bugger to change things back, you start to see experienced guys move off and you find you've got someone new at the other end of the line who doesn't have a clue. But, you're still paying the same for some college kids as for an experienced guy.

        They're also horrible places to work for people who care about doing a good job. You'll rarely meet good managers or execs, or work with brilliant people. I've worked for 2 consultancies on contract, and both had people in charge who were clueless.

      2. Nigel 11
        Mushroom

        Re: Hmm...

        Spot on. It's nothing to do with offshore staff per se. It's to do with replacing long-term staff with proven experience, by cheaper staff with no experience. Staff who quite possibly lied on their CV to get a job, or paid someone else to sit their exam.

        It could be worse. I wonder if they're offshoring the control rooms for nuke power stations yet?

  12. tath
    Facepalm

    Breaking news...

    I'm sure this latest development is shortly to be "exclusively revealed" by the Guardian's investigative team, like yesterday's revelations were, several hours after I'd read them on here.

    1. Anonymous Coward
      Anonymous Coward

      Re: Breaking news...

      While the Guardian seems to have passed this off as its own research, it looks like the Daily Mail not only picked up and ran with the offshoring aspect, but also credited El Reg as the source. As a long time grauniadista I'm disappointed in them, but pleased at least one part of the national press got it right.

      Offshoring can be a false economy and in this case the taxpayer pays the price :(

      [anon as I work for a company that is tangentially involved]

      1. Anonymous Coward
        Anonymous Coward

        Re: Breaking news...

        I feel dirty for reading something quoted by the Fail as a source...

    2. QuiteEvilGraham
      Stop

      Re: Breaking news...

      I can assure you that the Guardian did do its own research. And anything which gets this to a wider audience is OK by me.

      1. Goldmember

        Re: Breaking news...

        "I can assure you that the Guardian did do its own research"

        Yes, they researched El Reg.

        1. Destroy All Monsters Silver badge
          Trollface

          Re: Breaking news...

          "I can assure you that the Guardian did do its own research"

          But did they reveal the admin password for the RBS mainframe in a random chapter heading because it's ok to disclose it after the fact?

          1. QuiteEvilGraham
            Happy

            Re: Breaking news...

            Hah! - nice one.

  13. Brent Longborough
    Unhappy

    That's it, blame the help

    So this software doesn't even ask "Are you sure you want to erase the whole effing show?"?

    Where, oh where, is the lost art of the user interface (even if it's a command line)?

    1. John Burton

      Re: That's it, blame the help

      No amount of asking for confirmation helps if you think that the operation *is* actually what you need to do...

      1. Peter2 Silver badge

        Re: That's it, blame the help

        I prefer the good old CLI, and especially the ones without help entries.

        I don't think I have ever seen someone who didn't know what they were doing messing one of those systems up, simply because it's virtually impossible to use without having been properly trained in the first place.

  14. Anonymous Coward 101
    WTF?

    Astonishing

    - Are junior employees regularly being put in the position of being able to wreak such havoc in the first place?

    - Were no fail safes being used?

    - Was there no supervision?

    1. Rufus McDufus

      Re: Astonishing

      That a single inexperienced person could delete a queue-worth of batch jobs with such importance to the company is just frightening. Just listening to the technology being used here makes me feel like I've gone back to the 80's.

    2. Anonymous Coward
      Anonymous Coward

      Re: Astonishing

      Why are you assuming it was a junior employee? It might have been the head of IT RBS India. I think the word you are looking for is inexperienced.

    3. Anonymous Coward
      Anonymous Coward

      Re: Astonishing

      "Was there no supervision?"

      If they were following process then there would have been ample supervision. RBS is very ITIL focused, and the instant the decision was taken to back out the upgrade there would have been an incident team assembled. That involves Incident Managers authorising every action being taken, which usually follows much debate on a conference call about all possible courses of action.

      It is possible the inexperienced employee made a mistake attempting the task, but you can't directly prevent that type of error. Or it could be that the Incident Manager was similarly inexperienced.

      1. Anonymous Coward
        Anonymous Coward

        Re: Astonishing @AC 13:14

        >RBS is very ITIL focused

        As are all companies right up until they get the certification then it all goes out the window. Oh, and ITIL and all that best practices voodoo is a pile of crap anyway.

        1. slideruler

          Re: Astonishing @AC 13:14

          Not quite. ITIL done out of the book, is a pile of crap. A recipe for red tape disaster. But applied selectively, in a common sense way, it can work well.

          Certainly if they'd done Change and Release properly - as the book suggests, (i.e. tested the change in a pre-production environment, tested and documented the backout, specified appropriate post-implementation success/fail criteria) they'd probably have been in a better state than they ultimately got into.

          I'd *love* to read their major incident report though... :-)

        2. FreeTard
          FAIL

          Re: Astonishing @AC 13:14

          If they had followed very basic project management processes, they would have had an immediate fallback. This is very basic stuff. Normally I wouldn't care, but all my banking is with Natwest, as is the wife's, which means she didn't get paid, and neither did I. Let's see if they take my mortgage out tomorrow, as it's with them as well, as is my credit card.

      2. Anonymous Coward
        Anonymous Coward

        Re: Astonishing @AC 13:14

        Been there with the joke they call RBS Incident Management. More like hours of procrastinating until someone senior enough can be found to say yes to the solution offered up at the very start by the techy.

        Been there, suffered it, dozed through most of it as it was the usual way to survive.

        Still it must be remembered they did off-shore roughly 60-70% of all technical staff, but had to delay redundancies on a number of occasions as they could not get enough 'bums on seats' in India for a long time, and even then only after preventing managers from rejecting the crud that was being offered.

      3. Anonymous Coward
        Anonymous Coward

        Re: Astonishing

        ITIL and Incident Managers just smacks of middle-management bullshit. Sack of crap ITIL didn't exactly help them here did it? Seriously, that is one giant bag of bollocks.

      4. Bilby

        Re: Astonishing

        >RBS is very ITIL focused

        Ahh, you have hit the nail on the head. From Wikipedia:

        "ITIL describes procedures, tasks and checklists that are not organization-specific, used by an organization for establishing a minimum level of competency."

        Managers who think that achieving a "minimum level of competency" is sufficient should not be allowed to play with systems that require an above-average level of competency; and managers who think that their specific complex organization is best operated by using "procedures, tasks and checklists that are not organization-specific" should not be allowed to run a lemonade stand.

        The fundamental problem is that there are people in positions of power who think that 'management' is a basic skill in and of itself, which can be applied successfully to running anything, from lemonade stands to multinational banking houses.

        The "Lack of experience" problem started at the top, and the feeling in boardrooms around the world today is "We don't need detailed knowledge of our corporation's systems, so why should we pay through the nose for staff who do? Let's write down the instructions on a checklist and give them to someone in a poverty-stricken hell-hole like Hyderabad, Mumbai or Edinburgh, who will step through them for $17,000 pa and no benefits."

        What could possibly go wrong?

    4. Anonymous Coward
      Anonymous Coward

      Re: Astonishing

      "- Are junior employees regularly being put in the position of being able to wreak such havoc in the first place?

      - Were no fail safes being used?

      - Was there no supervision?"

      Do you not work in IT, then?

      Sysadmins with only a few years of experience are routinely at the helm of command lines which can trash FTSE100 critical systems. Companies simply do not hire elite teams of 60kpa+ admins, and then hire more 60k+ admins to watch over their shoulder as they type every command, every day.

    5. Nigel 11
      Mushroom

      Re: Astonishing

      I suspect it's a multi-level FUBAR. Someone made a small not very serious error. Someone else got the patch-up for that wrong, and made the hole bigger until a chunk of masonry fell into it. And then someone carried on digging even though he *really* should have stopped, and brought the entire building down. "When you're in a hole, stop digging" is good advice, but these guys seem not to have known a hole when they saw one.

      The person who really needs to be shot isn't any of the sods on the ground. It's the person who decided it was OK to get rid of all the experienced staff in the first place. Preferably also everyone upwards from him to the CEO, since it was mission-critical, and to encourage the others. Before we get to find out how much worse it might have been, by experiencing it.

    6. slideruler
      FAIL

      Re: Astonishing

      You have to understand something about mainframes, and the history of RBS/Natwest et al.

      The golden rule of mainframes, from the day they first appeared, is that they are *not* idiot proof. They're unfriendly, ruthless, highly reliable, highly efficient data processing engines. Issue a shutdown or force command with master console authority, and it will do it, no questions asked. It's assumed that an idiot wouldn't be given authority to do something stupid. The same assumption tends to apply to people who have high levels of authority within its various subsystems, like Netview, CA7, RACF etc. You can't blame the technology.

      Historically (ten years or so ago) RBS were the most conservative of banks. They were risk averse within IT, and had a high body count of experienced people in both Edinburgh and London who generally knew their jobs well, and the merged clearing systems worked reliably and efficiently.

      Fred the Shred then trashed the bank with his Dutch gamble, and Hester arrived. He said to his general "cut costs at all costs". They started swinging the axe on all UK based techies. Nearly all techie level jobs were to go overseas, or to UK based Indian staff on ICT visas. I had a colleague working there. He was told to hand over to Indian staff - and they started interviewing for his 'replacement'. Many who turned up for the phone interview had difficulty in stringing two words together. Others were plainly clueless; pure CV creativity worthy of the Booker Prize. He eventually found three (yes, three Indians to replace one UK based staff member) and spent weeks explaining his job and processes to them. He documented, powerpointed, and PDF'd to excess - so that any reasonable techie with the skills as advertised could have picked the job up. He then left. Within three days, he got an email at home. They'd managed to stuff things up - and would he help.... The three replacements soon became two, as one left to go elsewhere for more money...

      I'm rather enjoying seeing the fruits of Hester's slash and burn decision. Please, please, please let's see him in front of the House of Commons select committee, being grilled on what he's done - and why it went wrong. If they want evidence, I'm sure there are plenty of ex RBS guys who can attest to what's happened.

      I really don't trust any of the banks today - with my personal financial data, or my money. I've got another colleague - now made redundant from Barclays, due to 'cost effective global sourcing'. The problem is, I see few other UK owned and operated banks left.

      Mattress stuffing really does seem to be the only option left.

  15. Anonymous Coward
    Happy

    Where's Julian Assange when you need him?

    1. Androgynous Cupboard Silver badge

      Writing a press release of course

      About Julian Assange. Of course.

    2. Anonymous Coward
      Anonymous Coward

      Evading justice, like the scum-bucket he is?

    3. Anonymous Coward
      Anonymous Coward

      "Where's Julian Assange when you need him?"

      Ecuador.

  16. Scott 57
    WTF?

    Please excuse my ignorance....

    But the queue wasn't backed up before a change?

    And why the hell was an inexperienced op working on this?!

    Cheerio RBS - been nice banking with you....
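
    For illustration only, here is a minimal Python sketch of "back the queue up before the change". It says nothing about how the RBS scheduler actually works; the job records, the JSON snapshot file and the function names are all hypothetical.

```python
import json
from pathlib import Path


def snapshot_queue(jobs, path):
    """Write the pending jobs to disk before any change is attempted."""
    Path(path).write_text(json.dumps(jobs, indent=2))


def restore_queue(path):
    """Read the jobs back from the snapshot taken before the change."""
    return json.loads(Path(path).read_text())


def apply_change(jobs, change, snapshot_path):
    """Snapshot first, then run the change; fall back to the snapshot on failure."""
    snapshot_queue(jobs, snapshot_path)
    try:
        return change(jobs)
    except Exception:
        # The change may have damaged `jobs` in place, so trust the snapshot.
        return restore_queue(snapshot_path)
```

    The point of the sketch is only the ordering: the snapshot is written before the change touches anything, so a failed change can be backed out without relying on whatever state the change left behind.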

    1. Nigel 11
      Unhappy

      Cheerio RBS

      My first thought.

      My second thought was whether things are any better elsewhere?

      My third thought is whether after a few months, they won't have learned something from the experience and done at least some of what's necessary to make sure that the lightning strikes somewhere else next time.

  17. Mr_Bungle
    Mushroom

    Work Blunders

    I'm sure most of us have made errors on rare occasions.

    I had once missed a night's sleep and clicked 'shut down' rather than 'log off' when remoting into a server. A file server, with people working on open documents.

    I replied to an email rather than forwarded it. The e-mail was slagging off a customer who was a giant bellend and this and some other thoughts were sent back to him. Got away with this somehow.

    And so on.

    Imagine the bowel emptying horror when this chap realised what he had done.

    1. Anonymous Coward
      Anonymous Coward

      Re: Work Blunders

      "Imagine the bowel emptying horror when this chap realised what he had done."

      You owe me a keyboard!

      I did a remote shut down by accident when I was on holiday in France at 0300 on a Sunday, and the boss had misplaced the spare key to the server room. They had to break the door down to hit the power button.

      Yes, 0.5 seconds after pressing that button the bottom fell out of my world!

      Good job my boss is a grand fellow and has a large sense of humour.

      1. Rufus McDufus

        Re: Work Blunders

        I remember remote-logging into a telco's main DB server 20 years ago via modem (when I worked at CA, funnily enough). I wasn't getting any response on the console so tried a few of the typical key presses to try and get a response. Lo and behold a bunch of reboot-style messages appear in front of me. They'd given me remote access straight to the console. I never did know if my attempts to get some terminal activity caused it or not.

      2. Anonymous Coward
        Stop

        Did It Ever Occur To You

        ..that the real idiot here was your boss ?? His Key was a kind of Single-Point Of Failure.

        A faulty action is not an issue at all. An Issue is that a single faulty action can screw up a complete operation.

        Many places are run like this and the leadership (or lack thereof) is to blame for it.

    2. Anonymous Coward
      Anonymous Coward

      Re: Work Blunders

      I was putting together a script for an evening outage while on a conference call to the customer. Copying the commands to fail a cluster to the B-node, I accidentally pasted them into a putty session logged in as root rather than my text editor.

      Following the rule "it's always better for the customer to find out from you first", I had to sheepishly admit not only that I'd taken their site down for 5 minutes but also that they didn't have my full attention on the call. Fortunately I'd dug them out of the brown stuff often enough that my balance at the Bank of Goodwill easily covered it.

    3. NogginTheNog
      Coat

      Re: Work Blunders

      Ooops! I think there may be a position open for you with RBS from next week... :-D

    4. KjetilS

      Re: Work Blunders

      ... and I once hit the breakers marked "UPS" thinking it was the breakers *to* the UPS. In reality, it was the ones going *from* the UPS. I realised my mistake when the whole room suddenly went dark and silent.

      That resulted in a two hour downtime for the whole company, and my boss laughing his ass off.

      His comment? "At least you won't do that again soon"

      The breakers have better markings now.

      1. Tom 38 Silver badge

        Re: Work Blunders

        One of my colleagues (now my manager \o/) wanted to kill a recently backgrounded job on the only production server hosting our website.

        He meant to type:

        kill -9 %1

        He typed

        kill -9 1

        Thus killing init, putting the box into a dead state, and the website offline until we could get a techie into the DC to press the reset button.

        After this, all servers get DRAC consoles, he got his root access taken away, and we got backup servers.
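
        A one-character slip like that is the kind of thing a thin wrapper can catch. The Python sketch below is hypothetical (the names `parse_target` and `safe_kill` are made up for illustration, and it is not a replacement for the shell's `kill`): it refuses PID 1, refuses negative or zero PIDs, which `kill` treats as process groups, and defaults to a dry run.

```python
import os
import signal


def parse_target(arg):
    """Turn a command-line argument into a PID, refusing risky values."""
    if arg.startswith("%"):
        # %1 is a shell job spec; only the shell itself can resolve it.
        raise ValueError("job specs like %1 are shell syntax; pass a numeric PID")
    pid = int(arg)  # raises ValueError for non-numeric input
    if pid <= 1:
        # 1 is init; 0 and negative values address whole process groups.
        raise ValueError(f"refusing to signal protected PID {pid}")
    return pid


def safe_kill(arg, sig=signal.SIGTERM, dry_run=True):
    """Validate the target, then signal it only if dry_run is switched off."""
    pid = parse_target(arg)
    if not dry_run:
        os.kill(pid, sig)
    return pid
```

        Defaulting to SIGTERM and a dry run is a design choice in this sketch: the operator has to opt in to the destructive form rather than opt out of it.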

        1. Anonymous Coward
          Anonymous Coward

          Re: Work Blunders

          As somebody who was called in to several "managed incidents" over the years, one of the first things to do was to fire off an email to the operators mailbox asking who was the shift manager on duty.

          Unfortunately the spell checker in outlook does not pick up on the missing "f" in "shift".

          Never ceased to amaze me the number of replies I would get.

        2. Anonymous Coward
          Anonymous Coward

          Re: Work Blunders

          "kill -9 %1

          He typed

          kill -9 1"

          Ah... the glorious kill -9.

          I've sat there and played 'rock paper scissors' with other admins to see who *didn't* get to be the one to send a questionable kill -9 on HA servers before.

        3. tfewster Silver badge
          Facepalm

          Re: Work Blunders

          > he got his root access taken away...

          But he'd just gained valuable experience! That gut feeling that makes you pause before hitting return because something isn't quite right..

          OTOH, someone who makes the same mistake twice deserves no mercy

    5. keithpeter
      Windows

      Re: Work Blunders

      "Imagine the bowel emptying horror when this chap realised what he had done."

      I suspect this must have been a slowly dawning realisation given the description of an 'incident team' above and the procedures that should be in use. Perhaps the full scope was not apparent for hours/days. Lovely. Gives me chest pains just thinking about it.

    6. Dr. Mouse Silver badge

      Re: Work Blunders

      "Imagine the bowel emptying horror when this chap realised what he had done."

      I can't think of a better way to describe it. Well done :)

      Yeah, I've committed serious errors in the past, although none on even the scale of yours Mr_Bungle. My most recent was when I was clearing up some log files, and then tried to restart syslog without checking the command line was clear. "rm /etc/init.d/syslog restart", followed by confusion over the error of "no such file: restart", followed by a panicked search for a backup.

      Anyone can make a cockup when under pressure or not concentrating properly on a menial task. I hate to think of the panic the guy who did this went through. Bowel emptying indeed!

      1. Wensleydale Cheese
        Happy

        Re: Work Blunders

        @Dr. Mouse

        "Anyone can make a cockup when under pressure or not concentrating properly on a menial task. I hate to think of the panic the guy who did this went through. Bowel emptying indeed"

        I was once called out by a panicking operator when disk space was running dangerously low, with only 45 minutes to go before the night shift started.

        By the time I arrived, word had got out and anxious line managers were arriving in droves.

        "Can we have a meeting about this?"

        My response was to barricade myself in the computer centre and ask the operators to keep all the managers out.

        Fortunately they had the power to do that and I could concentrate on the problem in peace and quiet.

    7. Anonymous Coward
      Anonymous Coward

      Re: Work Blunders

      You chaps so funny!!!

      I be having a work blunder of my own just last Tuesday! Get phone call from big boss in Scotlandland saying to please be re-running program that makes money for bank. Silly thing go crash and say bad things so I switch PC off and on again and try to run program again. Same thing!!! So I try again and silly thing just say more bad things. So I switch PC off and go home.

      Anyone for to be needing expert in Microsoft Word and CA7?

      1. MrZoolook
        Thumb Up

        Re: Work Blunders

        Comments like this are why I sometimes want to create extra accounts, so I can upvote again!

    8. Anonymous Coward
      Anonymous Coward

      Re: Work Blunders

      Oh yes indeed, like the time I reached down to reboot the four servers I was building (after configuring the RAID enclosure), pressed the power buttons with two fingers of each hand and realised that my servers were in the cabinet next door to the one with the console in and that the moment I released the pressure I was going to perform a decidedly inelegant shutdown on two SQL clusters running the backend of a customer facing webapp....

      It was a lonely few moments followed by a fatalistic shrug, a quick powerup and the rest of the day keeping a very close eye on the incident queues in case I might have to put my hand up and admit my error.

      Or the day (at another bank) that I got a snotty email asking why I had blasted a VM running the print spooler for mortgage applications, resulting in no letters having been printed for 11 days (the answer being that it was a java component, called on demand, that was undocumented on that machine - indeed documented as on a different machine - that had been moved there in response to an incident months before and that no-one had bothered to remediate).

      Or indeed any of the other stupid but easily committed errors that happen all the time - even to experienced and competent people.

      This stuff happens all the time, just not usually so badly to such critical and highly visible systems.

      1. Rufus McDufus

        Re: Work Blunders

        "Re: Work Blunders

        Oh yes indeed, like the time I reached down to reboot the four servers I was building (after configuring the RAID enclosure), pressed the power buttons with two fingers of each hand and realised that my servers were in the cabinet next door to the one with the console in and that the moment I released the pressure I was going to perform a decidedly inelegant shutdown on two SQL clusters running the backend of a customer facing webapp...."

        Ha ha! I did that on a mail server. Pressed the button on the wrong box. I was looking around for a broom or pole to stick in the power button to replace my finger. Then my boss walked in and laughed at me!

        1. Anonymous Coward
          Anonymous Coward

          Re: Work Blunders

          Thankfully not me, but I was warned about it when I started as an admin by the person who did. We have a high availability system that underlies pretty much everything. 2 parallel systems running in different datacentres, with work load balanced between the two. To do any updates or changes, you route traffic from A to B and stop A to update it. This guy routed all traffic from A to B and stopped B.

          I have accidentally made a mistake with the mqsistop command though, what sort of idiot names a message broker and a config manager the same with just one letter different :)
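
          The A/B switchover mistake above is the kind of thing a pre-flight check can refuse outright. A minimal sketch in Python, assuming the load balancer can report each side's traffic share; `Node`, `traffic_share` and `stop_for_update` are hypothetical names, not anything from the system described.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    traffic_share: float  # fraction of live traffic routed here, 0.0 to 1.0
    running: bool = True


def stop_for_update(node):
    """Stop a node only if no live traffic is routed to it."""
    if node.traffic_share > 0:
        raise RuntimeError(
            f"refusing to stop {node.name}: it still carries "
            f"{node.traffic_share:.0%} of traffic"
        )
    node.running = False
```

          After routing everything from A to B, stopping A goes through, while stopping B raises an error instead of taking the whole service down.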

  18. Bob Terwilliger
    FAIL

    Smell the management bull..

    Typical management/politician type answer that tells you more by its omissions than what it contains.

    "Well I have no evidence of that." <translation> I have been told by underlings but haven't seen the documentation myself

    "The IT centre - our main centre, we’re standing outside here in Edinburgh, [is] nothing to do with overseas. " <translation>The mainframe is here, I'm not saying where the sys admins are

    "Our UK backbone has seen substantial investment." <translation> We have upgraded hardware and software, I'm not saying where the sys admins are.

    1. Anonymous Coward
      Anonymous Coward

      Re: Smell the management bull..

      it looked like he was standing outside Fettes Row - Group Tech HQ just north of Edinburgh city centre - whilst there is a small dev-only data centre in there; the main data centres are to the east and south of Edinburgh and are pretty big. i.e. the big rooms are about the size of a football field. The LPARs that run all the batch jobs are actually quite unimpressive when you stand next to them.

      When I was last in them (some years ago) - they were really nicely done (compared to some horror machine rooms I've seen) and very well organised.

      I think it's fair to say that people who worked for RBS pre-problems would say it is an amazing place to work and highly professional.

      1. Anonymous Coward
        Anonymous Coward

        Re: Smell the management bull..

        WAS a great place..

        There, fixed it for you.

        Quality has degraded significantly over the last ten years.

  19. Evan Essence
    Stop

    Problematic updates are normal?

    "... the relatively routine task of backing out of an upgrade to the CA-7 tool. It is normal to find that a software update has caused a problem; IT staff expect to back out in such cases."

    Really?

    1. Anonymous Coward
      Anonymous Coward

      Re: Problematic updates are normal?

      '"... the relatively routine task of backing out of an upgrade to the CA-7 tool. It is normal to find that a software update has caused a problem; IT staff expect to back out in such cases."

      'Really?'

      Yes, really - in the dysfunctional screwed-up world of British banking, where immensely over-rewarded people completely devoid of banking qualifications continually seek to double their money by putting together mergers, takeovers, and ambitious new ventures. Without for a moment considering the impact on the resilience and maintainability of the computers down in the engine room without which the whole bank would instantly fall apart.

      1. Anonymous Coward
        Anonymous Coward

        Re: Problematic updates are normal?

        Question: Who is the odd one out from the following list?

        Lord Stevenson, former chairman, HBOS

        Andy Hornby, former chief executive, HBOS

        Sir Fred Goodwin, former chief executive, RBS

        Sir Tom McKillop, former chairman, RBS

        John McFall MP, chairman of Treasury select committee

        Alistair Darling, Chancellor of the Exchequer

        Sir Terry Wogan, presenter of Radio 2 breakfast show

        Answer: Sir Terry Wogan. He is the only one with a banking qualification.

        Acknowledgement: Private Eye

        1. Anonymous Coward
          Anonymous Coward

          @Tom

          I am so stealing that!

          1. Robert Carnegie Silver badge

            Correction?

            http://en.wikipedia.org/wiki/George_Osborne is a Recommended Update. Not as Chancellor, but in the Catalogue of Non-Competence.

            "Osborne's first job was entering the names of people who had died in London into a National Health Service computer. He also briefly worked for Selfridges, re-folding towels. He originally intended to pursue a career in journalism, but instead got a job at Conservative Central Office."

            Does anyone remember some badly folded towels in Selfridges in the early 1990s? The famous Bathgate scandal?

          2. Anonymous Coward
            Anonymous Coward

            Re: @Tom

            "I am so stealing that!"

            As was I... If you are going to steal, why not steal from the best (Private Eye)?

        2. JimmyPage Silver badge
          Stop

          Re: Problematic updates are normal?

          I was channel hopping a few weeks ago, and hit UK Gold and an episode of Yes Minister, where a bank CEO wanted a few more storeys on his HQ building and had to go to the Department of Administrative Affairs to discuss it.

          As he meets Sir Humphrey, the "joke" is that the CEO of the bank hasn't the first clue about banking, and that Sir Humphrey is lined up for a nice non-executive directorship with the bank when he retires.

          Can you guess when that episode was written:

          a) 2011

          b) 2001

          c) 1991

          d) 1981

          (Answer: d)

        3. Anonymous Coward
          Anonymous Coward

          Re: Problematic updates are normal?

          It is very stealable (and is already on my FB page) - but also slightly out of date; John McFall became Baron McFall of Alcluith in 2010, and is therefore the former Chairman of the TSC (2001 to 2010). I have no idea whether Andrew Tyrie, the current Chairman, has a banking qualification.

        4. Anonymous Coward
          Anonymous Coward

          Re: Problematic updates are normal?

          God, I'd forgotten Terry used to work in a bank as a youngster, as in actually "do bank work" lol

          I think banks were respectable back then though.

    2. Dave 126 Silver badge

      Re: Problematic updates are normal?

      >"... the relatively routine task of backing out of an upgrade to the CA-7 tool. It is normal to find that a software update has caused a problem; IT staff expect to back out in such cases."

      Seems reasonable. To expect an upgrade to one system that is interlinked with other strange old systems to go absolutely perfectly every time is naive; to have a mechanism to undo it or cancel it safely seems sensible. However, in this case it seems that this procedure was either not idiot proof enough or the operator was having a bad day, or a bit of both.

    3. Peter Gathercole Silver badge
      Meh

      Re: Problematic updates are normal?

      Don't know about RBS, but I've worked in other places in Banking, Government agencies and the Utility Sector.

      Most large organizations will not authorize a change unless there is a fully specified back-out plan, together with evidence that the change to the live system has been tested somewhere safe first.

      In some places I've been, the risk managers have wanted a "how to recover the service should the back-out plan fail" plan.

      The RBS example is evidence of exactly why you have this level of paranoia, and why you spend more time writing up the change than the change itself takes, and why you sit in Change Boards convincing everybody that the change is safe.

      Unfortunately, I'm sure that many of us here have complained about how much the process costs, how much time is wasted, and how quickly you could work if you didn't have this level of change control. I learned my lesson the hard way many years ago, and now follow whatever the processes are without complaining.

      Maybe the higher management will learn some lessons from this as well. But I somehow doubt it.
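
      The layering described above (change, back-out plan, and a plan for when the back-out fails) can be written down as a short sketch. This is an illustration in Python under stated assumptions, not any bank's actual procedure: `run_change` and its four arguments are hypothetical, each one a zero-argument callable supplied by whoever wrote the change plan.

```python
def run_change(apply, verify, back_out, recover):
    """Apply a change; fall back to back-out, then to recovery.

    Returns a short status string describing how far down the plan we got.
    """
    try:
        apply()
        if verify():
            return "applied"
    except Exception:
        pass  # a failed apply is handled the same way as a failed check

    try:
        back_out()
        return "backed out"
    except Exception:
        # The back-out itself failed: this is the extra plan the risk
        # managers ask for.
        recover()
        return "recovered"
```

      The point of the shape is that the third step exists before the change starts, so a failed back-out has somewhere to go other than improvisation.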

      1. Anonymous Coward
        Anonymous Coward

        @Peter Gathercole

        I take it you're a change manager justifying your job? Here's what I got on the end of my fortnightly change management e-mail.

        Quote of the Month

        “No Change is without risk. Changes are managed to minimise the potential negative or unpredicted Impact and Risks of Changes on existing Services and to benefit both ???? and the Customer – ensuring the alignment of ???? IS to Business requirements and a standard approach is used to maintain the required balance between the need for Change and the Impact of Change” - Extract from the “exciting” new ???? Change Management Procedure V2 – published next month.

        The word an@l and change manager seem to fit nicely.

        1. Anonymous Coward
          Anonymous Coward

          Re: @Peter Gathercole

          And I'm guessing you're the type of techie who thinks he knows everything right up until the point he deletes the entire batch schedule when trying to back out a change.

          Process != being anal, unless you're doing it wrong.

        2. Anonymous Coward
          Anonymous Coward

          Re: @Peter Gathercole

          "The word an@l and change manager seem to fit nicely."

          So you've taken unjustified offence at some standard management bollocks appended to an email about change management. That doesn't alter the fact that well run projects often feature damn good project managers and change managers. I'm good at what I do, but that's not every detail of managing complex organisational or systems change, and I'm not arrogant enough to presume that on big complex projects I know it all (even as project manager on some of these). Luckily I have other people around me who reduce the risks of my carelessness, oversight or lack of time through their diligence, involvement in detail, and application of procedure.

          But obviously you know it all, so why hide behind AC?

      2. This post has been deleted by its author

      3. Anonymous Coward
        Anonymous Coward

        Re: Problematic updates are normal?

        Having worked in an RBS company, RBS have that level of paranoia, and it was a complete PITA to do any software releases to a live system.

        So this raises the questions: did RBS GT not follow their own procedures? Or, given the amount of hassle that's involved, did they try to short-cut the process? Or perhaps it all became just a form filling/box ticking exercise? I have experience of the latter...

      4. Anonymous Coward
        Anonymous Coward

        Re: Problematic updates are normal?

        RBS were (or still are, I believe from my colleagues since I was off-shored) very much into change management, as well as any other red tape that can be put in the way of a techy doing their job. As to implementation plans and back out plans - yup they like those as well, and while wordy/complex they are actually well laid out, complete with back out stages and back out plans.

        Of course with most of the senior techies gone, along with most of the other UK techies, the quality of those who write them, and proof read them, may have gone down considerably.

      5. Anonymous Coward
        Anonymous Coward

        Re: Problematic updates are normal?

        How often are those back-out plans, and the "how to recover the service should the back-out plan fail" plans, actually tested before the change takes place?

        Pretty much never, because they're impossible to test for.

        Generally they're just finger in the air style guesses. Yet they're still good enough to give change controllers a warm fuzzy feeling...

    4. Wensleydale Cheese
      Unhappy

      Re: Problematic updates are normal?

      @Evan Essence

      "Really?"

      Yes. We are talking about CA here (and I have the scars), but the same principle should apply to all third party software. Even top quality products can break in your environment.

      What nobody seems to have done yet is ask whether that CA software update was tested first in a non-production environment.

      A proper test environment does not mean a machine with the bare essentials on it. You need to have a test environment which reflects the other products installed, naming conventions, data volumes, and in this case the number of jobs, that the production environment has.

      1. Evan Essence
        Stop

        Re: Problematic updates are normal?

        Most commenters to my comment seem to be missing the point. Yes, updates can fail, and yes, you have change control procedures in place. But look at the article again. Are problems *normal*? I would say that means, by definition, that *at least* 50% of updates fail. Really?

  20. Anonymous Coward
    Anonymous Coward

    Pressing the button

    Imagine the moment when one man's boney finger was right there waggling over the 'enter' key as a message saying "are you sure you want to disrupt 16.9 million accounts" came up on the screen. Only out-classed by the moment when he pressed it.

    1. Destroy All Monsters Silver badge
      Terminator

      Pressing buttons is serious business!

      Unfortunately, you would need Wintermute running the system to get that kind of error message.

  21. Anonymous Coward
    Anonymous Coward

    No evidence

    "The CEO of RBS Group, Stephen Hester, has said that there is no evidence that the problem is connected to lack of investment in technology at RBS and the outsourcing of IT jobs to India".

    And there won't be any evidence if he has anything to do with it.

  22. Anonymous Coward
    Anonymous Coward

    Don't believe everything you read

    I have seen the incident record from when this started (17/6) and it isn't an Indian name on the ticket for the backout procedure (not until the job got handed over at any rate).

    An upgrade from v11.1 to v11.3 of the CA7 software went wrong though, that much is clear.

    1. Anonymous Coward
      Anonymous Coward

      Re: Don't believe everything you read

      Can you let everyone know the IR number then? Also it would be worth sending the information anon to a news organisation. People need to know the truth after all.

      Unless you come from the RBS PR department of course.

    2. Steve Davies 3 Silver badge

      Re: Don't believe everything you read

      Not an Indian name in sight on the ticket?

      So you have never had those phone calls from John, Mary, Peter and the like who by the accent of their voice have to be from somewhere like Mumbai, Bangalore, Kolkata or even Delhi.

      Surely some of them will have migrated to RBS Support by now.

      (Strictly tongue in cheek naturally)

      1. Anonymous Coward
        Anonymous Coward

        Re: Don't believe everything you read

        Considering the high profile nature of this incident I don't think I should reveal anything more, and I didn't say that there wasn't an Indian name in sight on the ticket, just not straight away when the backout went awry. Doesn't necessarily mean they weren't involved of course, but the only people who know 100% for sure are the people directly involved, i.e. the ticket doesn't indicate that anyone in India was part of the initial incident response - so beware of people saying they 'know'.

        1. Anonymous Coward
          Anonymous Coward

          Re: Don't believe everything you read

          AC 14:13 = Stephen Hester

          Hoping the DM or Telegraph are still reading El Reg comments.

    3. Anonymous Coward
      Anonymous Coward

      Re: Don't believe everything you read

      I'm sure I'm not the only one who would love to see all the details and updates on that incident! Someone's gotta leak it surely?!

    4. nsld
      Paris Hilton

      Re: Don't believe everything you read

      I got a call once from "Will Smith" who was clearly in an Indian call centre trying to flog me something.

      I sympathised with him that Independence Day was a shit awful film but hadn't realised his career had tanked so badly.

      I seem to recall in Young Frankenstein they got the brain from Abby Normal, was she the name on the ticket?

    5. This post has been deleted by its author

      1. Anonymous Coward
        Anonymous Coward

        Re: Don't believe everything you read

        O'rly?

        I too have seen the change and know why it was raised. But yes, there is no mention of who formatted what queue, only the rather English name of who discovered that.

      2. Anonymous Coward
        Anonymous Coward

        Re: Don't believe everything you read

        O'rly.

        I too have seen the change record and why it was raised. I have also seen the incident. There is no mention of who formatted something. So yeah, no one is responsible 'cos it's not in the incident.

        1. Anonymous Coward
          Anonymous Coward

          Re: Don't believe everything you read

          It's irrelevant who started what and when, who initiated a rollback and how, or where the servers were. Inexperienced outsourced staff were supervising the batch jobs. It would, or should, have been their job to raise the flag; instead it seems they happily watched over a disaster happening until it was too late. Experienced staff would have been more on the ball and realised something was wrong earlier. Yes, things do go wrong, it's inevitable, however it's how they are handled that matters.

          The questions to be answered should be when did the cock-up start, how long was it before someone raised an alarm and who was supervising the process when it went tits up. The latter seems to be pretty clear.

          So to those who've seen the incident reports maybe you could answer those questions instead of trying to create a smoke screen to protect your beloved leader.

  23. Anonymous Coward
    Anonymous Coward

    Am I the only one?

    I have a NatWest account and have not experienced any issues at all. Had a couple of PayPal deposits come in and taken some cash out, everything has been normal the last week.

  24. Christoph Silver badge

    Only one thing they can do

    Only one way they can deal with the bank executives who got rid of all the experienced IT staff.

    Increased bonuses all round!

  25. John G Imrie
    Facepalm

    A serious error committed by an "inexperienced operative"

    s/an inexperienced/our most experienced/

    Especially as commentators have pointed out that RBS has sacked, sorry I meant performed an involuntary reverse strategic asset increase, of the most experienced members of its workforce.

  26. roadsidepicnic
    Holmes

    "We offered the company an opportunity to confirm that the critical blunder was committed by a UK-based rather than an India-based operator.

    However the bank's spokesmen refused to offer any further comment."

    None needed, their refusal to comment provides the answer.

  27. Nash_Equilibrium
    Thumb Up

    Mainframe Madness.

    Ha, CA are still promoting May Mainframe Madness. "Extended by Popular Demand for a Limited Time Only!"

    http://www.ca.com/us/lpg/May-Mainframe-Madness/May-Mainframe-Madness-2012.aspx

  28. Anonymous Coward
    Anonymous Coward

    It's hard to work out how they have messed this up so badly. If I remember correctly, when installing CA-7 you set an option on whether it keeps everything or initialises from new. When backing out the software update they wouldn't restore from a backup but reinstall the previous version, and maybe they messed this bit up? This part was most likely done in the UK, and in India they manage the batch schedules. If this was left unnoticed for a few days then they have a major issue. It will be a spider web of feeds and dependencies.

  29. frankfrankerton
    WTF?

    LOL again this is pure speculation, I hope you aren't paying your "source" much money because this is totally fabricated.

    1. Anonymous Coward
      Anonymous Coward

      Damn straight

      Frank Frankerton is the voice of authority in this matter! Why, only this morning he informed me it was the work of Israeli saboteurs. Or was it Reticulan saboteurs. Anyway, the important thing is he's on the internet and he knows The Truth!

  30. Anonymous Coward
    Anonymous Coward

    I wonder

    If the investigation subsequently discovers the root cause was inexperienced offshore/outsourced staff and the monetary loss was equal to or greater than the offshore/outsource savings then will the bank consider bringing the functions back onshore / in-house?

    Just saying...

    1. Fatman Silver badge
      FAIL

      Re: I wonder

      "...then will the bank consider bringing the functions back onshore / in-house?"

      NOT A CHANCE IN HELL - Manglement will not give up its bonu$e$.

      1. tfewster Silver badge

        Re: Re: I wonder

        Year 1: We outsourced our service to save money! Bonuses all round!

        Year 2: We insourced our service to improve service! Bonuses all round!

        Repeat.

        Funny how my spellchecker recognises "outsourced" but not "insourced"

  31. Anonymous Coward
    Anonymous Coward

    After a lifetime working with mainframes and distributed Windows and *nix servers...

    "A complicated legacy mainframe system .. " is seldom the problem. The applications are usually well implemented and optimised to the hilt. Administered by people who know the solution inside out and have developed their own tools to administer the systems. Expensive, though, and prone to loss of expertise through the aging out of the experts. None of the bright young kids want to work in that area.

    Treat the mainframe as just another distributed server ("until we can get rid of the expensive legacy stuff a bite at a time...") and you are looking for trouble. We switch from our last mainframes next month (also a banking service company) and I'm glad, but also a little sad after 40 years of dealing with their arcana at various employers.

  32. Marvin O'Gravel Balloon Face
    Headmaster

    CA added that RBS's technical issues were "highly unique to their environment".

    Can I put my pedant's hat on here and declare that "unique" is a binary term?

    1. Anonymous Coward
      Anonymous Coward

      Re: CA added that RBS's technical issues were "highly unique to their environment".

      Pfft. If you're not highly declaring a grammatical issue, then no-one is going to care.

    2. frank ly
      Headmaster

      Re: CA added that RBS's technical issues were "highly unique to their environment".

      I fully agree with your obvious point, but I think the correct word is 'absolute', not 'binary'. (Absolute: free from restriction/limitation/qualification)

      1. M. Poolman
        Headmaster

        Yes, but

        That is now a tautology. Something is unique or it is not, hence any qualifier must either be a contradiction or tautology, and so shouldn't be used.

        1. Anomynous Coward

          Re: Yes, but

          Disagree.

          Each installation of this software will be unique but if left in the fresh-from-the-box state they may only differ from each other trivially.

          If the uniqueness can be trivial it can also admit of other distinctions, and if this instance differs greatly from the possible trivially unique norm then there's an understandable sense to the phrase 'highly unique'.

          It might have been preferable to use the word 'particular' so it may be a stretching of the language but I do not think it breaks any useful rule.

    3. Anthony Cartmell
      Happy

      Re: CA added that RBS's technical issues were "highly unique to their environment".

      Not so: http://oxforddictionaries.com/definition/unique

  33. Purlieu

    You're new here, aren't you

    RE: Why, precisely, does one mess-up by one employee in front of one computer put your ENTIRE BANKING SYSTEM out of action, nationwide?

    Have you seen what happens on the M25 when a car breaks down in one lane ?

  34. JMB

    Heston was on the radio this morning going on about what high quality the staff are in India.

    I remember in the early days of outsourcing to India someone from BT said something similar, he was asked why they do not outsource management to India if the staff there are such high quality.

    1. Robert Carnegie Silver badge

      Hester shurely?

      Maybe you're thinking of Peston, the friendly face of financial disaster who works as a financial journalist at the BBC, or of Heston of suicidally reckless hotch-potch and self-combustible pudding fame.

  35. Anonymous Coward
    Anonymous Coward

    A copy of a very interesting CV

    Deleted today from Linkedin. Lucky there is a copy:

    http://cantankerous.co.uk/?p=747

    I wonder why RBS or Infosys wanted him to delete his CV in such a hurry?

    He's even in London now, able to work on 'UK-located' software...

    1. Anonymous Coward
      Anonymous Coward

      Re: A copy of a very interesting CV

      Very interesting

      I expect the grauniad may have an exclusive scoop,

      but at least the Daily Fail will give credit to AC 14.00GMT

      :-)

      1. This post has been deleted by its author

        1. Anonymous Coward
          Anonymous Coward

          Re: The Guardian is spineless

          Why? So somebody who for all you know had literally nothing to do with this mess gets their name muddied for the rest of their career?

          Piss off, your blog is shit and you're a mean-spirited vindictive cunt.

          1. cantankerousblogger
            FAIL

            Re: The Guardian is spineless

            Or, perhaps, another way to look at the CV is that it shows how mainframe batch support has been largely shifted to India and that outsourcing firm Infosys had a central role in that process, contrary to the public assurances of the CEO of a virtually-nationalised basket-case bank? Never mind the huge inconvenience and cost to millions of people as they were unable to get to their own money and the thousands of staff sacked in the UK. You're right, though, about this being about more than one technical person, and that wasn't the point of what I posted, so I've deleted his name. You're welcome to your own opinion. Obviously I don't agree with you.

            1. Anonymous Coward
              Anonymous Coward

              A more relevant question

              A more relevant question to ask is who fired all the more experienced and mature staff and then outsourced 'Batched Processing' to someone with only four months experience with the system.

    2. Callum
      Unhappy

      Re: A copy of a very interesting CV

      CV taken down to add the line:

      + created close to 75,000,000 hours extra work for the bank and its customers

  36. Jason Hindle

    So, an Error 101 then

    Corporate entity expected a reap/sow dysfunction where no such anomaly exists.

  37. Templar

    If anyone fancies fixing RBS' woes:

    http://www.jobserve.com/CzbNE/

    mmm... market rate.

    1. Nash_Equilibrium

      "People Management skills to manage a "virtual" team of technical, engineering specialists, external suppliers and consultants,"

      Emphasis on virtual theirs.

  38. John H Woods
    FAIL

    {in}experience of ONE person...

    ... is not a good enough reason for this fiasco.

    Consider: "a lone terrorist hacked the system / posed as an employee / whatever and caused a banking crisis affecting nearly 20m people for nearly 2 weeks"

    Even if the poor bugger had done it ON PURPOSE there is NO EXCUSE for what happened next. It is a system failure on every conceivable level and therefore the ultimate blame must absolutely lie with management.

  39. Scott 19
    Facepalm

    'CA Technologies - the makers of the CA-7 software at the heart of the snarl-up - are helping RBS to fix the disaster that has affected 16.9 million UK bank accounts.

    "RBS is a valued CA Technologies customer, we are offering all assistance possible to help them resolve their technical issues," a spokeswoman told The Register.'

    ie: RBS are paying the overtime, oh wait that's wrong, the taxpayers are paying the overtime.

  40. Ru
    Headmaster

    "highly unique"

    Well, this sounds like it should be of concern to CA's other customers. After all, if the problem is merely highly unique, who knows who else it might affect... some company to whom the issue is only mildly unique, perhaps? It is very telling that they didn't say that it was uniquely unique to RBS!

  41. Nelsons_Revenge
    Holmes

    Brave New World...?

    If this is proven to be down to inexperienced offshore IT then I think all western governments should be very very concerned/alarmed. I guess that it would be very easy for any terrorist organisation to infiltrate Indian IT companies with an aim of co-ordinating similar attacks from within a UK/USA/other company's off-shored IT systems to bring down banks, insurance companies, supermarkets etc... Total fuck up with little expense. Any CTOs continuing to state that Indian offshore IT staff are bringing much needed experience (especially on the mainframe) should be sacked, or at least shot...

    1. Destroy All Monsters Silver badge
      Facepalm

      Re: Brave New World...?

      Really, don't give the Sociopathic Sick Control Freaks in charge any ADDITIONAL ideas. Ten years ago, someone successfully served cheap blowback - successfully mainly because of rampant careerism and arse-covering inside the FBI, with possibly some interference run by the Ones Who Shall Not Be Named - and we have been oozing towards the Mister Moustache Situation at accelerating speed. Probably will have to back out of this using crowbars and projectile weapons, no restoring from tape here, nope.

      1. Anonymous Coward
        Anonymous Coward

        Re: Brave New World...?

        You are amanfrommars1 and I claim my five pounds.... Just don't try paying it into RBS, thanks?

  42. dubno
    Coat

    off to get more supplies

    I have run out of popcorn !!!

  43. Anonymous Coward
    Anonymous Coward

    Back in the 80s...

    I worked for a computer bureau that handled processing overnight of foreign exchange transactions for a number of overseas banks in the City. It was my first major job after leaving university.

    There was a team of three of us and a manager that looked after a horribly convoluted mass of mainly COBOL programs. The original authors had long since left the company and no one now really understood quite how everything fitted together.

    Each night, the system ran to update everything with all the deals done that day to produce the reports on which the banks would then base their deals the next day. We would take it in turns to be on overnight call to patch up the system when it went wrong. Hardly a week went by without one of us being woken up in the middle of the night by our beeper.

    Anyway one night the system completely crapped out. Normally the problems were relatively trivial but on this occasion it was a major crisis. The banks went without their reports in the morning and millions of pounds were at risk. We'd tried everything. Even recovering the system back a whole 48 hours and trying to rerun everything didn't work. The strange thing was, we hadn't done any updates to the system recently.

    By lunchtime the next day I was ready to give up. My manager promised me a big pay rise if I could sort it out. Finally over 24 hours after the original failure I tracked down the cause. One of the live programs referred to a file in the test environment, I'd recently done some tidying up in the test area and deleted that file...

    I explained that the problem had been caused by a corrupt file, without going into details... everyone was very thankful that the system was running again, I got my pay rise for sorting out the problem and nobody ever discovered my guilty secret...

    Anonymous for obvious reasons!

  44. Anonymous Coward
    Anonymous Coward

    Disconnect

    That job posting is for a batch administrator. Unless somebody is very very very stupid that job would NOT entail upgrading CA7.

    So both could be true: batch operations could be entirely in India, while software support remains somewhere else.

    Anonymous mainframe guy accountable for 40,676 MIPs of z/OS R1.10.

    1. Anonymous Coward
      Anonymous Coward

      Re: Disconnect

      Moot point. I drive a car. I am NOT a mechanic.

      However, if my mechanic put my gearbox in back to front or left a wheel off or something, I'd like to think that I, as the driver, might notice before I drove off at speed - three nights running according to some.

  45. Anonymous Coward
    Anonymous Coward

    I have met some brilliant offshore staff but

    the last interaction was with some BT mahindra sysops.

    We asked for a patchlevel output for a large sun server. We knew something was odd when the senior sysadmin (who supposedly built, tuned and managed the cluster) asked via email how you did this. We sent him the command and he sent us the response and closed the support ticket.

    The response was NOT a list of installed patches... it was a one liner that this senior sysadmin

    for a business critical billing/reporting system was quite happy with.

    Segmentation fault: core dump ....

    Let's just say SHF at the next project review meeting :-)

  46. Anonymous Coward
    Anonymous Coward

    WOW, some of you have obviously never dealt with offshored/outsourced staff in India. As someone who regularly has and does, I can promise you that scenarios where the staff running a system don't have any clue about how the system works aren't unusual. The idea that any of them would have a clue where to start in recovering an application, let alone a system, is just laughable.

  47. Rufus McDufus

    "Re: no backup of the schedule?

    What; you've never accidentally deleted an entire file system by rushing and hitting a wrong key?!"

    Basing your entire company's operations even on a single 'entire filesystem' could be construed as a little risky.

  48. Stephen Channell
    Flame

    "An inexperienced person cleared the whole queue" red herring

    a junior operator might have been the root-cause, but to trash the database and all backups sounds more like half the batch had run when the backout decision was taken and the queue needed to be reset to work from restore. bitch all you like about ITIL, but a senior change manager would have taken the backout decision, not an operator.

    Say what you like about Fred Goodwin, but this would not have happened on his watch..

    .. and why was a software upgrade happening mid-week, if not to save on overtime costs for Sunday?

    1. Anonymous Coward
      Anonymous Coward

      Re: "An inexperienced person cleared the whole queue" red herring

      It didn't happen mid-week, it happened on Sunday the 17th June

  49. Mike Richards

    Anyone checked

    Where the BOFH is working these days?

  50. asdf Silver badge
    FAIL

    sorry to repeat

    May have been discussed already but don't feel like crawling through shit flood of posts but why in the f__k for something this critical would you not have a staging system as completely identical to production as possible? 99% of all problems requiring a back out should have been caught in staging, which allows you to avoid the $10 a day drone ever touching production systems. Their parents were right: Baby Boomers and their shit executive leadership were going to screw over the western standard of living.

    1. asdf Silver badge
      FAIL

      Re: sorry to repeat

      Ok, 99% is an exaggeration, but the vast majority. There should have been virtually zero surprises upgrading to new software, as it should have been done at least twice already: on a development system with any old foo data (but a significant amount), and on staging with somewhat real data and hardware identical to production. Thorough QA should have been done and then of course the move to production should have been led by a very senior techy. Very basic stuff that they knew to do in the 1960s. It's what happens when you have someone who should be asking "you want fries with that?" instead saying "I know how to make glorious software".

      1. on the sideline
        Facepalm

        Re: sorry to repeat

        Spot on with the message but a bit near the mark on the delivery...

        As one of the unaffected NatWest customers I was impressed with the amount of effort they have made to personally deal with the issues at a local branch level with the extended opening hours etc.

        But the fact remains, in any IT support environment change control and QA should have prevented this from happening. Even then, where is the DR plan? And the backup to allow a seamless rollback process?

        Stinks of cost cutting (read: skills) outsourcing to me.

  51. Anonymous Coward
    Anonymous Coward

    This is not new, and (CON)sultants still keep pushing outsource

    If this were the first time a large company's IT went to pot in a push for outsourcing and down-sizing of its IT functions, then I might have some sympathy for RBS, but it's not.

    10 years ago Zurich (UK) under CIO Ian Marshall put through a whole raft of staff redundancies as part of the push to an outsourced and down sized environment. They ended up with a number of unsupported systems, and a number of poorly supported systems as a result.

    Every year sees at least one other company bitten in the ass by following the outsource everything magic wand hogwash pushed by the "management consultancies".

    Outsourcing has its uses, but across the board outsourcing of critical business functions comes under the heading of dumb idea.

    1. asdf Silver badge
      FAIL

      Re: This is not new, and (CON)sultants still keep pushing outsource

      Who cares? The C Suite are a bunch of sociopaths in it for short term gain only for themselves. If they screw up they get a golden parachute and their buddies move them into another opportunity. If they don't they leave anyway in a few years tops to pursue other opportunities (ie. run away from screw ups).

      1. J.G.Harston Silver badge

        Re: This is not new, and (CON)sultants still keep pushing outsource

        If they're so passionate about it, why don't we outsource the consultants?

  52. Anonymous Coward
    Anonymous Coward

    Goodness gracious me

  53. Anonymous Coward
    Anonymous Coward

    We had most support centre staff offshored

    Yeah, McKinsey those shower of useless cunts managed to fool another sucker executive. Now people in the organisation go to Google and try to fix the problem themselves or try and live with it. Idiots in India will make it worse and then say that your computer needs to be reimaged.

    I heard an interesting story ..... one of our guys was over in Mumbai and was being shown around the call centre. It turns out that as soon as the offshoring company trains up people and they have had 6 months experience they start asking for 30% pay rise and if they don't get it they go to some other place looking for people trained on the product. The end result was that every time someone in the UK rings up looking for help they got some inexperienced person with <6 months on the job..... who ended up ballsing up their computer and getting it reimaged.

  54. sugerbear

    RBS to sue CA

    According to the FT RBS is considering taking legal action against CA. Unless CA were actually managing the change I don't see how this would work.

    Being a cynic I would say that the RBS PR machine is again at work trying to throw some mud around in the hope that some of it sticks.

    Of course if they do it really will be a "drains up" moment for RBS if they press, so not a bad thing. I suspect it is PR bluster.

    http://www.ft.com/cms/s/0/b03dd574-bf8e-11e1-a476-00144feabdc0.html?ftcamp=published_links%2Frss%2Fcompanies%2Ffeed%2F%2Fproduct#axzz1yvaWQwnl

    1. This post has been deleted by its author

    2. mark daly

      Re: RBS to sue CA

      Best of luck with that. Unless it is a problem with the software that CA knew about but had failed to document on their Customer Care site then RBS have not got a prayer of winning a court case, particularly if they heavily customise the software. I am pretty sure that the supplier will be able to point to a number of successful implementations of the product and will be able to show that if RBS/NatWest staff had actually read the install instructions together with the associated product manuals and checked for any known issues they would not have had the problem. Maybe I am old fashioned but I thought this sort of due diligence was part of an IT technician's job.

    3. This post has been deleted by its author

    4. Anonymous Coward
      Anonymous Coward

      Re: RBS to sue CA

      Suppose, hypothetically, that the patch/update was faulty. Suppose, hypothetically, that the person responsible used the "backout patch" function which had been used previously, tested previously, but for this particular patch revision was incorrectly implemented and actually damaged/destroyed the underlying database. Would a client be expected to test the patch backout function for every patch revision issued before using it?

      1. Anonymous Coward
        Anonymous Coward

        Re: RBS to sue CA

        We can all hypothetically guess at what had happened. Information supplied by a bunch of disgruntled RBS employees that can tell you what probably happened, curious IT types that still work at RBS and can use the incident search function and a handful of people that "were there, man". Somewhere though there is 1 person that can tell you what they thought should have happened. I want to see his post.

        I fear for this person. Soon his CV will be flying across some recruitment agent's desk. Next they will be sat next to you. Your "jobs a carrot" little support contract turns into a nightmare.

        FYI I am not involved in any way. I just have been dealing with this crap for years. Long may it continue and keep us in contract jobs, F40s and lambos, cheap women and tax avoidance schemes.

        Carry on!

      2. Anonymous Coward
        Anonymous Coward

        Re: RBS to sue CA

        Yes, of course. That's what Pre-Prod (UAT or whatever) Test is for. Test the deployment, check it works as designed, *and* test the backout. If the backout f*cks you up, you don't deploy the damned fix, purely because of the risk that you might need to do it in live. You can insure yourself against it (hot standby systems etc) - but ultimately it comes down to a balance of chance of it going tits-up, against consequences if it does. In this case, the consequences were pretty damned big.

        Now, if a backout that was successfully tested against a live-like environment failed in live, either someone didn't follow the process - or the live-like environment wasn't fit for purpose.
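        The "test the deployment *and* test the backout" idea can be sketched roughly as below. This is only an illustration in Python, not CA-7 or anything RBS actually runs: `apply_patch`, `good_backout`, `bad_backout`, `backout_is_safe` and the dictionary standing in for a scheduler database are all made-up names for the example.

```python
import copy


def apply_patch(state):
    """Hypothetical patch: bump the version and add a new field."""
    patched = copy.deepcopy(state)
    patched["version"] = 2
    patched["new_field"] = "added by patch"
    return patched


def good_backout(state):
    """Hypothetical backout that restores the pre-patch layout."""
    restored = copy.deepcopy(state)
    restored["version"] = 1
    restored.pop("new_field", None)
    return restored


def bad_backout(state):
    """Hypothetical faulty backout that also wipes the job queue."""
    restored = good_backout(state)
    restored["queue"] = []
    return restored


def backout_is_safe(original, patch, backout):
    """Patch and back out a copy; True only if the state is restored exactly."""
    snapshot = copy.deepcopy(original)
    round_trip = backout(patch(copy.deepcopy(original)))
    return round_trip == snapshot


# Stand-in for a live-like pre-prod environment (assumed data).
pre_prod = {"version": 1, "queue": ["job_a", "job_b"]}

print(backout_is_safe(pre_prod, apply_patch, good_backout))
print(backout_is_safe(pre_prod, apply_patch, bad_backout))
```

        The only point of the sketch is that the backout gets exercised against a copy first, and the fix is rejected if the round trip does not give back exactly what you started with.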

        In either case, it's valid grounds for the el reg lynch mob to get their pitchforks out...

    5. Dun coding

      Re: RBS to sue CA

      This is a classic piece of FUD and will be quietly dropped somewhere down the line. "Let's spread the shit as wide as you can and hope that some of it sticks to someone else!"

      This is like trying to sue the manufacturer of your screwdriver because you didn't do a screw up tight enough.

    6. Anonymous Coward
      Anonymous Coward

      Re: RBS to sue CA

      RBS suing CA?

      They had better be very careful there. I once worked at a place that managed to upset CA.

      I don't know how they managed that, maybe management was looking for an excuse, but CA's response was to cancel all licenses and refuse to supply the customer with any more.

      We already had alternatives in place so it wasn't a big deal, but for RBS?

  55. Steve Button
    WTF?

    WARNING NEEDED

    Offshoring massive parts of your IT Department could lead to serious fuckups. Are you sure you want to continue (yes / no) ?

  56. Anonymous Coward
    Anonymous Coward

    5 November 2011

    The same thing happened before on Nov 5 2011 right?

    If so, why weren't lessons learned so this couldn't happen again? (I think we all know the answer to this!)

    The mainstream press seem to have conveniently forgotten about the 5 November incident!!

  57. milo108

    Milo108

    Some years ago I helped RBS work through a storage device failure which led to their having to recover a DB2 database. The outage lasted for several hours, ATMs were unavailable and they made the BBC news.

    The failing device was fault-tolerant and yet a sequence of events which imho nobody on the planet could have predicted caused the unit to fail completely.

    The bank's staff worked tirelessly and diligently over many days to work with the vendor to pinpoint the root cause and then replace modules in similar devices over a planned period.

    These were highly-skilled IT professionals who had the knowledge to collaborate with the vendor to work through the problem

    Fast forward to 2012. If this really is a case of inexperienced operators or systems programmers blundering with CA-7, a mature scheduling solution, then I agree that they have reaped what they have sown. Core banking systems are complex beasts and require more expertise than Java, MySQL and Linux.

    You get what you pay for.

    1. Sir Runcible Spoon Silver badge
      Joke

      Re: 5 November 2011

      "The mainstream press seem to have conveniently forgotten about the 5 November incident!!"

      They should always remember, remember, the 5th November!!

  58. Anonymous Coward
    Anonymous Coward

    FSA need to act

    The FSA need to be checking all UK banks & building societies that have been outsourcing or offshoring core IT functions to check they're fit for purpose.

    A number of outfits are watching this saying 'there but for the grace of god', as they watch the fuse burning down on their own unsupported legacy timebombs.

    A couple more catastrofucks like this will do permanent harm to confidence in our retail banking system. FSA get a grip on the situation now.

    1. Anonymous Coward
      Anonymous Coward

      Re: FSA need to act

      Haven't been in investment banking IT for a while but are there any "offshore" supported investment systems?

      1. Anonymous Coward
        Anonymous Coward

        Re: FSA need to act - Offshore Investment Banking IT

        JP Morgan - very messy

  59. QuiteEvilGraham
    Megaphone

    Well boys & girls

    The Today program on R4 this morning (Wed 27th) Let's have a listen.

  60. Winkypop Silver badge
    FAIL

    RBS: "No evidence" this is connected to outsourcing

    Well, no evidence you are going to see anyway....

    1. Tom 38 Silver badge

      Re: RBS: "No evidence" this is connected to outsourcing

      He keeps saying 'no evidence this is connected to outsourcing', but is keeping mum about whether it is connected to offshoring.

  61. kapple999

    RBS don't seem to understand basic book-keeping

    RBS is reported as saying that all Customer Accounts are back to normal. They may well be from a monetary perspective, but the data still remains corrupt.

    I now have on my Account 3 BACS credits, one for nearly £500, all with a generic description of “RBS TRANS 220612”, which fails to identify the Customer or the Invoice number.

    As I have over 200 Customers who might have paid me last week, I can’t be sure which ones these represent.

    Furthermore I now have a Direct Debit for over £220 with the same description “RBS TRANS 220612” even though it was raised on 25th June. I don’t know who has raised the Direct Debit, nor what for – I’ve never had a D/D for this amount before.

    I presume some programmer somewhere has simply decided to apply a generic description when they’ve lost the original description – the Bank balance is presumably now correct, but it means I can’t reconcile my Ledgers.
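    To show why a generic description makes reconciliation impossible even when the balance is right, here is a small Python sketch. The invoice numbers, customers and amounts are invented for illustration and `candidates` is a hypothetical helper; only the "RBS TRANS 220612" text comes from the statement described above.

```python
def candidates(credit, open_invoices):
    """Return the open invoices that a bank credit could plausibly settle."""
    by_ref = [inv for inv in open_invoices if inv["ref"] in credit["description"]]
    if by_ref:
        return by_ref
    # No usable reference in the description: fall back to amount alone.
    return [inv for inv in open_invoices
            if inv["amount_pence"] == credit["amount_pence"]]


# Invented ledger entries, amounts held in pence to avoid rounding issues.
open_invoices = [
    {"ref": "INV-1001", "customer": "Customer A", "amount_pence": 12000},
    {"ref": "INV-1002", "customer": "Customer B", "amount_pence": 12000},
    {"ref": "INV-1003", "customer": "Customer C", "amount_pence": 49900},
]

normal_credit = {"description": "BACS INV-1002", "amount_pence": 12000}
generic_credit = {"description": "RBS TRANS 220612", "amount_pence": 12000}

print(len(candidates(normal_credit, open_invoices)))
print(len(candidates(generic_credit, open_invoices)))
```

    With the reference intact the credit maps to exactly one invoice; with the generic text it could belong to either customer who owes that amount, and with 200 customers the ambiguity only gets worse.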

    1. Anonymous Coward
      Anonymous Coward

      Re: RBS don't seem to understand basic book-keeping

      Oh ffs, so they've not just created a right royal fuck-up in the first place, but now they're fucking up the recovery?

      Jesus wept, the first fuck-up was perhaps bad enough, but this is inexcusable.

    2. cantankerousblogger
      Black Helicopters

      Re: RBS don't seem to understand basic book-keeping

      Your post is really important and I linked to it here:

      http://cantankerous.co.uk/?p=779

      I was really alarmed when I saw the 99% rather than 100% figure quoted in the media and the situation appears to be even worse at Ulster Bank. The journalists seemed quite happy to take the RBS spin at face value but I thought "My God, RBS must have lost or corrupted data when it attempted recovery". 99% recovery is a failure, not a success. It is hard to believe RBS is listing transactions with only a date rather than some unique identifier. I don't think we can assume that the balance is now correct. Do we trust RBS to get it right? Who audits this? One per cent is an awful lot of money. This could be terminal for RBS, so you can see why Mervyn King is so involved and angry. This is the utility side of banking that people need to be able to have confidence in. Expect this story to explode, eventually, in the mainstream media.

    3. QuiteEvilGraham

      Re: RBS don't seem to understand basic book-keeping

      Well, their batch run is intended to be overnight. This implies that file allocations in all the various jobs which comprise the batch are designed for the typical volumes they expect (IIRC they have a double-sized run one night, since there is one atypical run in the schedule, might be Saturday or Sunday, can't remember now).

      So they will have to stage the now greater than normal volumes of data somewhere in the meantime. Again this will be atypical, and one can only speculate exactly how it is done. They won't have done it often, although it may well have been tested out.

      It may well be this process which is putting in the generic descriptions described by the poster.

      Of course, if these transactions are masking descriptive data which other parties rely on for their own processing (which is what it sounds like), then the knock-on effects will be, to say the least, unpredictable.

  62. Anonymous Coward
    Anonymous Coward

    Due to lack of experience he didn't get "the jitters"!

    After 25 years in IT I get "the jitters" when I go near a live system, that little bit of fear at the back of my mind, a shedload of reminders of the stupid mistakes that experience has taught me. It's that, that makes me think twice, check twice before I bang enter on a potentially dangerous command on a live system.

    1. Anonymous Coward
      Anonymous Coward

      Re: the IT "jitters"

      > After 25 years in IT I get "the jitters" when I go near a live system

      I'm pleasantly surprised every time I switch on my `computer' and it actually boots up .. :)

    2. FreeTard
      Thumb Up

      Re: Due to lack of experience he didn't get "the jitters"!

      I hear ya mate. Anytime I go near systems for which operations I have done thousands of times, and have a number of backup procedures in place - I still get the shits.

  63. Titus Technophobe
    Thumb Up

    Driving the news .....

    It is interesting for a change to see that The Register, and even the comment boards, seems to be driving the news on the RBS computer glitch.

    The CA 7 software, the advert for the man in Amberpet, and so on all seem to have appeared on the comment boards, the various more traditional newspapers, and finally the BBC.

    It would be nice if RBS actually said what went wrong. Aside from the sheer length of the outage caused by this computer glitch, it seems even more astonishing that nobody senior at the bank has any idea what caused the problem.

  64. Sirius Lee

    RBS spokesperson said...

    Last night on C4 'news' an RBS spokesperson seemed to me pretty categorical that the source of the error and its management was in Edinburgh.

    1. Anonymous Coward
      Anonymous Coward

      Re: RBS spokesperson said...

      Well of course it was - I assume the boardroom is in Edinburgh?

    2. Anonymous Coward
      Anonymous Coward

      Re: RBS spokesperson said...

      The statement only seemed to refer to managerial control and oversight. No mention of the physical location of those actually undertaking the work.

      RBS's statement would appear to be of little value.

  65. Anonymous Coward
    Anonymous Coward

    Where are they now

    And now some of the brains trust that was at RBS are now at ANZ bank in Australia doing the same thing.

    Testing services are being outsourced to Capgemini

    They never will learn !!!

  66. Anonymous Coward
    Anonymous Coward

    Utter madness

    This was an accident waiting to happen. The IT outsourcing strategy in RBS is corporate suicide - An incident of this nature has been predicted by those working in IT for a very long time - RBS have dodged several bullets recently already, their luck has now run out.

    As always the management team are running around trying to sweep the outsourcing issue under the carpet - Mr Hester a word of advice challenge & challenge again your IT Managers for the real incident review not a fabricated one that covers up their fatally flawed outsourcing approach - Maybe also speak directly to the administrators to get a real understanding of the outsourcing issues & the risk this puts on your business. Sadly this will likely be the first of many incidents on this scale

    RBS get your head out of the sand or watch this space for further business impacting carnage caused by incapable offshore staff

  67. Anonymous Coward
    Anonymous Coward

    offshoring

    http://www.banktech.com/management-strategies/171201566 - sound familiar? a policy introduced by a certain Mr Teerlink whilst at ABN Amro, anybody wish to comment on the state of those services now?

    Where did Mr Teerlink go after ABN AMRO? surprise surprise he joined RBS not long before their massive offshoring strategy began

    http://www.rbs.com/about/board-and-committees/ron-teerlink.html

    This issue wasn't caused by offshoring, but would the recovery have taken 7 days + if all the experienced resources were still in the UK? This is the question Mr Hester and the FSA should be asking.

  68. Anonymous Coward
    Anonymous Coward

    bank doesn't care really

    I remember a policy from the US that any critical IT that could affect the security of the US was to be done in the US. Saw some trading systems move from the UK to the US.

    Obviously UK doesn't view anything that stops joe public from accessing money as critical. A bank casino fraction shaving system (did I just invent the term bcfss?) on the other hand and heads would be rolling, MI5 would be called in. The usual suspects rounded up/stuffed in bags etc. etc.

  69. Anonymous Coward
    Anonymous Coward

    Basic Skills required..

    Shame this single 'lower rank' bloke is being singled out. That's not cricket :)

    IT director should be fired. That half-aubergine probably went along with the outsourcing whilst on a golf course, i.e. removing people with countless years of skills, but more importantly didn't implement some basic contingencies. Granted I haven't worked on such a large system as RBS but I have rolled out to about 10,000 users and by all things holy, I always insisted that the old system ran in parallel. In full. Even if this meant some serious expenditure to re-create the live environment. The fall back environment doesn't have to be the all singing all dancing live version, just big enough to do the job. At least it will work - though perhaps not as fast.

    The bill RBS will get for this will far exceed the cost of setting up that fall-back environment. It's possible a System Designer did advise the environment but a bean counter could have rejected it due to cost. And what cost now?! They are indeed running on some old(ish) kit but that's no excuse. You can find ways to make sure that if something catastrophic happens, you can switch to another environment - perhaps belonging to another organisation....anything at all!

    The IT Director should have had a very close eye on this as it was so big. They weren't upgrading Microsloth Office internally, they were affecting 15million+ people. That's big!

    Sorry, got to be AC on this...a bit too close to the action....

    1. Anonymous Coward
      Flame

      Re: Basic Skills required..

      You should always be very, very critical to all you hear from the "management talent". In reality, all they really know is how to do politics. They don't even manage to consult with their subject matter experts and follow their advice.

      They have all sorts of funny, nicely worded Powerpoint files that are stuffed full with terms like "Business continuity strategy", "failover datacenter", "holistic security approach", "full redundancy". But if you talk to the people who would actually implement this, they will tell you three dozen horror stories about *reality*.

      For example, a major (if not the biggest) European railway operator has two datacenters for billing. Managers think they are redundant and if one burns down, the other takes over. System admins know this is not true. Then managers have bought Diesel Backup generators. But they say "we don't have time to ever try out the Diesels, as we cannot schedule possible downtime for that purpose". Last time the Diesels kicked in, they stopped almost immediately, because the power load was too much for them.

      So they claim to have something like a firebrigade, but the firefighters have never, ever done a training mission.

      I guess in the next couple of years there will be a few days of "free ride" on the network of that railway corporation, because they are simply unable to print any tickets.

  70. Anonymous Coward
    Anonymous Coward

    I can confirm Ulster Bank is still screwed

    My wife is the accountant for a small 30+ company who use Ulster Bank. They should have been paid on Thursday but by yesterday still hadn't. Ulster Bank then got them to redo the whole payroll live and everyone got paid. However they admitted there was still a chance the original payroll run may kick in again and everyone will get paid twice, although they will get the money back if that happens.

    I got paid today and everyone who banks with Ulster Bank didn't get any money in their account.

  71. Anonymous Coward
    Anonymous Coward

    Enough with the outsourcing bashing already.

    We've all made mistakes. People who say they have not are primarily lying to themselves, which immediately makes them unemployable in my book. It's one of my favourite interview questions.

    The question here is more along the lines of "how does the individual/company react/handle when they realise they've made a mistake". The right answer is to stick your hand up immediately and say "I messed up". When you're in a hole, stop digging. Save your terminal history. Step away from the keyboard young padawan.

    The second lesson here is that your rollback plan is just as important as your implementation plan (for systems this critical). It must be well understood by the implementer and peers and management. What DOES the organisation do when sh*t hits the fan?

    Culturally - I have noticed a tendency (far from exclusive to the Indian subcontinent mindset) for an error to be seen as personal failure or incompetence. What happens is the erree and his closest mates huddle up, formulate a private plan and keep tinkering.

    Now let's compare two worlds. OK, when banks mess up it inconveniences a lot of people, and bad things have happened to a lot of people. However, what would you say if something REALLY important (life support systems, realtime systems controlling traffic or trains or planes or the power/electricity/gas grid) suffered this kind of issue (which is primarily mismanagement, not human error)?

    So some junior had an accident. It happens to everyone. What is key is how well that organisation can mitigate it, limit damage and restore service. It looks like (change) management failed badly here. Nothing to do with the geographic location or nationality of the staff who had the accident.

    It could have been something as simple as the operator not being aware of something like this "alias rm='rm -rf'" (or whatever that is in mainframe speak). Nice little timebomb there.

    AC because I admit I've made mistakes requiring tape restores too.

    1. Anonymous Coward
      Flame

      Missing The Point

      The point people make here is not the technical glitch. The point is, management have removed all the competent people, who would have fixed this issue in a few hours.

      The western world has gone down the crapper because the modern-day elite is too intellectually challenged to run anything as complex as, or more complex than, a shithouse cleaning operation. I am sure these MBAs can even fuck this up.

  72. geekclick
    Pint

    Sky News Quoting el Reg

    Was watching last night during the 23:30 Press Preview. They were quoting el Reg and even mentioned the site and commended them for the story! Beer all round boys :)

  73. Bob Asic
    Paris Hilton

    Where's Paris Hilton when you need her?

  74. Anonymous Coward
    Anonymous Coward

    you cant silence them all

    People should compare and contrast how this story is explained as a 'glitch' when millions of people have been buggered up, and other dross non-stories that cause media and political outcry for months even though nobody has suffered.

    This one will run and run.

    There must be many CA7 experts that have been put out to pasture that can shoot holes in any feeble explanation.

    1. Cokezero

      Re: you cant silence them all

      Yep, here's one here. Lowly operatives in Hyderabad do not do CA7 patch sysprog updates. That's a good place to start.

  75. Tom 15

    Meh

    Let's be fair, we've all run a DELETE command in our time and forgotten to put a WHERE clause on... and it's one of those things you only ever do on a Live environment.
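
    One hedge against the missing WHERE clause is to run the DELETE inside a transaction and roll back if it touches more rows than expected. A minimal Python sketch using the standard sqlite3 module follows; the helper name guarded_delete, the max_rows limit and the payments table are all hypothetical, made up for illustration.

```python
import sqlite3


def guarded_delete(conn, sql, params=(), max_rows=1):
    """Run a DELETE, but roll back if it touches more rows than expected.

    guarded_delete and max_rows are illustrative names, not a library API.
    """
    cur = conn.cursor()
    cur.execute(sql, params)  # sqlite3 opens a transaction implicitly for DML
    if cur.rowcount > max_rows:
        conn.rollback()  # undo the delete before anything is committed
        raise RuntimeError(
            f"refusing to delete {cur.rowcount} rows (limit {max_rows})"
        )
    conn.commit()
    return cur.rowcount


# Hypothetical table and data, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, payee TEXT)")
conn.executemany(
    "INSERT INTO payments (payee) VALUES (?)",
    [("alice",), ("bob",), ("carol",)],
)
conn.commit()
```

    Calling guarded_delete(conn, "DELETE FROM payments") on this table raises and leaves the rows in place, while the same call with a WHERE clause matching one row goes through.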

  76. baldynapper
    Stop

    Governance

    Having worked in this area for some time, I'd say either the CAB didn't pick up that the back-out plan was flawed or the operative made a mistake. I would like to think it was the latter, otherwise all RBS mainframe changes are going to be halted, which would cause major problems.

    1. Anonymous Coward
      Anonymous Coward

      Re: Governance

      Interesting. I've sat on CABs, and have the benefit of twenty odd years of getting most stuff right, and some stuff wrong... It's called experience. I've also got the ITIL badge - and know how it should work...however...

      The problem is that most CABs don't contain people who have any in-depth technical knowledge of the matter concerned. They look at the change record, look at the track record of whether it's gone tits up in the past, look at the window to see if it'll fit - and then rubber stamp it if they've got a warm feeling. I once did a QA role for a major investment bank who'd gone through a particularly patchy change period. My role was to act as a gatekeeper, QA'ing changes in a particular area - with the intention of making sure that they'd been properly designed, tested and documented. The initial results were 90% rejected, 10% let through. And I was being fairly generous with the ones I passed...

      I've also had a lot of experience in dealing with Indian outsourcing (and offshored) staff. The good guys exist, but they're rare. Their cultural psyche seems to be that they will never, ever tell you that they don't know the answer to a question. They never admit to lack of knowledge. Ask them "will it be OK", or "are you sure you understand about the backout", or "do you know how to check whether it's succeeded?" and they'll invariably say "Yes", even when they patently haven't a clue. The only way of protecting yourself against this naive complacency is to (i) question them technically, and in great detail - so they're made very aware that bullshit won't wash... and (ii) employ BOFH's cattle prod. Sadly, RBS et al have let go of the guys who could ask those sort of questions, and they're unwilling to admit to their own mistakes, or apply the cattle prod either.

      Am I alone in thinking that these disasters are going to be just like buses? Stand back, there'll be another along in a minute...

      1. Anonymous Coward
        Anonymous Coward

        Re: Governance

        100% spot on - I left 3 years ago as the writing was on the wall - I'm extremely positive, but there comes a time when the Indian strategy just simply wasn't working - no one wanted to listen, and the CIO was blinkered in his views - I'm sure the 'day rate' saving of staff in India v London had the board drooling - Pity the 'real costs' weren't in that same strategy presentation and high-performing employees' views were stamped on.

        I still have plenty of good friends and colleagues at RBS, and the incident was just a matter of time. They have dodged a few bullets already: only a few weeks ago a potentially fatal application got deployed to 12,000+ servers! Business-critical services like this bodged deployment are now in the hands of offshore teams, interviewed by the London team, who were simply told to employ no matter what ("as long as their English is good enough" was the directive).

        Absolute madness - And yes mark my word; like buses there will be another major incident around the corner...

        I hate to say I told you so........

  77. Anonymous Coward
    Anonymous Coward

    Drone bullied into confession

    Am I the only one thinking of Ivan Dobski ? (monkey dust for the fools that missed it)

    I only said I done it cos they put my knackers in a toaster etc etc

  78. Anonymous Coward
    Anonymous Coward

    new bunker vid please

    Actually can we have a hitler bunker vid subbed with the RBS mess

    1. Fatman Silver badge

      Re: Hitler

      I am sitting here wondering if one of those so devilishly creative types would do a "Hitler parody" on this fuck up.

      I can just picture the subtitles now!

      Gentlemen, get out your word processors (damn, I just 'dated' myself, didn't I).

  79. raving angry loony

    Exclusive?

    Really? Because I saw this in the Daily Mail (yeah, I troll there...) yesterday.

    As for the rest, RBS is lying through its little yellow teeth if it claims it didn't happen because of outsourcing. They fired the people who knew what was going on, and hired a bunch of people who didn't, but were willing to work for far, far less.

    1. Anonymous Coward
      Anonymous Coward

      Re: Exclusive?

      Far Far less maybe !!! - But the CIO wasn't calculating Apples for Apples.......You get what you pay for...in this case a complete balls-up

  80. Cokezero

    Fascinating that anything typed here concerning the word offshore can and will be picked up by half the world's press. Probably covering old ground here but would a "lowly operative in Hyderabad" really be responsible for a CA7 software upgrade? Those roles are batch admin, not sysprog. Of course I've no access to the real incident report but it takes real concerted effort by a batch administrator to remove all jobs from a CA7 queue, assuming that's what happened. If the schedule definitions are missing the batch can't run. I'd love to see that incident report but the whole offshore thing just doesn't seem to ring true. I can envisage a scenario where an op cancelled a failed job in CA7 instead of doing a force complete. A cancel doesn't satisfy job requirements, it removes the whole thing and subsequent actions from a schedule. Christ knows.
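
    To make the cancel versus force-complete distinction concrete, here is a toy Python model of a dependency queue. It is only a sketch of the behaviour described above, not CA7's actual interface or internals; the class ToyQueue, its methods and the job names are all hypothetical.

```python
class ToyQueue:
    """Toy dependency queue. NOT CA7: names and behaviour are illustrative."""

    def __init__(self, deps):
        # deps maps a job name to the set of jobs that must finish first
        self.waiting = {job: set(pre) for job, pre in deps.items()}
        self.done = set()

    def runnable(self):
        # Jobs whose requirements have all been satisfied
        return sorted(job for job, pre in self.waiting.items() if not pre)

    def force_complete(self, job):
        # Mark the job as done and satisfy that requirement for successors
        del self.waiting[job]
        self.done.add(job)
        for pre in self.waiting.values():
            pre.discard(job)

    def cancel(self, job):
        # Remove the job without satisfying anything; in this toy model
        # every job that depended on it is removed from the queue as well
        del self.waiting[job]
        dependents = [j for j, pre in self.waiting.items() if job in pre]
        for dependent in dependents:
            if dependent in self.waiting:
                self.cancel(dependent)


# Hypothetical three-step nightly batch: extract, then post, then statements
DEPS = {"extract": set(), "post": {"extract"}, "statements": {"post"}}
```

    With these assumptions, force-completing "extract" leaves "post" runnable, whereas cancelling "extract" empties the queue, so nothing downstream ever runs.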

  81. johnbandettini
    Happy

    As a recent ex-RBS staff member, I found this so easy to see coming. They have moved something like 80% of some IT functions to India. The staff there are mostly inexperienced and not familiar with the bank's processes. This is not the first problem originating in India (just the highest profile one) and it certainly won't be the last such problem.

    When all the job losses were announced, along with the plan to move so many jobs to India, the head of IT was asked what the backup plan was in case it did not work. His arrogant reply was 'There is no backup plan because it will work'.

    Nice work so far guys.

  82. Alf Tonks

    Blaming outsourcing is simplistic but...

    I've worked in IT for 15+ years now, starting as an analyst programmer…moving onto Java, luckily coinciding with the dot com boom times, enjoying the 'wild-west' approach when JFDI was probably truly at its height (e.g. bouncing the live app server not giving two hoots to any customers using the service at the time, copying .class files into the deployment directory etc..)

    As the dot com boom imploded (btw is the f*ckedcompany.com website still going?) I was then present in the whole protracted move to 'outsourcing' where over a couple of years it went from highly productive 'in-house dev', i.e. the ability to turn things around in an afternoon…to the other end of the spectrum i.e. 'yes you can update the name on that button in the next governance cycle in 6 months time' (a real example).

    I think it's simplistic to blame 'outsourcing' for everything, but anyone who's been involved in it will usually mention the additional overheads it incurs that seem to get over-looked (e.g. the need to document everything to the nth detail, otherwise you'll 'get what you asked for' and common-sense won't get a look in.)

    I've also been in what could be referred to as the half-way house of outsourced hell..this is where I think the whole 'outsourcing' model starts to take the piss….this is where in-house people are tasked with defining solutions, but are reliant on the suppliers to provide the knowledge of the systems…as an analogy it's like trying to assemble an Ikea wardrobe with no instructions and half the bits missing. When suppliers have 21 days to even respond to a query you can imagine the level of productivity that ensues...

    I live in hope that those IT people who have 'come up through the trenches' will one day be in positions of power, and you won't have to justify/explain everything endlessly, producing 1-pagers and ppt slides for IT directors to give their stamp of approval to. What's the quote I saw the other day "for those who know, no explanation is necessary…for those who don't know, no explanation is possible". Something like that anyway.

  83. heyrick Silver badge

    Should I bother writing?

    With nearly 300 messages, is anybody reading?

    Anyway, my take on this is they are lining up to fire some hapless poor performer, "lessons will be learned" etc.

    For this sort of situation to carry on like this, it's the directors who should be out on the jobless queue. IMHO.

    1. Anonymous Coward
      Anonymous Coward

      Re: Should I bother writing?

      I doubt many are still reading.

      My two pence? I was on the mainframe @ RBS for 2 yrs, used CA7 a lot but wasn't responsible for it, probably know the guys involved in this but I'm not daring to ask just yet!

      I got offshored. I used to laugh when managers came back from interviews in absolute shock at what RBSI or whoever had put forward for a role. They didn't know the basics, but management eventually got told to relax their criteria, otherwise they'd never be able to hire anyone and therefore make the UK guys redundant.

      It's totally true about the high churn, and the 6 month cycle is fairly well known.

      I'm surprised it's taken this long for something so massive to go wrong to be honest.

      1. Anonymous Coward
        Anonymous Coward

        Re: Should I bother writing?

        Churn!

        24 of us on shift, at a different well known bank. Rarely did anyone ever leave.

        It's a culture which breeds experience, which in turn knows instinctively just how to deal with batch processing cock-ups.

  84. Anonymous Coward
    Anonymous Coward

    Telegraph credits TheRegister unlike Grauniad

    Nice to see MSM reads elReg

    "According to technology website The Register, at least some of the team responsible for the error were recruited in India following redundancies in the department in the UK."

    http://www.telegraph.co.uk/finance/newsbysector/banksandfinance/9358252/RBS-computer-failure-caused-by-inexperienced-operative-in-India.html

    Daily Mail, Sky and Telegraph credit TheRegister while Guardian "Investigation" copy & paste doesn't. Maybe if they stopped paying Polly Toynbee they could afford to do their own investigations.

    Anna Leach, are you pursuing GMG for copyright infringement?

  85. Jay 11
    Alert

    I wonder?

    If it is a batch file problem is it the same kind that had caused trouble for Paypal in the past with their RBS backed prepay debit card?

  86. Anonymous Coward
    Anonymous Coward

    The view from India of the same event!

    Hi,

    I'm an Ulster Bank customer in Republic of Ireland.

    Not surprisingly, there is a different spin being put on the same story in India:

    http://timesofindia.indiatimes.com/business/india-business/Offshoring-blamed-for-RBS-onshore-glitch/articleshow/14451794.cms

    The Times of India don't mention that bank customers cannot access their account balances / transaction history for over a week - this is surely the reason for the upset and not that Obama wants to get elected.

    AC as I am involved in interactions with India offshored functions

  87. Anonymous Coward
    Anonymous Coward

    Hester is lying imo

    Pains me to say this but how can he categorically state it is nothing to do with taking jobs out to India? I have totally lost confidence in the man - I would have respected him if he had come clean and said it was down to Indian resource.

    Yes, the same human error could have been made by somebody in the UK but far less likely due to 1000+ years of experience.

    When politicians get caught lying they are forced to quit? Will this mean the same for Hester?

    Whatever, the Director of Technology must go. It is solely his responsibility.

    One last comment. Can all please refrain from talking about outsourcing. The issue is offshoring. The jobs have not been outsourced - the jobs are performed by RBS employees based in India

  88. Innocent question
    Coffee/keyboard

    Flight to safety

    As an interested non-IT bystander, safe ;-) in the arms of LTSB and Intelligent Finance, is there any bank whose systems are sound? And don't have appalling prehistoric legacy problems?

    For example the new Virgin Bank - will it have brand spanking new systems or just piggyback on the hoary old systems of the big boys?

  89. adam payne Silver badge

    So this whole mess was caused by an "inexperienced operative".

    Why was a so-called "inexperienced operative" doing an upgrade?

    Shouldn't upgrades to important systems be done by experienced people?

