Have a Plan A, and Plan B – just don't go down with the ship

When planning for disaster recovery, our natural inclination is to focus on the technical design. We work to strike the perfect balance between controlling infrastructure spend and the required capacity. Technical considerations are of course paramount – replication schedules based on delta changes and available bandwidth, the …

  1. Anonymous Coward
    Anonymous Coward

    other drivers for a plan

    We need to have a business continuity plan in place to meet our regulatory requirements as a pharmaceutical manufacturer.

    In effect, it's a legal requirement for our regulatory affairs manager to be able to keep working should our office lose normal comms, IT etc.

    To all intents and purposes, it's our IT disaster plan, though I must admit I haven't documented it thoroughly. I won't comment on whether we would have any plan if there wasn't the threat of an inspection by the regulator.

    1. Anonymous Coward
      Anonymous Coward

      Re: other drivers for a plan

      And what's to stop someone who realizes they could have such a capability in the event of a disaster from contriving a disaster so as to obtain that capability? A kind of disaster-oriented industrial espionage?

      1. auburnman

        Re: other drivers for a plan

        Jail time? Pure industrial espionage would largely be a civil matter or, at worst, a tricky-to-prove "white collar" crime like fraud. Engineering a disaster that triggers a recovery plan, however, would involve some fairly serious criminal damage at the very least.

        And anyone who was mental enough to try something like that would soon find they don't have the time or the privacy to go rifling through data as some fairly important eyes would be on them asking when the systems would be back.

        1. Anonymous Coward
          Anonymous Coward

          Re: other drivers for a plan

          "And anyone who was mental enough to try something like that would soon find they don't have the time or the privacy to go rifling through data as some fairly important eyes would be on them asking when the systems would be back."

          So you play it cagey. You don't pilfer the stuff right away; you just insert some extremely covert stuff while you have the clearance and then fix things tout de suite. After all, you don't want to raise alarm bells. Once you've installed the hole in the wall, you can sneak stuff out a bit at a time, most likely disguised as legitimate activity or behind an encrypted tunnel.

          1. Vic

            Re: other drivers for a plan

            "You don't pilfer the stuff right away; just insert some extremely covert stuff while you have the clearance and then fix things tout de suite."

            There are assorted ways to log all shell activity. That means that the admin's operations can be examined afterwards - both to guard against such mischief and also to make sure that the instructions given were followed correctly.

            Vic.
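            A minimal sketch of the kind of logging Vic describes: wrap each admin command in an audit logger so the session can be reviewed afterwards. The `audited_run` helper and log file name are illustrative assumptions, not a real tool; production setups would more likely use auditd, sudo logging, or `script`.

```python
import logging
import shlex
import subprocess

# Append every command and its exit status to an audit log so the
# admin's operations can be examined after the fact.
logging.basicConfig(filename="admin_audit.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")

def audited_run(cmd: str) -> subprocess.CompletedProcess:
    """Log a shell command before and after execution."""
    logging.info("RUN  %s", cmd)
    result = subprocess.run(shlex.split(cmd), capture_output=True, text=True)
    logging.info("EXIT %d %s", result.returncode, cmd)
    return result

result = audited_run("echo restore-complete")
print(result.stdout.strip())
```

            The point is the same either way: the log is written before the command runs, so even an aborted session leaves a trail.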

      2. Anonymous Coward
        Anonymous Coward

        Re: other drivers for a plan

        The Reg Affairs Manager has to be able to handle medical enquiries and/or quality issues about our products that are on the market and communicate with the regulator and our customers (the wholesale/distribution network).

        Nothing novel about the products, so no competitor advantage to be gained in this case. But it's an interesting conjecture.

  2. BenBell
    Pint

    Cheers

    This is now doing the rounds at the office - Thanks for this :)

    1. chivo243 Silver badge

      Re: Cheers

      A colleague and I just went through a Risk Inventory Meeting. They asked about our DR Plan, and I batted the birdie to the budget guys...

      Not my responsibility.... Yet?

  3. Velv Silver badge
    Boffin

    Don't forget about the less obvious mandatory items: people need to eat, sleep and poop.

    What facilities are available for your staff while they dig you out of a hole?

    How do they get to the site? Is there public transport or parking? Have they got cars? Do you need to hire cars, or run buses from somewhere else? Do you need to provide nearby accommodation? Is there food on site or nearby? Is it open 24/7? Do you need to go to Tesco every day and buy 2 dozen mixed sandwiches? Is there somewhere away from the work environment they can chill for 10 minutes? And are there enough toilets? Sounds crazy to consider, but if you turn up at your DR site with a dozen techies and there's only one outdoor portaloo, things are going to get messy.

    1. 27escape
      Flame

      Adding in

      A list of people who should be on site fixing the issues: some of the workers, sure, but not all of them, and certainly none of the management getting in the way asking "is it fixed yet?"

      Had this happen a few years back: a massive fire took out most of the factory. The recovery crew was 30-odd of our 250+ staff; none of the directors were allowed on site, as it was essential staff only and they would not be contributing to the actual recovery actions.

      Flame icon, 'cos that's what happened.

  4. Anonymous Coward
    Anonymous Coward

    Plan A

    * Deny any and all responsibility. Shift blame as appropriate.

    1. Fatman Silver badge
      Joke

      Re: Plan A

      <quote>* Deny any and all responsibility. Shift blame as appropriate.</quote>

      Taken from the Jim The Boss School of Damagement.

  5. Anonymous Coward
    Anonymous Coward

    Plan B

    * Have my CV always up to date.

  6. Anonymous Coward
    Anonymous Coward

    Plan C

    * Go down with the ship. :-(

    1. Fatman Silver badge
      Joke

      Re: Plan C

      <quote>* Go down with the ship. :-(</quote>

      NO, NO, NO

      Chain your damager to the rail, and jump ship before it goes under.

  7. Ironclad

    Active-Active

    Worth considering the use of an 'active-active' system, where two duplicated systems are both active, with deltas applied between them to keep the databases in sync.

    Quite common in the payments industry.

    If you lose one system at least 50% of your terminals/access points are still working and your database is fully intact and up-to-date without needing any manual intervention.

    You can then manually (or automatically) swap the remaining connections to the still running site.

    You will still need well-documented procedures and processes, but it can take some of the panic out of recovering.
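    The active-active idea above can be sketched in miniature. This toy model (class names and the in-memory dict "databases" are made up for illustration; a real payments system is nothing this simple) just shows each delta being applied to both copies, so either site alone still holds a complete, current database:

```python
class Site:
    """One of the two duplicated systems."""
    def __init__(self, name):
        self.name = name
        self.db = {}       # stand-in for the real database
        self.alive = True

    def apply_delta(self, key, value):
        self.db[key] = value

class ActiveActive:
    """Applies every change to all surviving sites."""
    def __init__(self, a, b):
        self.sites = [a, b]

    def write(self, key, value):
        # The delta goes to every site that is still up,
        # so the copies never drift apart.
        for site in self.sites:
            if site.alive:
                site.apply_delta(key, value)

    def read(self, key):
        # Any surviving site can serve the read.
        for site in self.sites:
            if site.alive:
                return site.db.get(key)
        raise RuntimeError("no surviving site")

a, b = Site("A"), Site("B")
cluster = ActiveActive(a, b)
cluster.write("txn1", "approved")
a.alive = False              # site A is lost in the "disaster"
print(cluster.read("txn1"))  # site B still has the full database
```

    No restore step was needed after the failure: that is the attraction, at the cost of running (and paying for) both sites all the time.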

  8. allthecoolshortnamesweretaken Silver badge

    Plan 9 (from the German class)

    Just go ahead and make a plan,

    be a shining light!

    Then make a second plan;

    neither one of them will work...

    - Bert Brecht

  9. Anonymous Coward
    Anonymous Coward

    I worked for a large insurance company in the early 2000s. The site I worked at contained the mainframes that supported a significant chunk of the consumer financial product and pensions side of the business, and around 2500 staff.

    They decided to introduce the IT department to the non IT (personnel and logistics) aspects of disaster recovery by talking through a scenario where a huge storm was forecast and the site was subjected to local flooding and high winds. Power and phone lines went down and staff were unable to leave the premises due to fallen trees.

    That this was the first time the company had thought about this was obvious from the start... because the first part of the site to get wiped out was the backup tape store, which was in a small, well-secured enclosure on the edge of the site, right next to the river and surrounded by tall trees.

    There was no offsite backup... things went downhill from there!

    1. Anonymous Coward
      Anonymous Coward

      But you have to wonder what good is an offsite backup when your site only has one way in or out which is vulnerable to being cut off in the event of a disaster. How do you allow for a multisite backup when physical isolation is a real possibility?

      1. Robert Carnegie Silver badge

        If Site 1 isn't accessible, it's about the same as having it knocked out, and you have to switch to Site 2. If however you can't switch Site 1 off to disable it, consider air strikes. I think that was a Doctor Who plot more than once, in the old days, and also whichever Doctor bombed Downing Street.

  10. Paul Crawford Silver badge

    Don't forget UPS arrangements

    How many of you have pulled the Big Red Knob on the master switch to the building/campus to see what really happens when the mains fails for more than a second?

    Do the UPS hold up the machines but not the A/C systems?

    Is there enough emergency lighting and torches (in working order) to get around and do stuff like check the power outage is not one of your own breakers tripping on a now-cleared fault?

  11. jcitron

    Many years ago I worked for an insurance company which had a DR plan in place. Since I worked the Wednesday through Sunday swing shift, I got to participate more than once.

    The drills were random and could happen at any time on a weekend. They were used to test the complete recovery and uptime of the systems we were using at the time. The mainframe system had been offsite from the beginning, so all that was needed for that was a network connection.

    We sent our tapes to an off-site storage firm. Initially it was a small company, which failed the test only once and lost the contract. After that it was a much bigger company, which came in daily, including weekends, to pick up the previous night's backup tapes.

    During the tests, we would get a call that the systems were down. Being in computer operations in the computer room, we would find ourselves in a locked-down situation with no local network access, as someone had pulled the plugs on the concentrators.

    The system at the remote location was brought online once the tapes arrived and had been restored to the servers. We were then told to connect and run the batch jobs we were scheduled to run, except they were running off the remote location. Everything went as planned.

    The only glitch we had was that offsite company that messed up and didn't respond to the phone call.

    Granted, this was 25 years ago now and stuff was a lot simpler. The servers were '486 Novell servers and the tape backup was perhaps two tapes maximum at 1.2GB each.

    What's interesting is how this all went together like clockwork, and how well everything was rehearsed to ensure it did so every time. We were usually up and running within a couple of hours.

    I have always maintained good backups of everything, even at home. The important thing to remember is that hardware can always be replaced, and software can always be reinstalled from media or, these days, downloaded off the web. The data is the most important thing to back up, and without a good backup of it, you have nothing.
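    That last point is worth automating: a backup is only a backup once you've verified it. A hedged sketch of the idea (the `backup` helper and file names are made up for illustration): copy a file, then compare the copy's checksum against the original before trusting it.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    """Checksum a file so source and copy can be compared."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def backup(src: Path, dest_dir: Path) -> Path:
    """Copy a file into the backup directory and verify the copy."""
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copy2 preserves timestamps too
    if sha256(src) != sha256(dest):
        raise IOError(f"backup of {src} failed verification")
    return dest

# Demo with throwaway files in a temporary directory.
work = Path(tempfile.mkdtemp())
(work / "data.txt").write_text("the data is the most important thing")
store = work / "offsite"
store.mkdir()
copy = backup(work / "data.txt", store)
print(copy.read_text())
```

    The same verify-after-write habit scales up: the insurance company's drills above were effectively this check, performed end to end with real tapes.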

  12. smartypants

    Most important thing:

    Have management who continually ask themselves if they have the staff who can do all this plan A plan B business, and ensure that said staff can demonstrate they're capable of it.

    That's the hardest bit, because none of this huge effort has anything to do with the ordinary day-to-day running, and it takes a leap of imagination that is often sadly lacking among upper management...

    1. Fatman Silver badge
      Joke

      Re: Most important thing:

      <quote>That's the hardest bit, because none of this huge effort has anything to do with the ordinary day-to-day running, and a leap of imagination is not conducive to increasing shareholder value, the sole focus of upper management...</quote>

      FTFY!!!

  13. Oengus Silver badge

    Where is your disaster recovery/business continuity plan?

    I used to work for a major bank. We had hot standby systems, network connections, automatic fail over and everything else that was needed for the major systems. The DR/BCP plans were tested on a regular basis including running for periods of time on the backup systems then initiating controlled switch over to the primary systems.

    The problem was with some of the "lesser" systems (one inventory management system in particular comes to mind) that ran in locations other than the main data centres. We had a "disaster" and the first question was "Where is the DR plan?". The response from the sysadmin: "On my desk". Consequently we had to "wing it". Fortunately the people who ran the systems on a day-to-day basis were available, otherwise we would have been in deep SH*T.

    After that, we made sure that the DR plans for all systems were available in both the primary and backup centres (and, when it became more available, on-line) and that there were staff on site all the time in case access to the plans was required.


Biting the hand that feeds IT © 1998–2019