Windows and Mac only ...
Windows 7, 8, Vista, XP, Mac OS Snow Leopard
Who here has a disaster recovery plan? OK, that's most of you. How many of you have a disaster recovery plan that isn't "panic"? That's a more reasonable number. How many of you have disaster recovery plans that you have actually tested? Excellent! This is encouraging. Now how many of you could implement the full plan before …
A "disaster" could involve staff, too.
For example, what if the canteen serves a dodgy lunch and all your network admins are off sick for 2 or 3 days?
How about if your star DBA leaves and takes his/her/its sidekick to the new firm ... and another DBA starts maternity leave ... and the last one, sick of having to do the work of 4 people, has a nervous breakdown? You can't train up replacements in the blink of an eye - and training them takes time away from doing the job itself.
As with most problems that actually bite companies in the arse, it's not the foreseen situations that are the problem: they are the ones that will have contingency plans. It's often the ones we are blind to because they are so familiar that we can't even see them.
Not just a technical problem
Thank you - you just highlighted the problem with about 8 out of 10 Business Continuity Management plans I've had my hands on. There are 3 main reasons for BCM that does not fulfil BS 25999:
- It's written by IT and thus only deals with IT. No handling of staff, media, buildings, business processes (looong list of "others" omitted).
- It's been written by consultancies at great costs and thus contains 100+ scenarios from which an inexpert operator has to choose in a time of extreme stress
- Nobody has actually ever tried to execute any of the recovery processes for real.
I've had BCM plans in my hand that were in my opinion bordering on criminal neglect, written by Big Name consultancies.
Back to theme: someone smart once said to me that he didn't want backup at all. Not interested. What he wanted was RESTORE, which is actually very correct. You can't have the one without the other, but a backup you cannot restore is pointless, and that restore needs to be tested and well exercised. Until you have a successful restore, you don't actually have a backup IMHO.
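For what it's worth, the "test the restore" part is easy to automate at the file level. Here is a minimal sketch in Python, assuming you have already restored a backup into a scratch directory using whatever tool you normally use. The function names `tree_digests` and `verify_restore` are made up for illustration and do not come from any backup product. It only compares file contents, so it says nothing about permissions, database consistency, or whether the application actually starts.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes() loads the whole file; hash in chunks for very large files.
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def verify_restore(source: Path, restored: Path) -> list[str]:
    """Return a sorted list of problems; an empty list means the trees match."""
    src = tree_digests(source)
    dst = tree_digests(restored)
    problems = []
    for name in sorted(src.keys() | dst.keys()):
        if name not in dst:
            problems.append(f"missing: {name}")
        elif name not in src:
            problems.append(f"unexpected: {name}")
        elif src[name] != dst[name]:
            problems.append(f"corrupt: {name}")
    return problems
```

Run something like `verify_restore(Path("/data"), Path("/scratch/restore"))` (both paths are placeholders) on a schedule and alert if the list is non-empty. If the source changes between backup and check, compare against digests recorded at backup time instead of the live tree.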
>> - It's written by IT and thus only deals with IT. No handling of staff, media, buildings, business processes (looong list of "others" omitted).
Been on the other end of that, as a techie in IT, having management demand that I "write disaster recovery plans". I knew enough to know that I didn't know much, and was lucky enough to get on a good course at a "last minute, got empty seats, and the course works best when full" discount. After the course I now knew that I didn't have half the information needed, and also knew that my chances of getting buy-in from management were nil - confirmed by the less than helpful responses to my requests for information such as recovery time objectives.
And I expressed my "concerns" about the situation.
And did I mention that this was to be a zero-budget exercise - so even if I came up with a 100% absolute requirement for something, there was no budget for it.
A few months later we had a fire alarm in the factory - false alarm, but we were still stood outside in the rain. I don't think it was received too well when I asked of a senior manager "so if this were a real fire, and we weren't going back inside today, how would you propose to get people home (or at least out of the rain) given that their car keys will be in the office?"
We never did get DR plans. We had plans to cover IT, but no DR plans. It was all forgotten before long - like most "buzzword bingo" requirements thrown at us by insurance inspectors or auditors.
Many backups are either on site or at a nearby site. But serious weather problems and electrical blackouts can affect a much larger area, so an easy cloud backup to a different geographic region would be good.
Disaster recovery? What do you do if you have outsourced your call center to the Philippines and a hurricane hits? What is the average wait time?
" What do you do if you have outsourced your call center to the Philippines and a hurricane hits? What is the average wait time?"
Change your energy supply to Npower, and you'll be able to find out for yourself next year.
Great, you've got your DR plan. You've got a second site, and you've even tested several times that you can fail over to it. And it worked. There were some minor grumbles, but you've documented those and will prove the workarounds next time.
BANG! Disaster happens and this shit becomes real. Plan into action, some hard hours put in, but the business is up and running and everybody's happy.
Now, what is your DR plan?
Very few businesses have the free finance to have two DR sites, but your DR plan should include a section on what your next DR plan is, should you need to invoke this one. Simple high level steps, some contact details for alternative providers, and a summary list of what would be essential. Because you really don't want to be investigating that kind of thing while you're still managing the current disaster.
You could always try asking GCHQ / NSA for their copy :-)
This time last year (specifically, 8 Oct) Trevor was singing the praises of Pano Zero thin clients (or VDI as we are now supposed to call them):
Two weeks later Pano went out of business: "Pano Logic went out of business on 10/23/2012. Propalms, a global provider of application delivery solutions, secure SSL VPN access and virtual desktop infrastructure solutions recently acquired the rights to provide sales and support to Pano Logic's channel partner and customer base."
Does stuff like that need factoring into the DR plan? Was the article's mention of Nirvanix's fate sufficient warning of the impact of this likelihood (it's not just a possibility)?
Pano's stuff eventually got bought up and support was provided to existing clients. I think that's a far more common scenario in the IT industry than simply vanishing into the void. "An established client base" is still a valuable commodity; someone will buy it.
The bigger issue is those transition periods. What do you do between the point where a Pano Logic (or a Nirvanix) goes off the air and the point where the deals are done and their customer base is purchased during the fire sale?
Personally, I advocate the "keeping spares on the shelf" policy as much as possible. (Licenses, physical devices, etc.) How does one do that with Cloud Computing??? The only "keeping spares on the shelf" version of Cloudy computing I can see is Microsoft's CloudOS (on prem, service provider, Azure trifecta making it unlikely you'll lose access to all systems capable of hosting your infrastructure) or VMware (same deal as CloudOS, just less refined.)
Whether or not that constitutes "industry standard best practices" depends entirely on who's paying whom. Cloud providers will say such concerns are unwarranted and unnecessary. "Get off my goddamned lawn" coalface sysadmins like myself generally prefer the paranoia route. Each person will have to decide on their own.