IT managers use two terms when talking about system availability: High Availability or “HA”, for keeping systems running without any unplanned downtime; and Disaster Recovery or “DR”, for ensuring that systems are rapidly returned to operation if they fail. Some confusion has developed between these terms …
Some confusion has developed between these terms over the years.
True, and this article isn't helping.
Traditionally, High Availability (HA) has meant the same as it does today: ensuring that a localised fault, such as a server or network outage or a disk crash, is handled automatically so that a service remains available or is restored very rapidly.
Disaster Recovery (DR), as the name implies, is a solution to a more widespread outage, perhaps a fire or a flood, which takes a whole datacentre off-air. It may not be automated, and recovery times are often somewhat longer, especially when it is integrated as part of a general Business Continuity plan which covers more than just the IT aspects of a disaster.
I would say that there is much more confusion between the terms Fault Tolerant (FT) and Highly Available (HA), which may be what this article is really considering.
HA & DR?
From my perspective, it's online (local memory ... fast ... a workstation), nearline (available over the network, without human intervention ... sometimes slowish ... "the web") & offline (tape that isn't physically available to the Memorex robot and needs human intervention ... "the cloud") ... but then I come from a real hardware background.
In other news ... March of 2008? Is freeformdynamics getting further & further behind? Perhaps they should get their heads out of the Clouds and invest in online storage ...
Well, I quite liked the article, don't care so much about whether it's HA or DR as long as it's UP...
Having been partially DOWN myself recently, I think this is a very important topic.... I especially like the idea of a "business service catalogue" which details costs/benefits of HA solutions, for a particular input.
I also like the idea of a root cause log, not mentioned in the article but implied by the very nice bar chart. I plan on implementing both as database tables in my BIS.
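For what it's worth, here is a rough sketch of how those two tables might look, using Python's built-in sqlite3 module. Everything in it is an assumption for illustration only: the table names (`service_catalogue`, `root_cause_log`), the columns, and the sample rows are made up, not taken from the article or from any real BIS. The summary query just totals downtime per root cause, which is the sort of figure a bar chart like the article's could be drawn from.

```python
import sqlite3


def create_schema(conn):
    """Create the two hypothetical tables: a service catalogue and a root cause log."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS service_catalogue (
            service_id             INTEGER PRIMARY KEY,
            name                   TEXT NOT NULL UNIQUE,
            ha_solution            TEXT,              -- e.g. "clustered servers"
            annual_cost            REAL NOT NULL DEFAULT 0,
            downtime_cost_per_hour REAL NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS root_cause_log (
            incident_id    INTEGER PRIMARY KEY,
            service_id     INTEGER NOT NULL
                           REFERENCES service_catalogue(service_id),
            root_cause     TEXT NOT NULL,
            downtime_hours REAL NOT NULL
        );
        """
    )


def downtime_by_cause(conn):
    """Return (root_cause, total_hours) rows, largest total first."""
    return conn.execute(
        """
        SELECT root_cause, SUM(downtime_hours) AS total_hours
        FROM root_cause_log
        GROUP BY root_cause
        ORDER BY total_hours DESC, root_cause
        """
    ).fetchall()


# Invented sample data, purely to exercise the schema.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute(
    "INSERT INTO service_catalogue "
    "(name, ha_solution, annual_cost, downtime_cost_per_hour) "
    "VALUES (?, ?, ?, ?)",
    ("email", "clustered servers", 10000.0, 500.0),
)
conn.executemany(
    "INSERT INTO root_cause_log (service_id, root_cause, downtime_hours) "
    "VALUES (?, ?, ?)",
    [(1, "disk failure", 2.0), (1, "network outage", 0.5), (1, "disk failure", 1.5)],
)

for cause, hours in downtime_by_cause(conn):
    print(f"{cause}: {hours} h")
```

Keeping the log in its own table, keyed to the catalogue, means the cost columns can later be joined in to weigh what an outage cost against what the HA solution costs per year.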
What about FT and BC?
There's also fault tolerance (FT), which is a very high level of redundancy, such that software is unaware of any failure. I'm thinking especially of Tandem. And then there's business continuity (BC), which is like DR but focuses less on the disastrous aspects. I'm thinking nearline storage or data replication. The lines can blur, but you really need prudent amounts of both. You don't want to fail over to the bunker just because one disk goes bad. And your RAID6 array doesn't help when it's under water.