How your data loss prevention plan is killing your business

Are enterprises asking the right questions about data and its storage? This may seem an unusual question, but it should not be treated as an abstract issue. If they are not, or if appropriate questions produce unexpected answers, then there may be distinct possibilities for substantial future savings, as well as releasing IT …

COMMENTS

  1. jonathanb Silver badge

    It is not just about numbers here

    Taking your example, what use is a 3 month old copy of the database?

    If you lose your database due to hardware failure or environmental problem (eg fire, flood, theft), you want to restore to the most recent copy of the data, and as quickly as possible. Ideally you would have a real-time offsite mirror of the system that takes over immediately.

    If you lose data due to software issues such as data corruption, a failed update or security issues, then you want to roll back to the most recent copy before that problem arose, and hopefully it isn't going to take you three months to notice there is a problem.

    The system described in this article doesn't have many recent or real-time copies of the data, so it isn't actually very good, but you do have lots of old copies that are pretty much useless other than as poor substitutes for newer versions.

    1. Duncan Macdonald

      Re: It is not just about numbers here

      Old backups can be vital. A coding or user error that corrupts or deletes some of the data may not be noticed for quite some time - it might only be noticed when a year-end routine is run. Being able to retrieve (with effort) the missing data can outweigh the costs of the backup regime.

      When I was a system administrator, I tended to keep additional backups outside the normal cycle. On one occasion a private 4 year old tape backup held the last remaining copy of a vital piece of source code.

      Too many backups is expensive - too few is courting disaster.

      If a computer system is being removed - always get a full backup before it goes - if you do not then you WILL regret it.

      1. Alan Brown Silver badge

        Re: It is not just about numbers here

        "Old backups can be vital."

        AMEN

        Here's an example: In 1999 a NEAX61M switch was rebooted to allow y2k updates to take effect.

        It failed at reboot, and the failure cascaded to 3 other switches, putting 200,000 phone lines out of action as well as a major international switch, resulting in subscribers throughout the country being unable to make long distance and international calls.

        BACKUPS WERE CORRUPT. It turned out that something had scribbled over memory more than a year previously and what was backed up was completely corrupt information.

        Techs had to go back more than 2 years before they found a working backup - that was OK, the switch booted up - BUT 2 years means a LOT of changes (people move houses, etc etc etc), so the next step was to replay all database changes - which thankfully had been backed up separately.

        It took more than a DAY before anyone could make any phone calls at all in Palmerston North - but there were still thousands of people who found they couldn't make calls.

        It took 6 _WEEKS_ to replay those database updates. In that period various people had no dial tone, the wrong number, etc etc etc. A small ISP had more than half its lines out of action for most of that period.

        Lesson 1: Older backups can be vital

        Lesson 2: A backup is no fucking good if it hasn't been tested.

        240 copies of small pieces of data such as mail folders is immaterial. 240 copies of a 1TB filesystem is another matter.

        I admin backups for about 1PB of data. It's all about maintaining a balance between COSTS (and frankly tapes are the cheapest part of the whole shebang, so I don't really care if I use a bunch of extra LTO5s at 14 quid a pop), resilience, complying with data retention laws, keeping archival copies and being able to run the backups in a reasonable time window. (Some areas are backed up 7-10 times per day, others are only touched once a week.)

        A good backup system has a backing database so an admin can zero in on a given file at a given point in time in 2-3 minutes. That database is also a BRILLIANT intrusion/modification detection system - if any aspect of a file changes, its SHA512 signature changes and that means it gets backed up again.

        It can tell you how many copies of any given file are backed up.

        It also doesn't miss things if a file tree is moved - the number of backup systems which will detect and handle this correctly can be counted on one hand - 2 are free and the other 2 cost in excess of £30k.
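
        To make that concrete, here is a minimal Python sketch of the idea - a hash-indexed catalogue, not any particular product, with all names invented: a changed file gets a new SHA-512 signature and so gets backed up again, while a moved or renamed tree maps onto content the catalogue already knows about.

        import hashlib, os, time

        catalogue = {}   # sha512 digest -> where that content was first seen (stand-in for the content store)
        history = []     # (timestamp, path, digest) rows, for point-in-time lookups and copy counts

        def sha512_of(path, bufsize=1 << 20):
            h = hashlib.sha512()
            with open(path, "rb") as f:
                while chunk := f.read(bufsize):
                    h.update(chunk)
            return h.hexdigest()

        def scan(root):
            for dirpath, _, names in os.walk(root):
                for name in names:
                    path = os.path.join(dirpath, name)
                    digest = sha512_of(path)
                    history.append((time.time(), path, digest))
                    if digest not in catalogue:
                        catalogue[digest] = path   # new or modified content: back it up
                    # unchanged (or merely moved) content is already held under its digest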

        The vast majority of "backup" systems out there are crap - and the ones being most heavily promoted commercially certainly fall into that camp.

        1. William Phelps

          Re: It is not just about numbers here

          And, what are those two free packages which work correctly?

        2. WinHatter
          Pint

          Palmerston North Re: It is not just about numbers here

          Woahhh, you had to go back 2 years ... a backup in good order would have served you much better.

          1) Your company backed up corrupt data ... that can happen, but going 2 years without checking its consistency ... that is bad. But at least you have learnt your "Lesson 2", which makes Lesson 1 a moot point.

          2) Before doing something very sensitive, make sure you have the proper procedure and equipment at hand. Usually in HA things come in twos ... at the very least ... so test things before going commando on the production line.

          I would add

          3) Old backups may be completely useless when they have not been migrated. I've lived through such an example, where the servers went from x86 to Sparc and the application from C++ to Java ... good luck with finding a 20 year old server in working condition ... it may prove extremely costly when data-retentive lawyers kick in.

          On that point lucky for you the backups agreed with the current firmware/hardware on the switch.

          1. Anonymous Coward
            Anonymous Coward

            Re: Palmerston North It is not just about numbers here

            "good luck with finding a 20 year old server in working condition "

            Pretty much any past or present VMS shop ought to be able to do that (maybe real hardware, maybe emulated), but if the IT department has "standardised" on Wintel then yes it would be a challenge.

    2. mi1400

      Re: It is not just about numbers here

      A 1 year old backup is sufficient ... beyond that it has its benefits, but so does being paranoid ... Google, MS, HP etc all declare their profit/loss/revenues for the year and move on ... so business criticality is about 1.1 years...

  2. Blofeld's Cat

    Hmm...

    Perhaps the first question that should be asked is: "Do we need to store this data at all?"

  3. Anonymous Coward
    Anonymous Coward

    "Taking your example, what use is a 3 month old copy of the database?"

    If it's the only good copy you have of your company's critical data, it can be very useful. More useful than having nothing.

    1. Anonymous Coward
      Anonymous Coward

      So the usage case is: "this old data may be valuable if we screwed up and all our more recent backups were corrupt". Which in turn implies backups are not being test-restored; and/or the application data has hidden corruption which a cursory test restore does not uncover, and for which the only feasible solution is to pull old data from an old backup rather than regenerate the data from other sources.

      You can argue that simply archiving tons of data is cheaper than spending staff time on testing restores. However, this is a risky strategy, as there's no guarantee that *any* of the backups are usable, or they may be so old as to have no business benefit.

      I guess the ideal strategy would be: (1) do test restores at fixed intervals, with thorough testing of completeness and usability; (2) keep backups since the most recent test restore.
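
      As a rough illustration of that rule in Python (the dates are made up, and obviously the real work is in the test restores themselves):

      from datetime import date

      backups = [  # (date taken, passed a verified test restore?)
          (date(2013, 3, 1), True),
          (date(2013, 4, 1), False),
          (date(2013, 5, 1), True),
          (date(2013, 6, 1), False),
      ]

      last_verified = max(d for d, ok in backups if ok)
      keep = [d for d, ok in backups if d >= last_verified]
      print(keep)  # everything since the last known-good restore stays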

  4. jabuzz

    Answer: use better backup software

    If you want to use joke backup software then yes, you are going to be storing loads and loads of copies of the data at silly multiples. Back in the real world you could simply select better backup software - specifically TSM - and ONLY store a primary and secondary copy of every version of a file, radically reducing the data retention multiple while still having "virtual" backups going back as far as you want. Then if the file is a database or similar, where the file changes every day, you just dedupe it.
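
    For anyone unfamiliar with the "incremental forever" model, here is a toy Python sketch of the version/copy arithmetic - not TSM itself, and the numbers are invented: each file keeps its last few versions, and each stored version exists as a primary plus a secondary copy, rather than full backups piling up on a fixed multiple.

    from collections import defaultdict, deque

    VERSIONS_TO_KEEP = 5
    COPIES_PER_VERSION = 2   # primary + secondary copy pool

    store = defaultdict(lambda: deque(maxlen=VERSIONS_TO_KEEP))  # path -> recent version ids

    def ingest(path, version_id):
        """Record a new version; versions beyond the retention limit fall off automatically."""
        store[path].append(version_id)

    def physical_copies(path):
        """How many physical copies of this file the backup pools actually hold."""
        return len(store[path]) * COPIES_PER_VERSION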

    Whoever wrote this article is either not a storage administrator, or an uninformed twit who should not be let near any storage again.

  5. Number6

    Use of Backups

    Understanding what might be needed and why is also useful.

    There's the obvious disaster scenario, total equipment loss at the primary site, in which case you'd want a complete copy of everything from the day before (assuming daily granularity).

    Then there's a partial data loss, where one or more disks fail, but as you can't predict which data might be lost, you'd still like everything from yesterday to be available.

    Then you get onto audit trails and archival storage. I'm sure most IT people want to be elsewhere (except the BOFH) when someone asks if it's possible to recover a file from last week/month because they've only just realised that's when they broke it. Then there's the need to haul up old versions of documents for other reasons - software people are hopefully already using version control and can reconstruct any version of a file, but other parts of the company are more likely to have just overwritten a previous version.

    If it's needed for tax or accounting purposes then it should have been properly archived and so probably isn't taking up 240x the storage.

  6. Anonymous Coward
    Anonymous Coward

    It's not about backups.

    It's about restores.

    How quickly do you want to have your data back so your operations can resume?

    How out of date can you afford your data to be at that point?

    How confident do you want to be that this week's backups are actually usable?

    How confident do you want to be that if "an accident" happens to a particular backup volume (tape, disk, whatever) you will still be able to recover?

    Etc.

    This on the evening that Natwest/RBS card transactions went offline again. Perhaps they'll be testing their restores right now.

    [Archival for legal purposes is a different subject which I happily admit to being ignorant of. Just like RBS/Natwest are ignorant of IT operations in general]

  7. Drummer Boy

    Start with the end in mind

    This is what happens when you treat all data as equal. We run highly transactional DBs where the entire source data on a multi-terabyte system changes every 3 months. Holding DB backups beyond that point has no purpose. On top of that we store the source (input) data, so we can restore most of the DB by reloading source data. Any other data (username/password tables and the like) is backed up and stored under a different retention scheme.

    For me the key is to define your backup regime when creating or designing the app, or at install time, with recovery in mind, and to be ruthless in removing data that has no further purpose.
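
    In practice that boils down to attaching a retention policy to each class of data at design time rather than treating it all the same - something like this Python sketch, where the class names and periods are invented for illustration:

    RETENTION_DAYS = {
        "transactional_db": 90,      # source data fully turns over every ~3 months
        "source_input_feeds": 400,   # kept so the DB can be rebuilt by reloading
        "credentials_tables": 2555,  # ~7 years, held under a separate scheme
    }

    def expired(data_class, age_days):
        """True once a backup of this class is past its retention period and can be culled."""
        return age_days > RETENTION_DAYS.get(data_class, 3650)  # unknown classes: keep ~10 years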

    1. Duncan Macdonald

      Re: Start with the end in mind

      In the UK, tax authorities can demand to see financial records several years old. If your database holds financial records then you might need to keep old copies for audit purposes even if they are of no other use to the business.

      In one organisation that I worked for, one full backup each month was kept forever to provide the permanent audit capability. (This was specified as a requirement by our major customer.)

      1. jonathanb Silver badge

        Re: Start with the end in mind

        Yes, but for that you are probably better off keeping a copy of the transaction report or similar in plain text or PDF format. It won't be of any use for restoring the data back to the system, but that is not the purpose of the data; it will be easier to view manually, even if you switch to a new computer system in 5 years' time that stores things in a completely different format.

  8. thondwe

    Bake it into the OS?

    Maybe it's time to move away from a backup system making copies, and bake backup into the OS? E.g. the OS writes all new blocks to clean disk, so nothing is lost as the systems run, and any point in time is recoverable. Tools can then be used to trim blocks out based on filters - e.g. everything older than a certain date - or to replicate sets of blocks to other locations to create images for specific dates. Sets of blocks can be indexed to aid finding data. Automatic tiering can move old blocks from fast storage to slower/cloud-based storage, etc. Factor in a deduplication mechanism and each unique block of data could easily be stored very few times, depending on your personal level of paranoia?
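
    For what it's worth, the core of that idea fits in a few lines of Python - a toy content-addressed block store, nowhere near a real filesystem: blocks are written once under their hash, a "point in time" is just a list of block hashes, and deduplication falls out for free.

    import hashlib

    BLOCK_SIZE = 4096
    blocks = {}     # hash -> block data, written once and never overwritten
    snapshots = {}  # snapshot name -> ordered list of block hashes

    def write_snapshot(name, data):
        hashes = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            h = hashlib.sha256(block).hexdigest()
            blocks.setdefault(h, block)   # identical blocks are only ever stored once
            hashes.append(h)
        snapshots[name] = hashes

    def read_snapshot(name):
        return b"".join(blocks[h] for h in snapshots[name])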

    (Is this worth a patent or has it already been done?!)

    1. Richard 12 Silver badge

      Yes, Windows XP-E had the Enhanced Write Filter

      This basically gave you manually-triggered points where the filesystem would only note changes at the block level instead of overwriting, so you could roll the entire partition back to any previous restore point.

      Unfortunately this seems to have vanished from Windows 7 Embedded, which is most annoying.

      Windows 7&8 do have the ability to maintain "shadow copies" of files, so you can roll any file back this way (if enabled!)

      More user-friendly I suppose, but not so useful for embedded industrial.

    2. GraemeMRoss
      Coat

      Re: Bake it into the OS?

      Too late .... it is unpatentable because you mentioned it in a public forum without first getting "non-disclosure" waivers from everyone who reads it.

      However this won't stop a large company patenting it and then defending the patent vigorously. It only seems to be the little guys who can't get patents!

    3. TheDataRecoverer

      Re: Bake it into the OS?

      Sounds like regular snapshots you're describing there: take them instantly, replicate them super-efficiently to get an offsite copy, keep a few months or potentially years at low storage cost...

      Simple to restore, whether the data is file or block based, integrate them with apps where appropriate, take instant clones (regardless of capacity - 10TB as easily as 10GB!)

      It absolutely is all about the restore.

      <Disclosure: NetApp snapshot lover!>

  9. Anonymous Coward
    Anonymous Coward

    De-duping the data within

    I am unclear about how de-dupe works. Does it keep a complete backup of each file that has changed, or only the differences in data within a file? (So, for instance, you could have my_finances.xls backed up multiple times, with each copy from a different time/date containing slightly different data, and de-dupe only keeps a record of those changes rather than the entire file each time.) The latter would make far more sense.

    Anonymous because I don't want to be labelled clueless (even though I am).

    1. Anonymous Coward
      Anonymous Coward

      Re: De-duping the data within

      The only dumb question is the one you didn't ask when you weren't sure of the answer!

      The answer is yes, dependent upon the technology in question. Generally, source-side de-duplication will send only the changed blocks in data objects (files as well as databases) to the backup solution, which must have an index to understand how to reconstruct to a point-in-time should the local index be lost.

      Storage-side or backup-server-side de-duplication tends to work at a global level; the storage also needs a mechanism to reconstruct a data object (or indeed several). I'd say database, but as storage mechanisms generally don't use industry-standard databases I'll avoid the DB word to avoid offending DBAs.
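
      To put the my_finances.xls example in code - a loose Python sketch of source-side de-dupe, not any vendor's implementation: the client hashes fixed-size blocks, sends only blocks the target hasn't already got, and keeps an index of hashes from which any point-in-time copy of the file can be rebuilt.

      import hashlib

      BLOCK = 64 * 1024
      server_blocks = set()   # hashes the backup target already holds

      def backup(data):
          index, sent = [], 0
          for i in range(0, len(data), BLOCK):
              h = hashlib.sha256(data[i:i + BLOCK]).hexdigest()
              index.append(h)
              if h not in server_blocks:   # unchanged blocks are not sent again
                  server_blocks.add(h)
                  sent += 1
          return index, sent   # 'index' reconstructs this version; 'sent' shows the saving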

This topic is closed for new posts.
