Hold it! Don't back up to a cloud until you've eyed up these figures

Online data vaults are everywhere. On the small storage side, we have options such as Google Drive, Dropbox, and Teamdrive. My Synology NAS, the upcoming 2012 Microsoft Server Suite and any number of virtual appliances can all back up bulk data to the cloud. The software side of things may be settled, but is this all truly …

COMMENTS

This topic is closed for new posts.
  1. Buzzword

    Just out of curiosity, what kind of small businesses generate 1TB or even 15TB of new data each month? Aside from video production work, I can't think of anything that qualifies. An uncompressed high-resolution multi-spectral aerial scan of Alberta's oilfields could get close to those figures, but every month?

    1. Infidellic_

      I agree, particularly given that any decent backup solution will merely be doing differentials.

    2. JimmyPage Silver badge

      Doesn't surprise me ...

      Storage is cheap, processing is expensive (and slow). So if you are into *big* data analysis, you'll build massive cubes to cut down on the processing needed for slice-n-dice multi-dimensional reports.

      Company I worked for could easily build a 10Gb series of cubes overnight. Each one is unique, so no differential possible.

      1. petur
        Boffin

        Re: Doesn't surprise me ...

        Indeed. And differential also doesn't work very well with VM images...

        1. Buzzword

          Re: Doesn't surprise me ...

          If I were doing big data processing, I'd be tempted to shift the whole lot to the cloud. Just have a thin local client application, while the cloud machine crunches the numbers. I'd even be tempted to have a hosted Windows 7 desktop, network latency permitting.

          Photographers and video editors are admittedly a special case. Though I didn't expect 60 GB per shoot!

          1. The Wegie

            Re: Doesn't surprise me ...

            Even a home user can generate vast amounts once you start playing around with video. I'm currently taking my recordings of this year's Tour de France and encoding them in mp4 so that my brain-dead Sony Bravia can play them off my NAS and I can have some space back on my DVR. The raw data files run at something like 160 GB!

    3. AllyourComputers

      Speaking as a small business that offers IT support in the UK, I have 2 customer sectors that generate ridiculous amounts of data, which makes online backup simply not viable.

      1. Professional Photographers - I have one client who reckons an average shoot is approx 60GB of photographs - and she had 90 appointments in the first 3 months of the year...

      2. Professional Video Producers

      Hard drives continue to grow at an almost alarming rate, yet online storage hasn't kept pace. It is a big problem for certain market sectors.

      1. Trevor_Pott Gold badge

        @allyourcomputers: I ran these numbers only with my photographer clients. The 3d render chaps and the video chaps produce so much it wasn't even worth doing the analysis. I knew the answer before I started.

        Let's not even start with the medical imaging folks, the mass spec labs, the geologists...

    4. Rampant Spaniel

      As a photographer the maths run something like this. Per billed hour shooting:

      200-500 shots depending on the length of the shoot and the type of shoot. A 90-minute wedding will have way more shots per hour than a 4-hour studio shoot. RAWs (leaving film scans aside) run about 30MB each, so let's say somewhere around 12GB of raw files.

      Edited shots (lossless compressed TIFFs) add another 5-7 GB.

      JPG deliverables add another 2GB, so near enough 20GB per hour of shooting.

      Working on 4 hours shot per day, 5 days per week (on average over the year), it's 80GB a day, 400GB a week and something like 1.7TB a month.
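
      The arithmetic above can be sketched as follows. This is only an illustration: the function name is made up, decimal units are assumed (1 TB = 1000 GB), and a month is taken as 52/12 weeks.

```python
def monthly_volume_gb(gb_per_hour=20.0, hours_per_day=4.0,
                      days_per_week=5.0, weeks_per_month=52 / 12):
    """Return (GB per day, GB per week, GB per month) for a shooting schedule.

    Defaults are the figures quoted in the comment; weeks_per_month is an
    assumption (52 weeks spread over 12 months).
    """
    per_day = gb_per_hour * hours_per_day
    per_week = per_day * days_per_week
    per_month = per_week * weeks_per_month
    return per_day, per_week, per_month


per_day, per_week, per_month = monthly_volume_gb()
print(f"{per_day:.0f} GB/day, {per_week:.0f} GB/week, "
      f"{per_month / 1000:.2f} TB/month")
```

      With those defaults the monthly figure lands at roughly 1.7 TB, in line with the estimate quoted.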

      I tried 'cloud backup' years ago and when my primary drive died they said I couldn't restore over the net and they had to send me a $300 DVD or something like that. I just used my own slightly older local backup.

      These days I buy external drives from Costco. I get them in sets of 3. They are all duplicated across each other: one stays at home, one goes to a storage unit (climate controlled; every 6 months or so I power up the drives and check they're OK) and one goes to a family member's house. The cost of 3 drives is around 400-450 dollars. USB 3 is quick; even counting the drive to fetch a backup, it's far quicker than downloading. All JPGs also get uploaded to SmugMug. Chances are if something happens bad enough to kill all 3 backups and the SmugMug account and my local RAID array, I will have other things to worry about.

      God only knows what videographers need storage-wise. I know a lot of the locals here run between 50 and 250Mbps using at least 2 rigs, they must chunk some serious storage!

    5. Elmo Fudd

      Contractors working with oilfield seismic data easily generate these volumes of data.

    6. Trevor_Pott Gold badge

      Photographers. Aerial, school/sport and other high-volume photographers produce that no problem. That's one set of clients. The other one is the business that prints/mounts/etc those photos for most of western Canada.

      There are many pictures, and they get larger and larger every year.

  2. McVirtual
    Meh

    Hmmmm

    'Some' good points here. This is why it is important to understand your Cloud Service Providers (CSP), SafeHarbour and Euro DP policies, amongst other things.

    However, there are always 3 factors in a decision making process of this type. Cost, Risk & Service.

    This article addresses some of the raw costs here, but as your bag of salt statement implies, there are many other intangible costs here, and other non-functional costs that need to be included.

    Needless to say, these are all re-baselined again when we look in the corporate space and begin to leverage other IT commodities and bandwidths, etc.

  3. Pete 2 Silver badge

    Backup's not the problem

    Restoring it all is the problem.

    Sure, for a home user the "A" part of ADSL means you can (in theory, at least) pull data back off your cloudy storage faster than you can push it up there. But try restoring a worst case, of a whole 1TB data set in one go and see how far you get. Even with a 50Mbit/s fibre connection you're talking 2½ DAYS to restore, assuming you can get full-speed for all the time (and don't run into data caps). If you're using flaky backup/restore software, you could find that a break in the connection means you have to start again.
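
    A rough sketch of the restore-time sum, for anyone wanting to plug in their own line speed. The function name and the `efficiency` parameter are assumptions for illustration; decimal units are used (1 TB = 10^12 bytes, 1 Mbit = 10^6 bits).

```python
def transfer_days(size_tb, link_mbit_s, efficiency=1.0):
    """Days needed to move size_tb terabytes over a link_mbit_s link.

    efficiency is an assumed fraction of the nominal line rate actually
    achieved (protocol overhead, contention); 1.0 means full line rate.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbit_s * 1e6 * efficiency)
    return seconds / 86400


ideal = transfer_days(1, 50)                     # full line rate
derated = transfer_days(1, 50, efficiency=0.75)  # assumed 75% of line rate
print(f"ideal: {ideal:.2f} days, at 75% efficiency: {derated:.2f} days")
```

    At full line rate the sum comes to a little under two days; assuming about 75% effective throughput gives close to the 2½ days quoted above.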

    So, the best you can hope for, if you're running a business is that it'll be half a working week before you can get 1TB of stuff restored. How does that fit into your DR plan? That's assuming the plan works - and almost NONE of the DR plans I've seen have ever been tested in a "fire practice" situation.

    So far as backups go, store stuff off site - that's just sensible. But remember that no network has the same bandwidth as a van full of DVDs.

  4. Anonymous Coward
    Anonymous Coward

    Back up?

    Back up to the cloud? That's the wrong way around. The cloud is what you back up locally. The cloud can go away at the whim of economics or weather, or botched New Zealand arrest warrants.

    It's volatile insecure storage and if you think otherwise you don't know what you're doing.

    1. binsamp

      Re: Back up?

      If you have to back up locally, why bother with the cloud in the first place?

      1. Anonymous Coward
        Anonymous Coward

        Re: Back up?

        Well, the cloud offers some functionality (distributed access, mostly) too. But you have to plan for it not being there in the morning.

    2. PyLETS
      Boffin

      wrong way around

      If the sensible place to keep your data is where you process it and where your users can best access it, then keeping the main copy on a hosted server in a datacentre with professional operators, high speed multiple routed links close to the Internet backbone, secured power supplies and rigorous physical access controls makes more sense than locating it where I'm located. So in my situation the data is processed on the so-called "cloud" and backup occurs using the faster side of my ISP link, i.e. my download bandwidth. Also makes sense to automate it, encrypt it, and only download the differences. Rsync, SSH and Cron are my friends here.
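
      A minimal sketch of what such an rsync-over-SSH pull might look like when driven from a script that cron runs. The host name, paths and function names are hypothetical; the rsync options used (`--archive`, `--compress`, `--delete`, `--rsh`) are standard ones. The block only builds and prints the command, it does not run it.

```python
import shlex
import subprocess


def build_rsync_pull(remote, local_dir):
    """Build an rsync command that pulls remote into local_dir over SSH.

    rsync only transfers files that changed since the last run, and SSH
    encrypts the data in transit. --delete removes local files that no
    longer exist on the remote side, so leave it out to keep them.
    """
    return ["rsync", "--archive", "--compress", "--delete",
            "--rsh=ssh", remote, local_dir]


def run_backup(remote, local_dir):
    """Run the pull; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_pull(remote, local_dir), check=True)


# Hypothetical host and paths, for illustration only.
cmd = build_rsync_pull("backup@host.example.com:/srv/data/",
                       "/mnt/backup/data/")
print(shlex.join(cmd))

# A crontab entry along these lines (hypothetical script path) would
# schedule it nightly at 02:30:
#   30 2 * * * /usr/bin/python3 /opt/backup/pull_backup.py
```

      Passwordless SSH keys would be needed for an unattended run; that setup is outside this sketch.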

      I guess the exception described in the article is the seemingly legacy business model where most of the access and nearly all of the users are local to the site where you work. I guess that way around still applies in some internal data heavy environments, as opposed to where the bulk of your input and output relates to your external as opposed to internal relationships.

      1. Colin Millar

        Re: wrong way around@PyLETS

        Legacy? I bet that that local access model is more mainstream than your situation.

    3. Anonymous Coward
      Anonymous Coward

      Re: Back up?

      Great comment! Then there's the fact that it's very unlikely that your off-line storage can be hacked or otherwise misused by people you have never heard of....hackers, Feds, etc, etc.

    4. Anonymous Coward
      Holmes

      Re: Back up?

      Probably the best point in the article-plus-comments I've read so far.

      Cloud-stored data is in somebody else's hands. You don't even own a "thing." It doesn't matter how big the name is: whoever thought Woolworth would go broke?

      Anyone who uses cloud for primary data storage needs to fail their own security audit.

  5. b166er

    I saw a briefing once, where a local CDP unit did dedupe and cloud sync to an identical CDP unit.

    In the event of a catastrophe, the firm would courier you the remote CDP unit (after taking a complete image of it onto a new CDP unit).

    Can't remember the name of the firm, but I thought at the time it was the best possible solution if you want your data off-site in a cloud.

  6. Anonymous Coward
    Anonymous Coward

    I would question the concept "back up"

    By all means use the cloud as a resource, but it shouldn't be the be-all and end-all of your business continuity/DR planning. Quite aside from the technical risks, you're a hostage to other countries' political machinations (looking at you, PATRIOT Act merkins).

    One interesting trend is companies using the cloud to avoid investing in their own network infrastructure. Especially with outfits now offering a private cloud overlaid on the public one.

  7. petur
    Boffin

    Personal Cloud

    A much better solution is to create a personal cloud, easy to do with a couple of NAS boxes. You can put them next to each other for the initial sync, then drive them over to another location and do differential backups to them. When a restore is needed, go fetch the NAS and put it local for speedy restore.

    Another solution (if data is limited) is to use external drives (USB3 or eSATA recommended) and have a rotating set of those. Make them encrypted (my QNAP NAS will do that for me) and just store them at a location you visit frequently (home/work is a nice combination). Can't go any cheaper for the reliability given!

    1. AndrueC Silver badge
      Meh

      Re: Personal Cloud

      Isn't that just called a file server cluster?

      Maybe I'm just getting too old for this game but why do a couple of local file servers qualify for the moniker of 'cloud'?

      1. Anonymous Coward
        Anonymous Coward

        Re: Personal Cloud

        Because if you don't rename things that have been around for ages you can't claim them as the next best thing and sell them all over again.

        1. AndrueC Silver badge
          Unhappy

          Re: Personal Cloud

          Sad, but probably true.

    2. Random K
      Stop

      Re: Personal Cloud

      Storing backups at home? Doesn't sound like good access control (in the classic paper accounting sense). What happens when the employee doing this gets canned or the company owner keeping backups is suspected of fraud? A safe deposit box at your local bank branch is a much better place for those external drives, and they're often either free or cheap-as-chips with a business account. That way you have a record of who has accessed the drives that is kept by a disinterested third party. It also comes with the advantage of being able to send a minion to get the drive (then directing the whole thing remotely) if you happen to be out of town when things go titsup. In the SMB space there is usually a good chance that IT is a one man/woman show.

  8. Steady Eddy
    FAIL

    Cloud = clown

    Doesn't matter what your SLA says, if a bloke in a JCB digs up the cable outside your building, you're stuffed.

    And you can jump up and down and escalate it with your Account Manager at Lowest Bidder plc, but the time to rectify is however long it takes for someone's subcontractor's subcontractor to splice the fibre back together.

    1. JimmyPage Silver badge
      Boffin

      where managers earn their money

      Circuits and lines are assessed as part of a business's BCP plans. If you have predicated your business on a single telecoms provider and circuit, then it needs to be flagged as an issue and either rectified (get a second supplier and circuit) or covered by a compensating control (which may be to power your entire internet pipe through a 3G dongle). Our BCP has an off-site war room set up with a 3rd party (Sungard), where essential staff would be transferred in the event of a building becoming compromised (i.e. no internet access).

      BCP/DR is a serious business - getting it wrong can result in going under.

      1. Anonymous Coward
        Anonymous Coward

        Re: where managers earn their money

        Also, don't assume that having two internet connections from two different suppliers means that they won't be placed in the same conduit in the ground and hence both fail in the presence of a rogue digger (speaking hypothetically, of course).

        1. Anonymous Coward
          Anonymous Coward

          Re: where managers earn their money

          If only that was a hypothetical - seen it happen several times over the years.

          Clue: if you want real diversity you must specify that to your providers and make sure it's coming into different sides of the building. It really is a case of if you don't ask, you probably won't get it.

          1. Anonymous Coward
            Anonymous Coward

            but

            it's all very well specifying it to your providers. But what do you do when they ignore you? I had a SQL cluster fall over once. When I asked our hosting company why the standby machine failed, they replied (unfortunately for them in an email) that it was on the same power bus. This was despite their (written) assurances to me that they split machines over data centres and power grids.

            The first rule of planning for disaster is to distrust everything and everyone.

            So despite specifying separate circuits, it would be as well to factor in a total loss of connectivity. As I have suggested before, possibly moving buildings, if it's that critical to you.

            Remember, "disaster" can come in many forms. One company I knew lost 2 days, because their head office was sealed off after a murder happened in the park to the side. I am sure their connectivity was 100% available throughout.

          2. Anonymous Custard
            Alien

            Re: where managers earn their money

            Forget hypothetical - look at today's news where a rogue digger in Moscow took out Russian communications to the ISS and their array of satellites.

            OK, a slightly different casebook, but similar electron pipelines.

  9. hitmouse

    Consider Australia where all plans have quite low upload/download caps compared to the rest of the world. It would be cheaper to fly overseas and upload a 1TB drive than to pay for it at home.

  10. Scarborough Dave
    Meh

    We just had an adventure on the cloud

    Someone nicked the copper from outside the provider's site (which was in fact fibre) and we had no contact for 3 days with our cloud services and media.

    Also you have to rely on the provider's availability and recovery plans, which may not be as robust as they say they are. Until you see them in action yourself, you should not rely on what you are told in some salesman's spiel or an advert.

    Backups we do by Wi-Fi to a remote location (helps if you know other businesses in the area with similar issues), along with taking physical backup discs home with us (all encrypted, etc.).

  11. Anonymous Coward
    Anonymous Coward

    Shaw

    ' This is still being revisited internally, and rumours are that you will be throttled into the ground.'

    You will indeed be throttled to nothingness.

    I'd be more surprised you'd actually get this far with Shaw. I'll have to buy you a pint next time I'm out there.

  12. Anonymous Custard
    Boffin

    As my old syshack used to say...

    Never underestimate the transfer rate of a van full of hard disks.

    1. Oninoshiko
      Thumb Up

      Re: As my old syshack used to say...

      young'un 'eh?

      I always heard it said "never underestimate the bandwidth of a Volkswagen full of DATs"

  13. technohead95

    It's worth noting that not everything you back up to the cloud is something you need instant access to. For that there is Amazon's Glacier service, which allows you to archive data. You get significantly cheaper costs compared to S3 but lose your instant access. Access is reduced to a few hours (which might be fine for rarely needed data).

  14. Chris Evans
    FAIL

    "business ADSL with 2.5Mbps not hard to find in most developed nations!" 3rd world UK here!

    "business ADSL with 2.5Mbps upstream ....(not hard to find in urban areas in most developed nations) "

    BT can only give me 900K upstream in a large UK town. No BT Infinity here.

    'undeveloped' UK!

  15. Anonymous Coward
    Anonymous Coward

    Law enforcement is a risk for the cloud too.

    The most compelling reason, to my mind, to be sure of what you're doing is that you're using a virtual instance which runs on one or more servers. When the Feds (or any other global police force) suspect foul activity they usually get warrants to inspect, investigate or confiscate an entire server.

    Very nice if that server happens to be something running a dozen virtual clients on top of it and one of them is yours.

  16. Beachrider

    On 1 TB/month...

    I have some questions:

    1) If the cable-link is 'limited' to 1 TB a month and you back up 21 times a month, then each backup is 'limited' to ~48 GB, no?

    2) If Amazon is limiting you to you-managed twin 500 GB cloud-disks, don't you need to manually groom the disks to retain restore-points from several parts of the month? How does this get done?

    3) Does Amazon provide 'point in time' images of your cloud-disks?

    ...Just because we run into these concepts with internal backup/recovery scenarios...
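
    For question 1, the per-backup ceiling can be checked with a couple of lines. This is a sketch only: the function name is invented, and decimal units are assumed (1 TB = 1000 GB).

```python
def per_backup_budget_gb(monthly_cap_tb=1.0, backups_per_month=21):
    """Average data allowance per backup run under a monthly transfer cap."""
    return monthly_cap_tb * 1000 / backups_per_month


budget = per_backup_budget_gb()
print(f"{budget:.1f} GB per backup")
```

    That is an average; a single large backup could use more as long as the others use less.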

    1. Trevor_Pott Gold badge

      Re: On 1 TB/month...

      @beachrider any decent "cloud backup" solution backs up from your local stuff to a "buffer" appliance, dedupes the ever-living-crap out of it, then fires the blocks up to the cloud. S3-aware setups can just keep spinning up new instances of storage in 500GB increments and filling them with blocks as needed. Amazon's "backup" offering is called Glacier, and is offline tape managed by their robot.

  17. Eddy Ito
    Coat

    "consider that those tubes can be filled and if they are filled when you put your message in, it gets in line and it's going to be delayed by the enormous amount of material clogging that tube."

    So it's mostly like a sewer. I can see why one might be cautious about back ups.

    1. Oninoshiko
      Joke

      Like a sewer

      full of mostly similar content, too!

  18. Anonymous Coward
    Anonymous Coward

    Whoa, "2.5Mb/s upstream"???

    Where I come from a normal ADSL line has a limit of a MAXIMUM of 384Kb/s inherent to the technology.

    1. Trevor_Pott Gold badge

      Re: Whoa, "2.5Mb/s upstream"???

      Technically it's VDSL 2. *shrug* It is marketed as ADSL, as were all the iterations before it. ADSL or Cable. These are your choices.

  19. Infernoz Bronze badge
    Boffin

    Build and use FreeNAS 8.3 boxes, over costly and flawed File Systems in off-the-shelf NAS

    Yes, cloud capacity and ISP capacity are a joke for backup, and I've seen 80Mbit UK fibre regularly get congested; also, I would worry about data corruption too, if I was not sure that the data was stored in a ZFS parity RAID! The worst problem is actually the latency and data transfer time; this will be at least an order of magnitude slower than local or stored backup disks; this is the main fatal flaw for cloud backup and WiFi backup too!

    RAID1 is not good enough, because standard RAID 1 won't protect you from hidden in-line or on disk corruption; see http://en.wikipedia.org/wiki/ZFS

    DVDs are not reliable; I have seen so many unreadable disks and corruption that I stopped burning DVDs years ago, even a cheap USB flash stick and cheap bare USB hard disk are better!

    FreeNAS includes support for:

    * Commodity and some high-end PC hardware

    * 32-bit and 64-bit CPUs

    * trivial installation, given it runs off a small flash stick of at least 2GB.

    * config backup/restore via a web browser

    * OS updates via a web browser, with digest check

    * scheduled snapshots (with configured timeout) for each ZFS Dataset, so that multiple earlier snapshots can be viewed _LIVE_, which is even better than differential backups, especially if the Dataset is used directly for storage.

    * scheduled push or pull replication and rsync, and scheduled backup.

    ZFS v28 includes support for:

    * 128bit storage addressing, so only limited by hardware!

    * multiple software RAID models

    * Single, double or triple parity RAID!

    * ZFS Datasets, so none of this stupid fixed size partition nonsense.

    * concurrent transactional filesystem processing, rather than common unsafe logged filesystem processing, so filesystem can never be corrupted.

    * heavy duty data corruption detection and repair

    So you could have a primary FreeNAS box, and keep swapping one or more slave FreeNAS boxes, so you are never without backup.

    See:

    http://www.freenas.org/

    Plenty in features.

    Loads in the manual.

    One happy FreeNAS user :)

  20. katsnelson

    Why not use Amazon Import/Export instead of upload/download?

    Working on Big Data, we routinely have to transfer terabytes into the cloud and we would not use upload for anything that is over a terabyte. Amazon offers an Import/Export service, which allows you to send physical media (think cheap SATA drives), and for $80/disk they will import it for you and return the disk to you. For large volumes of data it is the only way to go.
