Amazon is digging deeper into the enterprise with a data back-up and archival service designed to help kill off tape. The cloud provider has just launched Glacier, which it says takes the headache out of digital archiving and delivers “extremely low” cost storage. Glacier has been built on the Amazon storage, management and …

COMMENTS


Tuesday 21st August 2012 12:21 GMT alikhajeh

It's cheap as chips

We just ran a quick cost forecast in PlanForCloud and it's interesting: If you start with 100GB then add 10GB/month, it would cost $102.60 after 3 years on AWS Glacier vs $1,282.50 on AWS S3!
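The forecast above can be reproduced with a short sketch. The per-GB prices below are assumptions, not stated in the comment: $0.01 per GB-month for Glacier and $0.125 per GB-month for S3, with each month billed on its end-of-month size. Under those assumptions the totals match the quoted figures.

```python
# Assumed prices in USD per GB-month (not stated in the comment itself).
GLACIER_PER_GB_MONTH = 0.01
S3_PER_GB_MONTH = 0.125


def storage_cost(start_gb, growth_gb_per_month, months, price_per_gb_month):
    """Sum monthly storage charges, billing each month on its end-of-month size."""
    total_gb_months = sum(
        start_gb + growth_gb_per_month * month for month in range(1, months + 1)
    )
    return total_gb_months * price_per_gb_month


# 100GB to start, 10GB added each month, over 3 years.
glacier = storage_cost(100, 10, 36, GLACIER_PER_GB_MONTH)
s3 = storage_cost(100, 10, 36, S3_PER_GB_MONTH)
print(f"Glacier: ${glacier:,.2f}  S3: ${s3:,.2f}")
```

The billing convention (charging on the size after each month's growth) is a guess chosen because it reproduces the numbers; PlanForCloud's actual model may differ.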

1 0
1. Tuesday 21st August 2012 15:40 GMT Skoorb
  
  Re: It's cheap as chips
  
  Well, it's sort of cheap.
  
  I've just had a good look around their site (and the AWS blog) and have found out a few things.
  
  First, the data is stored redundantly (specifically can cope with failure of two stores simultaneously), and you can choose if you want it in the US, EU (Ireland, 10% more expensive) or APEC (Singapore, 12% more than the US).
  
  You store data in 'archives'. Once you have uploaded an archive, you cannot change it (though you can add to it and delete the whole thing), you are charged for three months of storage as a minimum, and if you want to download it, you have to get the whole thing. So make sure you split your data up - each archive needs to be a file!
  
  After requesting an 'archive' for download, you have to wait 3-5 hours before you can start to download it. You then have 24 hours to get it.
  
  You need to know what you have stored. A list of the description (if you provide one), creation date and size of each archive is available, but is only updated once per day; if you need any more info you have to download the thing.
  
  You can only download 5% of your stored data per month *pro rated daily* for free. After that, prices go up very fast! As an example, if you stored 1TB of data, and wanted to get the whole thing you would be charged about $369.80 (excluding taxes). (again, 10% more for EU, 12% more for APEC).
  
  So, only good for archiving if you are pretty sure you're not going to want to get most of it back.
  
  Working for the download charge:
  
  Peak hourly retrieval for the month = 36 gigabytes per hour (80Mbps)
  
  Billable peak hourly retrieval = Peak hourly retrieval (36) - Free retrieval hourly allowance (1.7GB) = 34.29
  
  Retrieval fee = Billable peak hourly retrieval (34.29) x Hours in the month (720) x retrieval price ($0.01) = $246.92
  
  Then you add the data download fee at $0.120 per GB. So 1024 * 0.12 = $122.88, and 122.88 + 246.92 = $369.80.
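The working above can be checked with a small sketch. It follows the comment's own reading of the pricing (5% of stored data free per month, pro-rated over an assumed 30 days and applied as the hourly allowance), so it reproduces that calculation rather than Amazon's official formula. The function name and parameters are made up for illustration.

```python
def retrieval_cost(stored_gb, peak_gb_per_hour, hours_in_month=720,
                   retrieval_price=0.01, transfer_price=0.12,
                   free_fraction=0.05):
    """Return (retrieval_fee, transfer_fee, total) in USD, per the comment's working."""
    # Free allowance as the comment applies it: 5% of stored data per month,
    # pro-rated over 30 days (about 1.7GB for 1TB stored).
    free_allowance = stored_gb * free_fraction / 30
    billable_peak = max(peak_gb_per_hour - free_allowance, 0.0)
    retrieval_fee = billable_peak * hours_in_month * retrieval_price
    # Data transfer out is charged on the full amount downloaded.
    transfer_fee = stored_gb * transfer_price
    return retrieval_fee, transfer_fee, retrieval_fee + transfer_fee


retrieval_fee, transfer_fee, total = retrieval_cost(stored_gb=1024, peak_gb_per_hour=36)
print(f"retrieval ${retrieval_fee:.2f} + transfer ${transfer_fee:.2f} = ${total:.2f}")
```

Run without rounding the intermediate values, this lands within a few cents of the $369.80 quoted above. Halving the peak rate to 18GB per hour roughly halves the retrieval fee, which is why spreading a download out matters so much.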
  
  2 0
  1. Tuesday 21st August 2012 15:57 GMT Skoorb
    
    Re: It's cheap as chips
    
    Oh, and as well as the $369.80 fee for a 1TB download at 80Mbps, it's probably good to know that you can't assign file names to archives (Object Keys in AWS speak). So have fun with that one when it comes to download.
    
    0 0
  2. Tuesday 21st August 2012 17:13 GMT Anonymous Coward
    
    Re: It's cheap as chips
    
    I may be wrong, but it would seem like you could use the Amazon Import/Export to get your full TB (or multiple TBs) back for much cheaper than transferring the whole thing over the internet. As long as you didn't need the data *quickly* that would make sense.
    
    0 0
  3. Wednesday 22nd August 2012 13:25 GMT Ken 16
    
    Re: It's cheap as chips
    
    "After requesting an 'archive' for download, you have to wait 3-5 hours before you can start to download it. You then have 24 hours to get it."
    
    By my reckoning if peak recovery rate is 36GB/hour you're never going to get that TB back within your download window. Am I missing something?
    
    0 0
    1. Wednesday 22nd August 2012 14:11 GMT Skoorb
      
      Re: It's cheap as chips
      
      That's a very good point. I naively assumed that it meant you had 24 hours to *start* downloading the job, but after having a look at the actual API reference it looks like at some random time after 24 hours it may just reset the TCP connection and return a 404 for any attempts to resume. That's just plain stupid.
      
      Which unfortunately means that it's essentially unusable if the amount of data you store on it is greater than the maximum you can pull down your internet connection in 24 hours. That is unless you fancy doing a lot of maths to request multiple jobs about 12 hours apart and you can guarantee that you can maintain a constant download rate over the whole period.
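A rough sketch of that maths, assuming a steady download rate and a 24-hour window per retrieval job as described in this thread. The helper names are hypothetical, and because an archive has to be fetched whole, this only works if no single archive is bigger than one window's capacity.

```python
import math


def window_capacity_gb(rate_gb_per_hour, window_hours=24):
    """Most data that can be pulled down before a job's download window closes."""
    return rate_gb_per_hour * window_hours


def jobs_needed(total_gb, rate_gb_per_hour, window_hours=24):
    """Minimum number of separate retrieval jobs to get total_gb back."""
    capacity = window_capacity_gb(rate_gb_per_hour, window_hours)
    return math.ceil(total_gb / capacity)


capacity = window_capacity_gb(36)   # 36GB/hour is the 80Mbps example rate
jobs = jobs_needed(1024, 36)
print(f"{capacity}GB per window, so 1TB needs at least {jobs} jobs")
```

At 36GB per hour a window holds 864GB, which is short of the full terabyte, so at least two jobs are needed.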
      
      0 0
      1. Thursday 23rd August 2012 10:48 GMT Ken 16
        
        Re: It's cheap as chips
        
        Either way, you're not going to get it back within 2 hours as you would with a tape drive (uncompressed).
        
        1 0
Tuesday 21st August 2012 12:24 GMT GrumpyJoe

I tried S3...

it was glacially slow - was it just me? I've heard of others with the same kind of problem - and I was using the EU node for my instance.

What kind of transfer rates are we talking here? If they've fixed that it may be just what I was looking for for my Synology NAS cloud backup.

1 0
1. Tuesday 21st August 2012 12:38 GMT Code Monkey
  
  Re: I tried S3...
  
  I guess they've managed expectations well with the name Glacier. Think how angry you'd have been had it been called "TurboNutterFastBackup"
  
  1 0
Tuesday 21st August 2012 13:27 GMT 0laf

Outlook Cloudy

Just remember, when your CEO/CIO/CFO gets in your face waving the savings that Amazon has promised him, to tell them that the big fat pipe you'll need to use this doesn't come free, and neither does the redundant one you might want to back it up.

Then you might want to look at the Article 29 working group report into the Cloud.

http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2012/wp196_en.pdf

Then you might want to order some more disks for your SAN

0 1
Tuesday 21st August 2012 14:17 GMT Yet Another Anonymous coward

tasked with regulatory compliance

And their SLAs guarantee that the data on this life insurance policy or land deed will be available in 99 years' time?

That all my data won't disappear if the US suspects that somebody on Amazon is hosting a pirate movie?

And there is no price rise when I suddenly want to move all my data off their platform to a competitor?

0 0
Tuesday 21st August 2012 14:57 GMT James 100

Very slow retrieval

At first, I thought this was a slower, low-cost variant of S3: same concept, but bigger, cheaper SATA disks and more use of RAID than straight duplication. The multi-hour retrieval times quoted would be consistent with tape, but they denied in interviews that it's tape-based - some kind of disk library, perhaps, where the disk is stored powered down in a vault somewhere, then spun up when you request your data back? That could explain a few hours - spin up and mount a RAID set, then copy the data off to a staging S3 bucket for you to read from. Throw in some smart placement (keep all your stuff together, destaging it from S3 in big batches) and they should avoid the worst case scenarios (lots of little requests for different archived objects, spread out in time).

A dozen 4TB or two dozen 2TB drives in a pod, with double or triple parity protection, would fit with their 40TB maximum object size plus a bit of overhead - and they've set up infrastructure for hooking up big external drives to S3 already for the Import/Export stuff.
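The pod arithmetic can be checked with a quick sketch. It assumes the simplest possible model, where each parity drive costs exactly one drive's worth of capacity and sizes are in terabytes; the pod layout itself is speculation in this comment, not anything Amazon has confirmed.

```python
def usable_tb(drives, drive_tb, parity_drives):
    """Usable capacity when parity_drives' worth of space goes to parity."""
    return (drives - parity_drives) * drive_tb


# The two pod layouts suggested above, with double and triple parity.
for drives, drive_tb in [(12, 4), (24, 2)]:
    for parity in (2, 3):
        print(f"{drives} x {drive_tb}TB, {parity} parity: "
              f"{usable_tb(drives, drive_tb, parity)}TB usable")
```

A dozen 4TB drives with double parity comes out at exactly 40TB usable; the other three combinations give 36TB, 44TB and 42TB.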

I like the price compared to S3 - but it's $120/yr for a terabyte. Probably about what you'd expect to pay to rent a pair of 1TB SATA drives for the year, sitting in quiet corners of two different Amazon sheds, plus a small share of a couple of shared drives for parity protection?

0 0
1. Tuesday 21st August 2012 15:00 GMT Androgynous Cupboard
  
  Re: Very slow retrieval
  
  Wouldn't be surprised if the disks are online already, and the multi-hour retrieval is artificially added to stop people dumping the much-more-expensive S3...
  
  1 0
  1. Tuesday 21st August 2012 18:43 GMT Yet Another Anonymous coward
    
    Re: Very slow retrieval
    
    You might spin them down to reduce cooling costs. But you would need customers with a LOT of data, otherwise you would be constantly spinning up a 3TB disc because one of the 3000 customers with 1GB on it wanted a file.
    
    0 0