Handbags at dawn please
Go fight it out and no gouging please
Much ado about nothing if you ask me.
Must be a slow news day.
Last week, a chap named Mario Karpinnen took to Medium with a tale of how downloading 60GB of data from Amazon Web Services' archive-grade Glacier service cost him a whopping US$158. Karpinnen went into the fine print of Glacier pricing and found that the service takes your peak download rate, multiplies the number of …
Customer puts a bunch of data in Glacier. Restores more than 5% of it. Then reads its pricing mechanism?
And here I was thinking this cloud stuff was all just free!
It's not even small print. It's disclosed on the pricing page with examples in the FAQ: https://aws.amazon.com/glacier/faqs/#How_much_data_can_I_retrieve_for_free
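For anyone who wants to sanity-check a retrieval before pulling it, here is a rough Python sketch of the 5% check mentioned above. The function name and the `free_fraction` parameter are made up for illustration. It deliberately does not model the period the allowance covers, any proration, or the fee itself; the FAQ linked above is the authority on those.

```python
def free_allowance_check(stored_gb, retrieved_gb, free_fraction=0.05):
    """Hypothetical helper: split a planned retrieval into the part that
    fits inside a free allowance and the part that exceeds it.

    Assumption: the allowance is a flat fraction of the data stored
    (5% by default, the figure quoted in the thread). Time period,
    proration and actual charges are NOT modelled here.
    """
    if stored_gb < 0 or retrieved_gb < 0:
        raise ValueError("sizes must be non-negative")
    allowance_gb = stored_gb * free_fraction
    # Anything above the allowance is what you would need to price up.
    over_allowance_gb = max(0.0, retrieved_gb - allowance_gb)
    return allowance_gb, over_allowance_gb


if __name__ == "__main__":
    allowance, over = free_allowance_check(stored_gb=1000, retrieved_gb=60)
    print(f"allowance: {allowance:.1f} GB, over allowance: {over:.1f} GB")
```

With 1000GB stored and 60GB pulled back, about 50GB sits inside the allowance and about 10GB does not. What that excess costs is exactly the bit you need to read the pricing page for.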
Part of the issue is that a large percentage of techies don't know the difference between a backup and an archive. The comment above using the word "restore" is a good example. Glacier is not designed to be restored from; it's designed to be read from extremely rarely. This is long-term resilient storage for things you probably won't need to read again.
Backup is extremely short-term storage for putting things right after a mishap, and 30 days is plenty long enough for almost everyone. Read your compliance documents: almost none of them tell you to keep a backup for long periods. Most of them tell you to keep certain data for long periods, and the implication is that the archive is either in last night's backup or on resilient storage and not being modified (tapes, for instance).
If there is a genuine need to read back a large portion of archive data then these costs are probably line of business costs and can easily be justified.
In fact there are 3 levels, not two.
1. Backup for PEBKAC - stuff from which you can recover files lost today. Or yesterday. Or anytime this week. Further back than that - sorry, read the manpage of luser, pay attention to the -g2 option and go to step 2. Similarly, if you need to use point 3 as backup, read the description of the -g3 option in the same manpage.
2. Backup for disaster recovery - something that guards against major incidents. You read it regularly in mock recovery exercises to ensure it is viable.
3. Archive (if you have reached reading from it for anything but "go to court", you are doing something wrong).
Glacier is 3. You still need to sort out 2 and 1.
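If it helps, the three levels can be written down as a toy decision rule in Python. This is only a sketch of the list above: the names `Tier` and `pick_tier`, the `legal_request` flag and the 7-day reading of "this week" are all my assumptions, not anything a vendor defines.

```python
from enum import Enum


class Tier(Enum):
    OPERATIONAL_BACKUP = 1  # recover files lost today, yesterday, this week
    DISASTER_RECOVERY = 2   # guards against major incidents; test it regularly
    ARCHIVE = 3             # read only for the "go to court" kind of request


def pick_tier(age_days, legal_request=False, recent_window_days=7):
    """Hypothetical rule of thumb for where a recovery should come from.

    Assumptions: "this week" means `recent_window_days` (7 by default),
    and only a legal or compliance request justifies reading the archive.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if legal_request:
        return Tier.ARCHIVE
    if age_days <= recent_window_days:
        return Tier.OPERATIONAL_BACKUP
    # Older than the short-term window: go to step 2, never straight to 3.
    return Tier.DISASTER_RECOVERY


if __name__ == "__main__":
    print(pick_tier(2))                       # a file deleted two days ago
    print(pick_tier(30))                      # something lost a month ago
    print(pick_tier(900, legal_request=True)) # records requested by a court
```

The point of the rule is that nothing routine ever falls through to tier 3, which is why Glacier alone does not cover 1 and 2.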
"Backup for disaster recovery - something that guards against major incidents. You read it regularly in mock recovery exercises to ensure it is viable."
Backup isn't DR either, although the same tapes used to serve that purpose too, as a separate use from backup. Almost nobody uses backups for DR purposes these days: it's too slow, too unreliable, too hard to test, and almost never works as intended. This is why backup in primary storage is now acceptable and desirable - you have a complete replica off-site on your DR system anyway, and for backup purposes you're just restoring data, so off-site is a bad thing. If you were designing for DR and considering cloud then you'd use a cloud DR solution like the one MS offers. You certainly wouldn't put anything near Glacier.
Classic wrong tool, wrong job. I guess we all know that archive is for such things as tax returns from years back, copies of almost forgotten insurance records, etc., but sadly the world does not really understand that tools are designed to do specific jobs. Some simple rules: chisels make lousy screwdrivers, hacksaws are not good tin openers (TNT is worse), hammers are not good for pushing glass panels into place, push bikes are not good for heavy goods transport, and so on. At a push you 'might' mow the lawn with nail scissors - and complain that it grows faster than you can cut the stuff. Oh, and on a communications theme, remember that depleted uranium plates make a terrible substitute for airmail paper, for too many reasons.
It might be simpler to suggest RTFM.
"/dev/null is cheaper."
As Voland's appendage notes, that's not true if you later need to read the data *back* from /dev/null in court.
(Or maybe it is. Presumably the courts have already seen cases where someone made efforts to keep the records that they are obliged to keep, but lost them in some disaster. There must then have been some decision by the court about how much effort is reasonable and how bad a disaster has to be for it to truly be beyond someone's control. So if you spew all your "legal /dev/null" data in the direction of a cheap cloud provider who then goes titsup, have you met your legal obligations? If you use two cheap cloud providers, is *that* reasonable? What if one then buys the other and then itself goes titsup? None of this is IT-specific, of course. We've had storage providers for paper documents since whenever and all the same scenarios apply.)
It's simple, just take your CPU count, correlate with the relevant pack for the product, version and subversion, multiply by the number of sausages that can be powered at any one time (but divide by 0.75 if on Windows), minus the inverse square root of god's dad's boss's dog's inside leg measurement. Then write the number down on a wooden broom handle and shove it where the sun doesn't shine. Sideways for maximum effect.
The downside of "the cloud".
64GB SD cards (big enough to take his entire CD collection) are about £20 today - granted that cards were smaller and prices higher in 2012, but it's still a one-time purchase. Get 3 or 4 of those, which will allow for multiple offsite copies in different locations including a bank safe deposit box (probably the safest physical storage available). Keep one at home for retrievals (storage cost = zero, data volume = unlimited).
"buyer beware applies every bit as much as it does for any other product or service"
In other news, ursine lavatorial arrangements include arboreal facilities, the Pope expresses a leaning towards one particular unproven 'truth' over others, and Uncle Joe wasn't a nice chap who always did the decent thing.
I have never run into a cloud user from a business that didn't have some degree of surprise - and sometimes shock - at their cloud bill. Public cloud is a broad set of technology tools, and their use and the pricing that goes along with it need to be scrutinized, projected and anticipated in order to lessen the surprise that is sometimes shock. Public cloud can be awesomely valuable in the right use cases, but, just like anything else in IT, it requires governance, lest it get significantly unruly and pricey.
This story is about chump change. I have heard one about a company that "went cloud" and had a run rate of over $500K per month on their main subscription, plus an unknown amount from shadow IT that got the part about using cloud but didn't fully read the memo telling them to use the main subscription. Cloud governance and rogue subscription harvesting events at corporate tech rodeos are the next big cloud things.
Biting the hand that feeds IT © 1998–2019