It does not compute; EMC resold $25m of tape products from Quantum in 2010 but has just declared "Tape sucks" to thousands of attendees at EMC World in Las Vegas. This isn't channel conflict; it looks like out and out war. Or not - no one on the Quantum booth at the shindig seemed at all offended by EMC's anti-tape keynote blast …
...another disk vendor telling us that tape's dead. That it sucks, and we should get rid of it. Aside from the fact that it, you know, actually WORKS. I'm hanging on to my LTO4 REAL tape library, thanks, all the blandishments of virtual tape library shills notwithstanding.
Message to EMC
When you start selling 1.5TB disks for backup at £15 a pop, I'll listen.
(However if that takes 2 years to happen, the price I'll pay will go down, and the capacity will need to go up)
Oh, I'll also want WORM capable disks that I can add data to, but never overwrite or delete from.
While I agree with your general post - Tape certainly does not suck in many, many ways...
My company is a heavy user of EMC's Centera CAS, which allows you to write to spinny disks (RAIDed within nodes, striped across nodes and replicated between sites) and lets you make files impossible to delete; you can also set it to delete files at a given date, or allow files to be deleted as required. It's pretty good for systems where you need to get WORM-written data back quickly, such as Image and Workflow, Telephone voice recording, Email Archiving etc.
You nailed it
"When you start selling 1.5TB disks for backup at £15 a pop, I'll listen."
This is why they want to kill tape--why sell you a relatively cheap tape when they can sell you a disk for 100 times as much (even when the disk itself should only be a tenth the cost it actually is).
media cost != DR solution cost
You might be able to get 1.5TB tapes for $30, where enterprise 1.5TB Disks cost $150. How much is the fiber connected tape head array you're sticking those tapes in, for an 8 head system with 80-120 tape slots? $60-100,000? How much does it cost to have 32 bays of SATA ports online on a 10G connected storage chassis behind your tape server? About $10K.
Tape may still be the right path for archive, for the small set of data you have to keep more than 3 years after it's backed up, but for short term, near term, and especially the current backup data set, you can re-use that DR HDD 10,000 times, and you can reliably re-use that DR Tape 4 times.
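The per-use media arithmetic behind that claim can be sketched as below. This is only a rough illustration: the prices ($30 tape, $150 disk) and reuse counts (4 and 10,000) are the figures quoted in this comment, not measured values, and the function name is made up for the example.

```python
# Per-use media cost, using the figures quoted in the comment above.
# cost_per_use is an illustrative helper, not part of any backup product.

def cost_per_use(unit_price, reliable_uses):
    """Media cost spread evenly over the number of reliable re-uses."""
    return unit_price / reliable_uses

tape_per_use = cost_per_use(30.0, 4)        # $30 tape, 4 reliable re-uses (claimed)
disk_per_use = cost_per_use(150.0, 10_000)  # $150 disk, 10,000 re-uses (claimed)

# How many re-uses a $30 tape would need to match the disk's per-use cost.
break_even_tape_uses = 30.0 / disk_per_use

print(f"tape: ${tape_per_use:.2f} per use, disk: ${disk_per_use:.3f} per use")
print(f"tape break-even re-uses: {break_even_tape_uses:.0f}")
```

The result is only as good as the reuse counts fed in; with a higher tape reuse figure the gap narrows accordingly.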
Move your daily backups to a D2D chassis, better still replicated at the block level off site (often cheaper than paying for 4 years of Iron Mountain tape pickup services). Let that data migrate slowly (using 24 hour windows instead of 6-8) to a low capacity tape system used only for archive, and snap only weekly or monthly archives as required by your regulations because your online D2D chassis already has every version of every file created over a 30-90 day window. Even short term offsite rotations (your daily and weekly sets moving offsite) are cheaper to do on disk than tape due simply to tape replacement costs over time, so even if you can't afford wired data duplication to your failover site or DR storage provider, it's STILL cheaper to use D2D for backup than Tape, EXCLUDING long term archive requirements.
As for the benefits of Disk over tape when it actually comes time to recover? How many man hours of IT time are spent every year in your organization recovering files and data that are not in the available onsite tapes from yesterday still cataloged inside the jukebox, meaning they need to be recalled from offsite, re-imported, and wait for an available tape bay, then spool the tape to find a file or folder? With Disk, you can restore any file or folder from often several months ago INSTANTLY. How many times is that tape corrupt (per industry averages, 1 in 10 tapes checked has bad sectors or can't be read at all after it has been removed from the site and returned later, if and when the right tape can even be recalled based on accurate data of when a file was even backed up)? I've seen cases where it took more than a week to get a valid tape from an offsite depot with a valid copy of the server folder needing to be recovered. How many man hours does it even take to rotate all those tapes every day, prepare them for shipping, and more? All that time and potential data loss adds up in real dollars. If you're not counting manpower, overhead, etc in your costs, you have again failed to properly compare the costs of disk and tape.
I worked for a major DR firm, one that pioneered most of what we use in place today, though you have probably never even heard their name. They did not sell disk or tape, they sold DR software and the server units to make DR happen. In 1998 they had already figured out D2D was better. I did over 400 DR plan designs based around their technology, mostly in competition with Symantec, Veritas, and similar tape products. In every single case we were able to deliver a D2D appliance system at lower cost, hardware, disk, licensing, and rotational media, using their older tape drive as the archive footprint (which works in most cases as what you are required to archive vs back up is usually 25-50% of your total backup size), a complete replacement for their systems, for less TCO. In many cases (almost half) we were able to deliver that system for less than their annual license upgrade price plus a new tape drive, not even considering total competitive system pricing. In every single case, we reduced their backup window by more than half, eliminated issues with multiple concurrent backup conflicts, simplified which systems could back up when, and permitted instant restore of any file for any system. In about 30% of cases, they also added offsite data replication, which ended up costing less than their pre-existing tape rotation contracts (daily pickups became monthly, remaining dollars covered the cost over 4 years of the offsite system installation or hosted partition from a provider).
Disk IS cheaper (for most environments) when you have someone who knows the technology design the data plan. It's faster, it's more reliable (offsite rotations can even be RAID sets for added reliability, but disk is in fact more reliable and more durable than tapes to start with), and it's easier to meet data compliance and archive requirements with it. It reduces manpower, and it DRAMATICALLY reduces RTO and RPO.
Because a cheap tape is used one time for archive, which is nice. However, rotating a tape through daily/weekly backups will kill a tape in 10-30 jobs, meaning it needs to be replaced. At $30 per tape vs $100-150 for the same capacity disk, which lasts thousands of re-writes, the cost of disk for rotational use (not archive; we still use tape for files moved off-site at month end, one tape set only) is in fact lower than tape even in only 1 year of use, let alone 4-6.
Second, online backups are LOCAL. The only data going offsite is the daily rotation (optionally electronic over a wire, further eliminating man hours and courier costs), and only for compliance reasons. Disk stores weeks if not months of incremental backups in a single disk set, not one backup job per tape catalog, allowing quick and easy recovery of any files, not just last night's. You only need one disk set on-site, and one off-site, not multiple sets in rotation constantly (being replaced every so many rotations as an ongoing cost).
Most of the cost of the tape is in the DRIVE. $4-8K per head unit, plus the chassis and robot mechanisms. An 80 slot jukebox can easily be $60K+ if it has 4-8 drive heads for concurrent backups. A D2D system uses cheap, non-RAID (RAID is optional), DASD chassis, costing $1-2K for 16 slots. A typical on-site system comparable to an 80 slot jukebox might have 3-4 drive hot swap shelves, and cost $10-15K (including the high performance eSATA cards for the server to support it). It can run as many as 14 or more concurrent jobs, not the 4-8 a tape chassis can handle, further simplifying DR planning and backup windows, and there's always a "slot" open to do a data restore without hard reserving an expensive drive to do it.
To do a normal pyramid or similar backup rotation for 60 days, let's say based on 10TB of data, you need 8 weekly master sets (2 of which are the monthly rotations), and 12 daily incremental tapes. Given 2% daily average incremental change, you need 200GB of daily tape, so just 1 tape a day. Using 1.5TB tapes (compressed capacity), you'll get about 1.3TB per tape average, assuming masters are in a single set (it will take more if they're in different tape pools due to wasted excess tape unused). 8 tapes per master X 8 sets = 64 + 12 tapes. Replacing each tape every 15 uses (not including what's archived periodically, that applies to a D2D2T process equally), you're talking roughly 70 new tapes every 120 days at $30/tape for 4 years (plus 10 cleaning tapes in that same period). That's $25,000 in tape costs, plus cleaning tapes. Adds up fast, doesn't it? Same cost for local 90 day D2D backups, plus 2 spare sets of disks for offsite rotation: given D2D deduplicates all those bonus master jobs into a single data set, you need about 50TB online capacity and 20TB (compressed) off-line capacity. 4 1.5TB disks per offsite rotation, and 10 more for onsite. A single 16 bay chassis holds all 10 local disks, including RAID 6, and leaves the 4 slots needed for archive. 20 total 1.5TB disks, replaced under warranty free for 5 years, at a cost of about $150 each for enterprise SATA drives. $3,000 total disk cost. Given the chassis is about $4K vs $60K, which method is cheaper? Given a 16 bay chassis connects over SCSI or a few bridged eSATA ports, and that tape drive needs at least 2 Fiber controllers (on each end), connecting that tape chassis is about $4K more expensive as well. Both have the exact same server host requirements and licensing.
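For anyone who wants to check or vary those numbers, here is a minimal sketch of the arithmetic. Every figure is the one quoted in the comment above (70 tapes per 120 days at $30, 20 disks at $150, $60K vs $4K chassis, about $4K extra for fibre connectivity); the function and variable names are illustrative, cleaning tapes and labour are left out, and disk warranty replacements are assumed to be free as the comment states.

```python
# Four-year media and hardware cost comparison using the comment's own figures.
# All names here are illustrative; swap in your own quotes to re-run the sums.

def rotation_tape_cost(tapes_per_cycle=70, cycle_days=120,
                       price_per_tape=30.0, years=4):
    """Cost of replacing worn rotation tapes over the whole period."""
    cycles = years * 365 / cycle_days
    return tapes_per_cycle * price_per_tape * cycles

def disk_set_cost(disks=20, price_per_disk=150.0):
    """One-off cost of the disk set (warranty replacements assumed free)."""
    return disks * price_per_disk

tape_media = rotation_tape_cost()
disk_media = disk_set_cost()

# Hardware as quoted: $60K library plus about $4K of extra connectivity,
# versus about $4K for a 16-bay disk chassis.
tape_total = tape_media + 60_000 + 4_000
disk_total = disk_media + 4_000

print(f"tape media: ${tape_media:,.0f}   tape total: ${tape_total:,.0f}")
print(f"disk media: ${disk_media:,.0f}   disk total: ${disk_total:,.0f}")
```

With the defaults the tape media figure comes out at roughly $25.5K, in line with the $25,000 quoted. Changing the replacement interval or the library price is a one-argument edit, which matters because both inputs are disputed further down the thread.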
#However, rotating a tape through daily/weekly backups will kill a tape in 10-30 jobs
I obviously have no knowledge of how you are handling your tapes, but these numbers are nowhere near my or others' experience, nor the specifications for LTO. Frankly they seem absurd.
#$30 per tape vs $100-150 for the same capacity disk,
Since this is EMC related, I challenge you to get a quote from EMC for mid-range disk (VNX) that is even in the same ballpark as this number. That is SATA in a chassis with the necessary software and support.
Your calculations also assume using a very small tape library (80 slots) where the drive/slot ratio is quite high. For large libraries slot cost is down to maybe one tenth compared to a small library, and the number of tape drives needed per slot is many times lower. So your analysis does not hold for any medium to large business, where tape really shines. However, if you restrict your analysis to small time operations we are in agreement.
Also, your tape-only scenario is nowhere near any real world setup. Most of our backup jobs touch disk even if 90% of the capacity is on tape. So those numbers of yours for concurrent backup jobs seem far-fetched. I hear no one arguing that one should ONLY use tape.
Lastly I suspect that your design or your choice of backup software restricts you in using tape optimally.
So, illustrious thought leader, what do we do?
Here's where I struggle: if you have large data volumes (in the PB range for clarification) and there are low levels of duplication (I know this isn't typical), then what are the alternatives?
You can mitigate it over time with distributed models and the like, but I don't see that we're there yet. Until we migrate away from client/server and monolithic enterprise arrays using RAID back ends, we still need backups; they still need to be maintained and sent offsite, and at large scale tape is just cheaper at doing that.
If you have large volumes of data that don't de-dupe well, the best option I ever could come up with in my research was to start replicating with snapshots. Of course, that method isn't perfect either (see http://bit.ly/m5y1Cw). But at that scale there isn't much alternative. If you are using a more economical scale-out type of storage hardware to get to the PB range, then it may not be completely cost-prohibitive to buy 2 and replicate. All of the tape infrastructure (tapes, libraries, drives, backup s/w library licenses) will cost a pretty penny to back up a volume of data in the PB range. Throw longer-term backup retention with low levels of de-dupe into the mix, and tape will be considerably more cost effective from a CAPEX perspective.
When you only have a hammer!!
No surprise: margin and control are not great on OEM contracts. So when you only want to sell hammers, every problem looks like a nail.
Oi EMC, no!
DataDomain used the "Tape Sucks! Move On!" slogan for a while, just before EMC bought them out and DD had to ditch the "sandals'n'tofu" attitude to toe the EMC corporate line.
Tape is the ultimate back end since the 50's. Looks set to continue for the foreseeable. Everyone says it sucks, but tape just keeps rollin, crushing all arguments beneath its wheels.
Same old same old ....
I remember an EMC guy telling me in 2000 that tape is dead. And I asked what happens if your data center is destroyed... he paused, then he sheepishly said you need to recover from tape then...
In essence nothing has changed but (take note Jim59), there are other ways of getting offsite redundant copies of your data. Tape is no longer the only option. Co-location, 3rd party cloud storage and even removable disk like RDX. Not saying that any of those are good for every situation, but alternatives are opening up. Still, there are situations where tape is the only cost effective option.
Tape is alive and kicking
Tape is definitely not dead. Although depending on what your needs are, its value can vary.
I currently work for a small to midsize financial company of about 300 people. We are required to keep data for long retention periods. Our policy for monthly backups is to retain them for upwards of 20 years (where that number came from you will have to ask the lawyers).
Honestly, disk is just not an option. We do D2D2T to speed up the backups and improve reliability on the tape. We have a SAN for weekly and daily backups, and it then gets written to tape for long term storage. For long term retention, disk prices start to add up fast. As someone else said, when the cost of a SATA drive for a SAN is equal to the cost of a tape per GB, I may switch my opinion.
At a previous job though I worked with a consultant for very small businesses. We only used tape there for regular backup and then cloud storage for DR purposes.
Anyway, your mileage may vary, but tape is not going away anytime soon.
I've yet to hear of any large enterprise using anything but tape for ultimate back end. Disk vendors huff and puff, but so far they have proposed no alternative.
What's in a name...
We should believe when he says it sucks... his name's BJ after all.
Bandwidth for offsite is the issue...
For offsite storage, and given a relatively small backup set of say 15TB weekly:
Backup to tape: 36 hours (min verify) - 4 days (with full verify)
10 Tapes, to an offsite location down the M1: 4 hours
Total: 2-4 days
15TB over an OC3: 9 days (give or take)
Now that's not assuming compression and the like but... seems like tape is still winning to me. Plus I've always run into issues doing a verify cycle over WAN connections.
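The 9-day figure can be reproduced with a quick back-of-the-envelope sketch. It assumes the OC-3 line rate of 155.52 Mbit/s with no protocol overhead and no compression, and decimal terabytes; the helper name and the `efficiency` parameter are made up for the example.

```python
# Time to push a full 15TB copy over an OC-3 versus the tape-and-van route.
# transfer_days is an illustrative helper; efficiency=1.0 means raw line rate.

OC3_BITS_PER_SEC = 155.52e6   # OC-3 line rate; usable payload is somewhat lower
SECONDS_PER_DAY = 86_400

def transfer_days(data_bytes, link_bits_per_sec, efficiency=1.0):
    """Days needed to move data_bytes over the link at the given efficiency."""
    seconds = data_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / SECONDS_PER_DAY

full_copy_days = transfer_days(15e12, OC3_BITS_PER_SEC)

# Tape route from the list above: 36 hours to 4 days of backup, plus a 4 hour drive.
tape_min_days = (36 + 4) / 24
tape_max_days = (4 * 24 + 4) / 24

print(f"15TB over OC-3: {full_copy_days:.1f} days")
print(f"tape route: {tape_min_days:.1f} to {tape_max_days:.1f} days")
```

This only covers shipping the whole data set in one go; it says nothing about incremental or block-level replication.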
You are not moving 15TB of new files over the wire daily or even weekly. You are moving only the new blocks in changed files plus new files over the wire each day (as a continual background process).
With a 15TB dataset, and considering likely 10-12TB or less of that is valuable "do replicate" material required offsite, and an average 1-2% daily FILE level change, your delta for block replication will likely be about 0.5-1% of 15TB, or up to about 150GB. This is HEAVILY compressed using the best methods to streamline transmission (CPU is cheap compared to bandwidth, especially considering hardware compression cards that do it very fast, often in real time), so likely you're talking about a payload over the wire of less than 50GB a day.
Your RPO is 24 hours, not 4, unless you're already considering onsite backup storage, so getting yesterday's data offsite is a 24 hour window. If you're backing up DIRECT to offsite systems, you did it wrong. You back up local and (selectively) sync (some of) that data to an offsite repository at the block level. This does not even account for deduplication of that data in flight.
I have deployed D2D2D multi-point DR systems for 10TB data sets and had successful replication over as little as a T1, if not two of them. I've done 300TB data sets over OC12 easily, with headroom to spare. BTW, OC3 data transfer speed is 1.5TB per day, about 10x what you need to keep 15TB of data synced. (We recommend double the "need" to handle odd load sizes from busy days.)
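A rough sketch of that headroom calculation follows. It uses the raw line rates for OC-3 (155.52 Mbit/s) and T1 (1.544 Mbit/s), decimal units, and the 1% daily delta and 50GB compressed payload quoted above; real throughput will be lower once protocol overhead is counted, and all names are illustrative.

```python
# Daily link capacity versus the daily replication delta quoted above.
# daily_capacity_bytes is an illustrative helper working from raw line rates.

OC3_BITS_PER_SEC = 155.52e6
T1_BITS_PER_SEC = 1.544e6
SECONDS_PER_DAY = 86_400

def daily_capacity_bytes(link_bits_per_sec, efficiency=1.0):
    """Bytes a link can move in 24 hours at the given efficiency."""
    return link_bits_per_sec * efficiency * SECONDS_PER_DAY / 8

dataset_bytes = 15e12
raw_delta = dataset_bytes * 0.01   # upper end of the quoted 0.5-1% block delta
compressed_delta = 50e9            # "less than 50GB a day" after compression

oc3_per_day = daily_capacity_bytes(OC3_BITS_PER_SEC)
t1_per_day = daily_capacity_bytes(T1_BITS_PER_SEC)

headroom_raw = oc3_per_day / raw_delta
headroom_compressed = oc3_per_day / compressed_delta

print(f"OC-3 per day: {oc3_per_day / 1e12:.2f} TB, T1 per day: {t1_per_day / 1e9:.1f} GB")
print(f"OC-3 headroom: {headroom_raw:.1f}x raw delta, {headroom_compressed:.1f}x compressed")
```

At line rate an OC-3 moves a little under 1.7TB a day, so the 1.5TB figure quoted corresponds to roughly 90% usable throughput. A single T1 moves under 17GB a day, so running over a T1 depends on the compressed daily payload staying below that.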
Now that's something I'd like to see. The data density of a DVD or blu-ray, on a strip of film a couple of miles long that isn't magnetically sensitive any more. Oh yes please.