Imation will soon be making the industry's first terabyte-plus raw capacity tape, in the LTO-5 format, with delivery in early 2010. The Linear Tape Open (LTO) consortium has three technology-providing members: HP, IBM and Quantum. It defines tape formats against which the three members build drives and independent licensed …
"The format is LTO - Linear Tape Open - but if there is only one media manufacturer then the openness of the media vanishes"
Surely it's multiple manufacturers of the drives, and tape that could be manufactured by other media companies if it were profitable for them to do so, that defines Open?
Closed would be a patented tape format which could be manufactured only with the permission of the patent's owner. (Not including an "open" patent where the owner has agreed in perpetuity that anyone will be given permission to manufacture subject to paying the same pre-agreed royalty as anyone else - CDs used to, and for all I know still do, require a royalty payment to Philips).
I don't think...
It'll be the complete end of tape. Even with de-duplication.
With mainframes you have colossal VTS libraries, which currently have a small disk cache in front of a large back-end tape cache. Mainframe tape volumes are generally small and, yes, written frequently, but most have a short retention period. And if you look at how HSM has evolved on something like IBM's mainframe, the VTS as designed is very well suited to the OS.
If you move over to open systems and other platforms, this differs. With de-duplication, many savings can be made given a good de-dupe rate. But what if you have a library with, say, 30 LTO4 drives in it and your utilisation of the space on each tape isn't that great? You've got up to 30 hosts writing at 120MB/s = 3600MB/s sustained for potentially hours, yet the amount of data involved wouldn't by itself justify the arm/storage ratio that the disk array behind your virtual tape needs. You'd need hundreds of arms, even though many arrays are looking to use slow-ish SATA rather than something like 15k rpm FC disk. Then pile on replication of your tape/disk cache and you've got a disk array that's going to have a very intense read/write profile, albeit quite a sequential one.
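To put rough numbers on the paragraph above, here is a minimal sketch of the sizing arithmetic. The 30 drives and 120MB/s come from the comment; the per-disk figure of 15MB/s is a made-up placeholder for a SATA disk juggling many interleaved streams, not a measured value, and RAID and controller overheads are ignored.

```python
import math

def aggregate_mb_s(drives: int, per_drive_mb_s: float) -> float:
    """Total sustained write rate if every drive streams at once."""
    return drives * per_drive_mb_s

def spindles_needed(target_mb_s: float, per_disk_mb_s: float) -> int:
    """Disks needed to absorb target_mb_s, ignoring RAID/controller overhead."""
    return math.ceil(target_mb_s / per_disk_mb_s)

# Figures from the comment: 30 LTO-4 drives at 120 MB/s each.
total = aggregate_mb_s(30, 120)

# ASSUMPTION: 15 MB/s effective per disk under many concurrent streams.
disks = spindles_needed(total, 15)

print(f"{total} MB/s aggregate needs about {disks} disks")
```

Change the per-disk figure to suit your own array; the point is only that the spindle count is driven by throughput, not by how much data you hold.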
Yes if you exploit good disk technologies you can replicate your storage and back it up completely separate to the host, but not for all operating systems or all environments. And that sort of disk, software and automation is costly to implement.
Also you have to think of large backups with a long retention but a low frequency, like yearly multi-terabyte backups. You don't want to keep a 10TB save on a VTS when its read profile is near zero and next year's iteration has changed so much that de-duplication hardly claws back any savings. You want to farm that sort of thing out to tape, preferably a tape held remotely. A 10TB save could take up 15-18TB, maybe, within a 13-month cycle. Or about 8 tapes with good compaction.
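The tape count is easy to sanity-check. A minimal sketch, assuming the cartridges are LTO-4 (800GB native) and treating the compaction ratio as a free parameter; both are assumptions, since the comment does not say which generation or ratio it has in mind.

```python
import math

LTO4_NATIVE_GB = 800  # ASSUMPTION: LTO-4 cartridges, 800 GB native capacity

def tapes_needed(data_gb: float, compression_ratio: float,
                 native_gb: float = LTO4_NATIVE_GB) -> int:
    """Whole cartridges needed to hold data_gb at the given compaction ratio."""
    return math.ceil(data_gb / (native_gb * compression_ratio))

# 15-18 TB of retained data, at two hypothetical compaction ratios.
for ratio in (2.0, 2.5):
    low = tapes_needed(15_000, ratio)
    high = tapes_needed(18_000, ratio)
    print(f"{ratio}:1 compaction -> {low} to {high} tapes")
```

Under these assumptions, "about 8 tapes" needs compaction of roughly 2.5:1; at 2:1 the same data lands on 10 to 12 cartridges.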
Virtual tape on certain platforms still needs some work on performance when a library serves multiple hosts, given the storage/cache it requires to deliver the throughput, if you look at it as a replacement for large, heavily used tape libraries. That holds even if de-duplication is not in-line and runs as an after-the-fact process.
I will be keeping a keen eye on how vendors deal with tape's development, or whether they think that storage arrays will become so cheap yet perform just as well, and networking costs fall so far, that maybe.. just maybe tape could become redundant.
How do you enforce a WORM version of tape?
As for "CERN's Large Hadron Collider project will use tape to store the vast amounts of data it expects to generate" - I get the impression that by the time they start generating data, SSD storage will have progressed far enough for them to use that instead and teachers will be teaching tape in "History of Computer Science" classes alongside punched cards :-D
I think you're forgetting that there are other manufacturers who make LTO4, namely Tandberg, TDK and Sony that I know of, so it stands to reason that at least some of these should continue to make LTO5 as well.
Far from dead
I am sick of hearing these tape is dead stories. I have a couple of IBM TS3500 libraries both of which can in their current configuration store around 2PB of data, while consuming less than 1kW of power. You just cannot do that with disks.
Further to that, you cannot dedupe any 20TB of my backup to 2TB (we have around 400TB of disk, and over 100TB HSM'ed to tape). Where the heck does this rubbish come from?
How many times!
Disk backup is *not* a replacement for tape, unless you're prepared to replicate your disk arrays four times over and ship an additional set off-site every month marked 'Keep Forever'. I'm sure there are applications where the concept of being able to restore a copy from three days/months/years ago doesn't apply, but there are plenty where it does!
Even if you keep copies of all your old versions within some humungous database (in which case, I envy you your disk budget), the day will come when some catastrophe (maybe physical, but more likely a software/operational cock-up) will wipe everything and you'll be glad that there's a copy of your data with Iron Mountain (other tape storage suppliers are also available :)
Still a use for it
When LTO-3 drives came out, customers were warned about shoe-shine problems as HDs couldn't keep up with the speed of the tape/drives. That was back when LTO-3 was running at a mere 80MB/s. LTO-4 runs at 120MB/s and often came with a fibre channel interface as SCSI couldn't hack the pace. LTO-5 is projected to run at 180MB/s. The only things that will run at that sort of speed are expensive SSDs or expensive drive arrays. Are you really going to buy a shed load of those to back your servers up to?
Sure, the seek time of tape is much worse than an HD's, but once it starts reading or writing, it really is fast.
Did someone miss the bit
where it says manufacturers aren't signing up to the new format? When it stops being cost effective to sell drives/tapes then they will stop being made full-stop whether your individual shop uses them or not. THAT was the point of the article.
Re : How Many times
Well to simplify my comment then :
Yes, I think there will be multi-vendor LTO-5 support. Tape formats, the biggest being LTO and 3592 (imho), will be required to support de-staging large backups from disk VTLs to longer term storage.
Would also think there'll be LTO-6, as backup volumes really grow linearly or greater with the amount of addressable storage on hosts.. and well.. that's not going down.
But you did say 'Customers prefer the faster backup speed of disk-based backup and also the very much faster restore speed from hard drive arrays.' Well yes, we'd love it if we were only doing one backup/restore at any one given time, but then that defies the idea of a library. In reality it doesn't scale as tape libraries do. In fact, with the LTO roadmap giving 270MB/s capable drives, this only exacerbates the issues with disk-based VTL access. Tape is going to be around for a long time I think. SSD is too expensive, DASD is too expensive for the lot. People make bigger spindles but then host-attached disk usage just grows.
Viva la resistance!
"you'll be glad that there's a copy of your data with Iron Mountain (other tape storage suppliers are also available :)"
You ever managed to get anything OUT of IM before? I'm partly convinced their storage sites are just massive furnaces.
Tape is dead??
I've heard this one before. Any big company running mission critical systems has to off-site multi-generation backups for disaster reasons. Also many companies are subject to compliance regimes that dictate the retention of data for long periods - often years.
Now you can run a de-dup system with remote replication to allow for off-siting of your multi-generational backups without (hopefully) using too much disk space. However, de-dup won't work for archive and log type data (such as DB archive logs) as every block tends to be unique. Also, there is a penalty in removing all the redundancy in your backup - multiple, full backups to tape give you truly independent, multi-generational copies. With de-dup you are absolutely relying on the software and hardware not to screw it up. If it does, then it has the potential to render all generations of your backups unusable.
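A toy illustration of both points, for anyone who hasn't looked inside a de-dup system. This is a minimal sketch of block-level de-duplication by content hash, not how any particular product works: the `DedupStore` class, the fixed 4096-byte block size and the backup names are all invented for the example.

```python
import hashlib
import os

class DedupStore:
    """Toy content-addressed store: one physical copy per unique block."""

    def __init__(self):
        self.blocks = {}    # digest -> block bytes (the only physical copy)
        self.backups = {}   # backup name -> ordered list of digests

    def write(self, name, data, block_size=4096):
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # store only if unseen
            digests.append(digest)
        self.backups[name] = digests

    def read(self, name):
        # Raises KeyError if any referenced block has been lost.
        return b"".join(self.blocks[d] for d in self.backups[name])

store = DedupStore()
full = b"A" * 4096 + b"B" * 4096
store.write("monday", full)
store.write("tuesday", full)                  # identical full: no new blocks
print("blocks after two fulls:", len(store.blocks))

store.write("archive-log", os.urandom(8192))  # unique data: nothing shared
print("blocks after log data:", len(store.blocks))

# Lose one physical block and every generation that references it is gone.
lost = store.backups["monday"][0]
del store.blocks[lost]
for name in ("monday", "tuesday"):
    try:
        store.read(name)
    except KeyError:
        print(name, "is unrecoverable")
```

Two full backups to tape would have survived the loss of one of them; here both generations point at the same missing block.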
There's also the environmental issue. 10PB of tapes will only use power when being read/written to. Even if you can get a 10:1 reduction in space requirements (and you'll be lucky) then that 1PB of (RAID-protected) disk space is going to be chewing up perhaps 20kW with the controllers - if you are dealing with the off-siting issue through de-dup replication, then double that. MAID setups don't work too well as, with de-dup, you tend to be splattering data all over the array and can't really shut any of it down.
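For a feel of what that means over a year, here is a back-of-envelope sketch using only the figures in the paragraph above (10PB logical, 10:1 reduction, "perhaps 20kW" per array). It assumes the array runs flat out all year and ignores cooling, so treat the output as an order of magnitude, nothing more.

```python
HOURS_PER_YEAR = 24 * 365

def physical_pb(logical_pb: float, dedup_ratio: float) -> float:
    """Disk capacity needed after de-duplication."""
    return logical_pb / dedup_ratio

def annual_kwh(power_kw: float, sites: int = 1) -> float:
    """Energy drawn in a year by `sites` arrays running continuously."""
    return power_kw * sites * HOURS_PER_YEAR

disk_pb = physical_pb(10, 10)             # 10 PB of backups at 10:1
single_site = annual_kwh(20)              # one array at roughly 20 kW
replicated = annual_kwh(20, sites=2)      # plus the off-site replica

print(f"{disk_pb} PB of disk: {single_site} kWh/yr, {replicated} kWh/yr replicated")
```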
Now none of this is to decry using disks where it suits. If you have a small enough business, then a couple of USB-connected portable disk drives might do.
Not this argument again
Tape usage may be changing, but tape is not dead because:
Disk is slower, yes, slower than tape for backup, which is a surprise to many. Network is usually the bottleneck, however.
Disk cannot be sent offsite as many have commented.
Disk is more expensive to buy, run and store, and environmentally damaging.
Disks are a poor and expensive way to store infinitely retained data, which composes a fair proportion of most backup landscapes.
File restore times are not a priority in most enterprises. Backups are optimized for backup speed because restores are comparatively rare.
As Steve Jones noted, dedup has no redundancy. There are many logical copies of your data but only one physical copy. A serious RAID failure could put you out of business for good.
Replicated dedup can give wide area deduplication and may be the best solution in some cases where the above does not apply, but is still complex and inflexible.
Tape is dead, no really...
I can see tape being dead in small companies; it's dead at my house, for instance, as I use disk to back up and a removable drive that lives at work and gets replicated once a week. This is for a TB of data.
At work we've got multiple L5500s and multiple TS3500s (very expanded) in many, many sites. We have hundreds if not thousands of PB of data stored in these libraries. To replicate this with disk will require a massive investment in cooling and power, even if the drives intelligently shut themselves down when not in use.
As the datacentre is driven to reduce power usage, I would expect tape to have a resurgence, rather than die. The only question is - when does a cartridge get too large? For those who already dupe their backups for offsiting to somewhere like Iron Mountain this isn't an issue, but for big shops where offsites are in a tape library at another site, I wouldn't store 1.6TB of data on a single disk, so I would expect technologies like inline duplication (backing up the same stream to two tape drives at the same time) to come into common usage as a type of RAIDesque mirroring for tape.
Was just researching tape drives myself for my company and came across this from HP, maybe helps answer the WORM question..
The LTO-4 Ultrium and LTO-3 Ultrium tape drives feature the ability to archive and store data in a non-rewriteable format that meets the most stringent regulatory guidelines. Using a combination of integrated fail-safe features in the drive firmware, cartridge memory, and tape formatting, the tape drives can archive large amounts of data for periods of up to 30 years in a secure, untampered state. Since all LTO-4 Ultrium and LTO-3 Ultrium tape drives include support for both rewritable and WORM media, IT organizations can now easily integrate a secure, long-term archiving solution into their current data protection strategy. As compared to other technologies that feature support for WORM storage, the tape drives offer the advantages of[..]
At least one major use for this not mentioned
Taping all those Sky Digital programmes while you're running the brute force hardware decryption on their 2048 bit keys.
Not that I know anything about this.
DeDupe and Compliance
Anyone care to guess what is going to happen when the first case of compliance data being recovered from a deduped file hits the courts...so far the courts haven't ruled on regulatory data being rebuilt from a deduped file...but I can see the lawyer saying 'Is this an exact copy of the file that was given to you'. Tape dead? I don't think so.