Tape's greatest strengths are its low cost/GB, low power cost when offline, endurance and capacity. Its greatest weaknesses are its latency, the time taken to locate and mount a tape in a library and the time taken to locate the data you want on it. This combination of mount latency and streaming latency is like a red rag to the …
OK, I'm not remotely an expert on this, but I understood the point of El Reg's last "Tape ain't dead" piece to be not merely that it's cheap now but that it has loads of technological headroom for increased cheapness, whilst disk is approaching its limit. Today's "Tape is dead" piece seems to be comparing the two based only on current prices. Did I miss something?
Tape is dead.
You're starting to sound like my boss.
He's wrong BTW.
"You would need to be certain that data longevity and integrity on disk was the same or better than disk. And if it was then the conclusion would be inescapable:"
I'd be willing to bet that they're about the same.
I'm not convinced. Spun down disks can be pretty bad at not coming back online, simply due to the standard issue of moving parts breaking. You can probably design a more reliable drive with a design aim of stopping/starting many times, but that would probably increase cost and make it less attractive than tape.
RAID or equivalent could make the drive failures less of an issue, but if this is your only copy of the data, you'd want to be very sure you can cope with multiple drives not coming back after being spun down.
It'll likely take someone doing full testing on it and likely some real-world experience of such a solution, but the question is who is willing to trust their data to such a solution?
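To put very rough numbers on the RAID point above, here is a minimal sketch. It assumes each drive independently fails to come back after spin-down with the same probability, which real drives from one batch may well not do. The group size (12 drives) and the 2% failure chance are made-up illustrative figures, not measurements, and `prob_group_lost` is a hypothetical helper name.

```python
from math import comb


def prob_group_lost(n_drives: int, tolerated: int, p_fail: float) -> float:
    """Chance that more than `tolerated` of `n_drives` fail to spin up.

    Assumes independent failures with the same probability `p_fail`
    per drive (a simplifying assumption, not a measured property).
    """
    # Probability that the number of failed drives is within tolerance.
    p_survive = sum(
        comb(n_drives, k) * p_fail**k * (1 - p_fail) ** (n_drives - k)
        for k in range(tolerated + 1)
    )
    return 1.0 - p_survive


# Hypothetical figures: a 12-drive group, 2% chance a drive stays dead.
for tolerated in (1, 2):
    print(tolerated, prob_group_lost(12, tolerated, 0.02))
```

Tolerating a second failed drive cuts the loss probability sharply in this model, but the result is only as good as the assumed per-drive failure rate, which is exactly the figure nobody seems to have real-world data for.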
@John: Read again the sentence quoted. It amounts to "disk [is] same or better than disk".
But that sentence is indeed the elephant in the room of the article. Even assuming the price of disk storage ever gets as low as that of tape storage, disks have a long way to go before they are as reliable in the long term.
Disk is orders of magnitude LESS reliable than tape....
*IF* "data longevity and integrity on disk" is the same as tape... if it were, which it isn't. Tape media has a lifespan of up to 30 years (and as proven empirically well beyond that), and the integrity of data stored on tape is orders of magnitude better than that of disk. That translates directly into hours of operation before an error-- assuming storage per 100 devices,
* enterprise SAS & FC disk fails in 6.7 days
* enterprise tape (TS1140 technology, for example) fails in 15 years.
Disk has a way to go before it can come close to tape's reliability. And longevity.
And I thought...
That the Zuckerbergs etc. were looking at "poor quality" flash as a solution to this issue.
It's like harping on about audio tape when everyone has solid-state MP3 players. Typically you would keep two copies of all data on separate tapes or separate disks, and tapes will fail just as drives may fail to spin up. But there is no reason why you could not have a hard drive 'library' in the same way you have a tape library, auto-loading the hard drives as needed (so you would not need huge numbers of SAS/SATA ports on the host).
Disks have more moving parts than tape and tape has no electronics at all (with one or two exceptions), so tapes should be more durable. True, you do get tape failures, with the tape sometimes breaking, but as you say, you have at least two copies, as you would know if you worked with large backup/archive/DR systems, as some of us do. Tape is also more portable, and is probably more resilient to damage during transport. Removable disk, whilst possible, is probably more complex to manage. There have been attempts at disk libraries in the past, and I believe that the encasing of the disks and the connectors proved problematic.
Don't get me wrong, there is a place for data replication on disk, and systems like HPSS and TSM can use both disk and tape in a hierarchy to achieve high capacity as well as good speed for recently copied data. But I just don't see any serious manager responsible for large amounts of data drinking the disk Kool-Aid and ditching tape any time soon.
Heads vs. platters
I wonder if it would be feasible to separate the disk heads and platters. This would make the economics more similar to tape since the likelihood of platters failing is quite small. But my guess is that it would be way too difficult to keep out the dust.
Connecting many disks
I see the biggest problem as connecting the thousands of drives you need to be able to compete with large tape libraries.
Even if you use SAS with multiple levels of expander, the limiting factor will be the number of SAS adapters in your controlling system(s).
With the tape archive and backup systems I have seen and used, it is possible to have multi-petabyte storage libraries with thousands of tapes controlled by a couple of systems (more than one for resilience). I'm currently working on a system with three systems connected to a library that backs up ~10TB of data per night.
I also use a high-density disk farm, with ~4000 disks controlled by 20 systems via SAS, and it is the most troublesome component in the environment, but they are always on and are configured for speed rather than capacity (although they do that as well). I will say that we get more trouble when the disks spin down and back up, so I would worry that you could have problems with not knowing about drive problems until it's too late, even using RAID. You would probably want to spin the disks up on a regular basis to detect which fail, so that you can replace them.
Of course there are trade-offs. The speed at which you can restore the data depends on the number of tape drives you can access concurrently. For disk it will be greater, given multiple systems each with many disks, but that would be more expensive.
I wonder if there is any more mileage in Tandon's removable hard drive approach of having all the electronics, motors etc mounted in the computer and the platters, heads & actuator sealed inside removable disk packs. I used one of those beasts in '91, and I found them to be very robust and reliable, and they seemed to be comparable in speed to the hard drives of the time. I think they called them "Data Pacs" or something equally uninspiring.
Re: removable disk packs
Or go back to 1973 -- http://www-03.ibm.com/ibm/history/exhibits/storage/storage_3340.html -- The original Winchester drives. These were also quite reliable and comparable in speed with other drives.
Later, the Winchester name came to be associated with non-removable disks.
Re: removable disk packs
I had no idea the original "Winchester" drives packaged up the heads & platters into the cartridge. Thank you for pointing that out, and fair play IBM. :)
How is shingling going to help?
Yeah, so they can replace 4TB drives with 6TB drives. Big deal. Unless they cost the same (no payback for the development cost of shingling) there won't be that much difference in price. Even if they do cost the same it is only a 33% reduction in cost per GB. Nothing to sneeze at, but doesn't get you to the cost per GB of tape.
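For anyone who wants to check the 33% figure, a quick sketch of the arithmetic follows. The 4TB and 6TB capacities come from the comment above; the prices are placeholders chosen only so the two drives cost the same (or so the bigger one costs proportionally more), and `cost_per_gb_reduction` is a hypothetical helper name.

```python
def cost_per_gb_reduction(old_tb: float, new_tb: float,
                          old_price: float, new_price: float) -> float:
    """Fractional drop in cost per GB when moving from the old drive to the new."""
    old_cost_per_gb = old_price / (old_tb * 1000)
    new_cost_per_gb = new_price / (new_tb * 1000)
    return 1 - new_cost_per_gb / old_cost_per_gb


# Same (placeholder) price for both drives: capacity goes 4TB -> 6TB.
same_price = cost_per_gb_reduction(4, 6, 100.0, 100.0)
print(f"same price: {same_price:.1%}")

# If the 6TB drive costs 50% more, the cost per GB does not move at all.
pricier = cost_per_gb_reduction(4, 6, 100.0, 150.0)
print(f"50% dearer: {pricier:.1%}")
```

The reduction depends only on the capacity and price ratios, so the actual price level you plug in does not matter.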
Had hard drives maintained the fast pace of capacity increases they were on for about a decade until they stalled out several years ago, I think they would be really squeezing tape about now. I basically had been saying as much for several years. But given that capacity increases have stalled, and shingling is a one-time tweak that can't be squeezed for additional capacity increases, tape is still John Cleese's parrot. The roadmap for capacity increases in tape looks a lot better than the roadmaps I've seen for disk.
The reason is that SSDs have stolen the entire lucrative high-end hard drive market, so it now consists almost exclusively of commodity drives sold based on cost per GB. With much less profit to put towards R&D, the roadmap for hard drives doesn't look all that bright. Who knows, maybe in a decade hard drives will be in danger of disappearing in favor of solid state storage and tape will end up outliving rotating media instead of replacing it?
I'd missed this apparently, so here is a quick description of shingling...
Shingling writes a track on the hard disk so it largely overlaps the "previous" track. This allows a significant capacity gain. As a consequence though, it is essentially designed assuming you start writing at block 0 and write sequentially to the end. If random rewrites are even possible, it'd be by leaving occasional "gaps" where tracks do not overlap, and then having to rewrite the whole group of tracks between gaps -- i.e. very slow. Reads would be no problem.
It would therefore lower the cost of disks that are just having backups written out to them anyway, while keeping hard-disk read speeds.
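The write penalty described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ShingledBand` is a hypothetical class, a "band" is the group of overlapping tracks between two gaps, and a rewrite is assumed to cost one write for the changed track plus one for every later track in the band (a slightly kinder variant of rewriting the whole group). Real drive firmware may behave differently.

```python
from dataclasses import dataclass, field


@dataclass
class ShingledBand:
    """Toy model of one group of overlapping tracks between two gaps."""
    n_tracks: int
    tracks: list = field(default_factory=list)
    track_writes: int = 0  # physical track writes performed so far

    def append(self, data: str) -> None:
        """Sequential write: costs a single physical track write."""
        if len(self.tracks) >= self.n_tracks:
            raise ValueError("band is full")
        self.tracks.append(data)
        self.track_writes += 1

    def rewrite(self, index: int, data: str) -> None:
        """In-place rewrite: the new track overlaps its successor, so every
        later track in the band is read back and written out again."""
        if not 0 <= index < len(self.tracks):
            raise IndexError("no such track")
        tail = self.tracks[index + 1:]      # read back the tracks about to be clobbered
        self.tracks[index] = data           # write the changed track
        self.tracks[index + 1:] = tail      # write the later tracks again
        self.track_writes += 1 + len(tail)


band = ShingledBand(n_tracks=8)
for i in range(8):
    band.append(f"backup-{i}")
sequential_writes = band.track_writes

band.rewrite(0, "changed")
rewrite_cost = band.track_writes - sequential_writes
print(sequential_writes, rewrite_cost)
```

In this model, changing the first track of a full band costs as many track writes as filling the band sequentially did, while reads never touch the counter, which matches the "slow random rewrites, normal reads" picture.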
Several good discussion points have been raised in this thread; reliability, latency, longevity etc. all of which are important considerations.
There is one aspect that has not been covered that is equally important: the interaction of people and media. One of the basic principles that make tape cost effective is the ratio of media pieces to read/write stations or tape drives. The higher the ratio, the more cost effective tape becomes. However, this is just exploiting one attribute of a dual set of attributes: removability and portability. Removability is good, since it enables the overall cost to be lowered; portability is bad because it introduces the protein robot or the protein-based automation platform, the human.
When we humans touch things we open the door to errors. These errors are numerous in nature and have the real potential to make the concerns about any one of the technologies referenced in this article seem quite trivial in comparison. When the attribute of portability, the removal of media from the automated tape library, is exploited, we invite the chance for error because the data/information has been removed from the control of the system; we are entrusting the data to the human process, and if history teaches us anything, this will lead to serious, costly and inevitable problems. Remember this: http://money.cnn.com/2005/06/06/news/fortune500/security_citigroup/. And it doesn’t just apply to tape; the attribute of portability applies to USB drives as well: http://www.thestar.com/news/gta/2013/10/07/health_information_of_18000_people_stolen_in_peel_region.html. Lastly, in this story you will note that encryption was referenced as a guideline that was not followed in the case of the USB drive. The encryption of data on portable devices does not solve the root cause; it may prevent unauthorized access to the data, but the data is still lost to the entity that put it on the portable media to begin with. Fix the root cause.
In order for us to manage the Peta/Exa/Zetta bytes of data heading our way we must keep it all under the control of the system or we have no chance of success. Cold storage based on HDD spin down that enables density and lower running cost metrics, offers hope for management under policy-based systems control because it removes the opportunity for the protein robot to inject an error prone process.
"In order for us to manage the Peta/Exa/Zetta bytes of data heading our way we must keep it all under the control of the system or we have no chance of success. Cold storage based on HDD spin down that enables density and lower running cost metrics, offers hope for management under policy-based systems control because it removes the opportunity for the protein robot to inject an error prone process."
I don't want to be a killjoy, but you can accomplish the same goal (of removing people from the equation) by keeping the doors on your tape libraries firmly locked... :)