Watching the SpectraLogic announcements from afar and getting involved in a conversation about tape on Twitter has really brought home the ambivalent relationship I have with tape; it is a huge part of my professional life but if it could be expunged from my environment, I’d be more than happy. Ragging on the tape vendors does …
Microsoft's Data Protection Manager combines the speedy access of disk with the long-term retention of tape. But the overheads are a nightmare. Reckon on storage x 2, just for backups (depending on retention periods). Also, even though it's quite a simple product, we tend to find the management effort it takes is pretty high - things fail frequently and are a nightmare to resolve; we've had more calls with Microsoft regarding this product than all others combined.
Why tape is still here
It is because:
- the CIO doesn't have a clue and only wants its butts protected, so tells IT to "keep everything, forever"
- same goes for the business: "keep everything, forever" and I'm protected
- at the end, the whole corporation ladder is on the same page
Then it's all down to the poor storage admin to "keep everything, forever" and to do so with a severely limited budget; all the aforementioned functions superbly ignore the link between that budget and their immature, anal requirements.
Then, the poor storage bod has only tape solutions, and yes, the SW sucks, it's slow to restore, and it may fail (STK 9840A/B/C/D, I'm looking at you!).
Lesson of the day: if the position is storage admin, make sure you get a *really* good pay check !
Re: Why tape is still here
"the CIO doesn't have a clue and only wants its butts protected, so tells IT to "keep everything, forever""
It's time we started expunging these people through clever use of the DPA. Simply get some details placed in a database, let it enter the "archive" then request it's removed and voila your idiot boss is in the dole queue or prison for crimes against privacy :)
Re: Why tape is still here
"It's time we started expunging these people through clever use of the DPA."
You're forgetting Enron! Remember part of the case against Enron was its failure to destroy records - obviously without these records the prosecution would have struggled to build a case...
Re: Why tape is still here
The reasons why tape wins for large data / long-term backups are fairly straightforward and have been well discussed on this site. Data size, data laws, money and Moore's law all ensure that tape stays where it is for the foreseeable future. Some have argued that spinning disk threatens tape, and some say spinners themselves will be ousted by SSD. But SSD replacing tape? Seriously?
Tape can be very reliable
I grew up with a C64 with a Datasette (tape drive) and the well known 1541 diskdrive. I had plenty of my stuff on tape because there were several friends who didn't have a 1541 just yet, so we also swapped tapes.
A few months ago I set up my C64 again just for nostalgia's sake and guess what? Both the tapes and the disks worked like a charm. I didn't even clean the heads of either device.
So yeah, in my opinion there is most definitely a sense of reliability when it comes to magnetic storage.
I agree with "lansalot" in that DPM is a good solution. However, I disagree about the cost and problems. I might be a little biased as we got the software for free as part of our licensing agreement, but after the initial outlay of a storage array and configuration I've found the software a doddle to use and management is minimal - certainly no calls to MS needed for me.
For me you nailed it with "People talk about tape as being the best possible medium for cold storage and that is true – as long as you never want to thaw large quantities quickly." Essentially you are saying "have the most appropriate system in place where possible". In larger organisations it's probably more difficult, but for us, with only around 5TB of stuff to archive each night long term, it's not too costly.
For example, we have our backup solution tiered across different storage mediums (and importantly separate from our HA solution, which people seem to think is the same thing). Every night we make a tape backup of everything, which then gets sent to our other office (who periodically test restores, I will add). We also keep the same info on disk for 30 days in snapshots locally, and lastly we send the data to another location via rsync, where it is stored for 1 year on a disk. I have had to restore from each of these, and if it's old data they are after, their need tends to be far less urgent and often less precise.
In the future, unless SSDs are going to cost me the same as a similar-sized tape, tape will live on with us for a while. (For reference, we use 2 * HP LTO-5 Ultrium RW per night, with compression; it costs us around £45 a night.)
Longevity of SSD as a medium
Has anybody published figures about how long an SSD will hold data in a cold state? I get the impression that most rely on the wear leveling capability of the drives to re-compute damaged data from checksum information during the re-write of the data. And this requires the drive to be powered.
I'm really not too confident about putting a flash-memory (or any other static electronic) device on the shelf, coming back to it in a few years' time and expecting the data to still be there. I would be confident with a tape.
Memory technologies are moving along so fast that there is no chance to time-test any of them before they are obsolete. And accelerated environment testing that vendors claim to have done is not really a good indication of the data retention capability of a medium.
But then I would probably recommend carving the data on granite tablets if you want it to last millennia.
Re: Longevity of SSD as a medium
I came across a manual for Seagate Pulsars a couple of years ago which had some insight into this:
"As NAND Flash devices age with use, the capability of the media to retain a programmed value begins to deteriorate. This deterioration is affected by the number of times a particular memory cell is programmed and subsequently erased. When a device is new, it has a powered off data retention capability of up to ten years. With use the retention capability of the device is reduced. Temperature also has an effect on how long a Flash component can retain its programmed value with power removed. At high temperature the retention capabilities of the device are reduced. Data retention is not an issue with power applied to the SSD. The SSD drive contains firmware and hardware features that can monitor and refresh memory cells when power is applied."
Seagate rates their Pulsar to retain data for up to one year without power at a temperature of 25 C (77 F).
vs LTO (at the time) rated for 15-30 years.
Re: Longevity of SSD as a medium
"LTO (at the time) rated for 15-30 years."
Sorry, but this is an invalid comparison. An SSD is rated for at least a million writes per bit. The LTO is rated for 260 (yes, less than three hundred) full passes. If you only write to the SSD 260 times over the course of 15-30 years, it will likely not exhibit the "wear-out" phenomenon.
Re: Longevity of SSD as a medium
And how long will those bits remain on the SSD if I put it on a shelf?
That's the issue, because a backup you can't recover from is a waste of time, money and probably the end of the company that used it. That's why tape is still liked a lot.
Re: Longevity of SSD as a medium
flash retention rates depend not only on erase-based wear of cells, but also on crosstalk-like degradation from operations on nearby cells (even reads). in principle, if you wrote data once to flash (archival, like most tape uses), it would last on the order of 10 years. documentation of this seems fairly sparse, though, probably because that's not the main market. (flash all uses quite powerful ECC, which is fundamentally different from checksums...)
many people would not share your confidence of the retention rate for tape. it could be that we've all been warped by horrible performance of old generations of tape, but then again, that was always the explanation. (verify-after-write was a game-changing tape technology, for instance.)
Re: Longevity of SSD as a medium
hmm, flash is rated for much less than a million writes per bit (3k for common MLC, for instance). of course, ssd virtualizes that and covers the early failures using spare blocks. but it's completely mistaken to think that you can write an ssd a million times (fully, with uncompressible/non-dupe data).
"A simple trawl could send a tape-robot into melt down."
Only if you use simpleminded approaches to searches.
Indexing metadata is absolutely key. I can tell you what is on my tapes, WHERE it is on my tapes. WHEN it was recorded and what the SHA256 checksum is. No trawling needed to pull up XYZ file.
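To make that concrete, here is a minimal sketch of the sort of catalogue I mean, written in Python. The class names, the tape label scheme and the block-offset field are all made up for illustration; a real backup product keeps its own (much richer) database.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class CatalogueEntry:
    path: str              # WHAT: original file path
    tape_label: str        # WHERE: which cartridge (hypothetical label scheme)
    position: int          # WHERE: block offset on that cartridge (assumed unit)
    recorded_at: datetime  # WHEN it was written
    sha256: str            # checksum of the file contents


class TapeCatalogue:
    """In-memory index; a real one would live in a database, off the tapes."""

    def __init__(self) -> None:
        self._by_path: Dict[str, CatalogueEntry] = {}

    def record(self, path: str, data: bytes, tape_label: str, position: int,
               recorded_at: Optional[datetime] = None) -> CatalogueEntry:
        entry = CatalogueEntry(
            path=path,
            tape_label=tape_label,
            position=position,
            recorded_at=recorded_at or datetime.now(timezone.utc),
            sha256=hashlib.sha256(data).hexdigest(),
        )
        self._by_path[path] = entry
        return entry

    def locate(self, path: str) -> Optional[CatalogueEntry]:
        # A dictionary lookup, so no tape is mounted just to find a file.
        return self._by_path.get(path)

    def verify(self, path: str, restored: bytes) -> bool:
        # Compare restored bytes against the checksum taken at write time.
        return hashlib.sha256(restored).hexdigest() == self._by_path[path].sha256
```

The point is that `locate` answers "which tape, which offset" from the index alone, and `verify` tells you whether what came back is what went in.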
As for SSD: The currently quoted longevity figure is a 3 year shelf life. Drives flag themselves as "bad" when they estimate that the data onboard will not be recoverable if left switched off for 12 months. (This is LONG before the drive becomes completely unusable.)
Retaining business records for a long time is a legal requirement in most jurisdictions and your DPA request may well be trumped by Sarbanes-Oxley requirements, Inland Revenue, etc etc etc.
Re: "A simple trawl could send a tape-robot into melt down."
I think the "trawling" refers to the fact that Google is in a particular situation where tape is not suitable. Google is in an industry where data essentially has an INFINITE shelf life and NEVER goes stale: someone could request ANYTHING...even data from 15 years ago...on a moment's notice. Plus, due to the way they work, they could end up having to gather data from who knows how many different locations and must do it tootsweet. For Google, everyone REALLY WANTS everything...YESTERDAY. Their business depends on it.
Retrieving 1 entry from a single tape may just be annoying, but (even WITH an index) imagine the stress involved when the robot has to change bunches of tapes just to build up 100 links from nearly as many tapes? Like I said, though, this is particular to Google's line of work.
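A rough sketch of why the mounts dominate, in Python. It groups the requested files by cartridge so each tape is mounted once and read in position order, then adds up a time estimate. The function names are mine, and the mount and per-file timings are parameters you would have to measure on your own library; nothing here is a vendor figure.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def plan_recalls(requests: Iterable[Tuple[str, str, int]]) -> Dict[str, List[str]]:
    """Group (name, tape_label, position) requests by tape.

    Each tape's files are returned in ascending position order, so the
    drive reads forward through the cartridge instead of shuttling.
    """
    by_tape = defaultdict(list)
    for name, tape_label, position in requests:
        by_tape[tape_label].append((position, name))
    return {tape: [name for _, name in sorted(items)]
            for tape, items in by_tape.items()}


def estimate_seconds(plan: Dict[str, List[str]],
                     mount_s: float, per_file_s: float) -> float:
    """One mount per tape plus a fixed (assumed) cost per file read."""
    return sum(mount_s + per_file_s * len(files) for files in plan.values())
```

With 100 files on 100 different tapes, the estimate is 100 mounts however good the index is; with the same 100 files on one tape it is a single mount. That is the difference between the archive use case and the Google one.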
Why trawl through tape??
Trawling through tape is the last resort, absolutely last, as in silly. And how "cold" is that cold storage? Is the data going to be accessed once a month, once every six months, once a year, or is it supposed to be there, basically entombed?
It sounds like what you're after is a high-density jukebox using 60GB UDO WORM media. Index that, and then let the arm grab the disc when you need it.
Bring back DECtape! gr
Tape for Longterm Storage
The case for tape keeps being pushed into bigger systems. It once made sense to back up a workstation on tape (now: hardly anyone does that), then servers (even now: a lot of people don't do that), then big data archive situations (where tape is clearly superior if you don't have to go back to them too often and can tolerate access delays).
I personally have been burned too many times by "Oh, we don't have a tape drive that can read those any more" to use them for longterm storage. Yes the disks cost 2x/byte but the disks include the drive and you can generally interface a disk drive to a machine for a long long time. Tape drives tend to be fiddly to maintain.
A data center can buy many tape drives, keep spares, have a tape drive repair person(s), maintain storage with optimum tape conditions.
But most of us can rarely do any of that. Even major universities are hard pressed to read some of the tapes from 20 years ago even assuming the data is still good after a researcher stored it in their attic since 2002.
Weirdly enough the cloud is likely to bring a tape renaissance because clouds are usually in large data centers where tape makes more sense than anywhere else. The great cycle of computing continues.
Re: Tape for Longterm Storage
Are you sure about "you can generally interface a disk drive to a machine for a long long time"? Have you tried to connect a SCSI disk (anything earlier than UltraSCSI, but try a SCSI-1 SE disk for a real challenge) to a modern server? And older interfaces like ESDI, ST506, MASSBUS or ESMD are long dead.
Even more modern technologies like IBM SSA are now dead. IDE and EIDE interfaces no longer appear on modern motherboards. Even when older HBAs are still available, they will be PCI adapters, and these are being eliminated in newer systems.
I believe that the SAS technologies are expected to be N+2 compatible, i.e. a first generation SAS disk will work with a SAS 3 adapter, but there is no guarantee that they will work with later adapters. Given the speed of evolution of such things, I expect disks to remain portable to current machines for 5 or so years after manufacture, and after that, you will have to rely on legacy hardware to be able to read them. Does not encourage me to use disk as a long-term archive medium.
This is partly by design, as the disk and system manufacturers want to continue selling systems, and they have built-in obsolescence.
My personal thought is that you would have more success reading a 2400' 1/2" NRZI mag. tape at 800bpi recorded using ar or tar from 30 years ago than you would an early SCSI disk, especially if it came from a Netware, VAX, PrimeOS or other proprietary OS.
SSD vs Tape
For BACKUP Archive, only restored when the SSD or HDD fails, tape wins.
SSD: powered-off data retention capability of up to ten years.
With a suitable environment tape might last a minimum of 15 years and up to 80.
"up to" is a weasel phrase.
Good luck though finding a working drive in 50 years ...
I gave away my 8" floppy drive to a CP/M enthusiast this year. And the disks :)
SSD is NOT Cold Storage.
An SSD can never replace a tape due to the fundamental physics of the flash devices that SSDs are made of. The charge in the cells used to define a 0 or 1 degrades over time. For example, for some flash devices the powered-off retention time is a month. Therefore the SSD/flash device will need to be powered on all the time. In addition, the charge can be disturbed, so the data must be scrubbed to prevent bit rot. So in summary, for cold storage, back-up-everything-forever, tape or disk are it, with tape winning on cost for a long long time.
The premise of the article is codswallop: tape is great as long as the use cases for recovery permit enough time to accomplish the recovery.
So it's great for true disaster recovery, and it's great for "archival snapshots" (to show the rozzers, f'rinstance).
It's also great for the "we probably will never need this data, but if we do, it will be priceless".
An example of that can be found here: http://www.collectspace.com/news/news-111408a.html
They could either recover the data, OR go back to the moon. And since the use for the data was, err, going back to the moon, the value of the data that was successfully recovered from those tapes is simply incalculable.
I mean, those tapes contained the original Earthrise.
External USB disks
We had this discussion last year, in the context of the cost of a petabyte storage system. This is a slight update to adjust the costs (cheaper disks) and to compare with tape.
An LTO-6 stores 2.5 TB (raw) and costs about $50, or $20/TB. A 4TB external HD, USB 3.0, costs about $150, or $37/TB. The bytes per cubic centimeter are about the same, and the HD cost continues to drop. The potential for compression is much better for disk than for tape, but I choose to ignore this because any compression scheme adds complexity that may prevent the data from being recoverable 20 years from now.
I can build an archival storage system with 8 computers each supporting 32 of these drives, with switchable power for each drive. The total cost for the non-disk portion of this system is about $4000, so the system-level cost per petabyte is about $41,000.
This is basically a stack of 256 disk drives that are almost all powered off almost all of the time. Any given file can be accessed by turning the drive on and waiting for it to spin up, so access is about 5 seconds. In a backup/recovery system, you treat each drive more or less like an LTO, so you power up one drive each day and write to it for an hour or so (assuming you have 4TB/day to back up.) Just as with tape, you may choose to back up to two drives at twice the overall cost.
Disk lifetime is driven primarily by the amount of time the disk is powered up, so data retention in this system should be very long.
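For anyone who wants to rerun the sums with their own prices, the arithmetic above can be sketched in a few lines of Python. The function names are mine; the figures are the ones quoted in this post, and I take 1 PB as 1000 TB.

```python
def cost_per_tb(unit_cost: float, capacity_tb: float) -> float:
    """Media cost per terabyte for a single cartridge or drive."""
    return unit_cost / capacity_tb


def system_cost_per_pb(hosts: int, drives_per_host: int, drive_cost: float,
                       drive_tb: float, non_disk_cost: float) -> float:
    """Total system cost divided by total capacity in petabytes (1 PB = 1000 TB)."""
    drives = hosts * drives_per_host
    total_cost = drives * drive_cost + non_disk_cost
    capacity_pb = drives * drive_tb / 1000
    return total_cost / capacity_pb


lto6 = cost_per_tb(50, 2.5)       # LTO-6 cartridge: $50 for 2.5 TB raw
usb_hd = cost_per_tb(150, 4)      # 4TB external USB 3.0 drive at $150
system = system_cost_per_pb(hosts=8, drives_per_host=32, drive_cost=150,
                            drive_tb=4, non_disk_cost=4000)
print(f"LTO-6 ${lto6:.2f}/TB, USB HD ${usb_hd:.2f}/TB, system ${system:,.0f}/PB")
```

That is 256 drives, 1024 TB and $42,400 all in, which works out to a little over $41,000 per petabyte. Doubling up the drives for a second copy doubles the disk part of the bill but not the $4000 of hosts and power switching.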
What's old is new
Optically marking the edge of tape for moving quickly to the data location (shades of optical sound-on-film!) could save this technology from solid-state oblivion; it might even work for tape still on the reel.
Depends on how you use the tape. Tape in streaming mode is pretty hard to beat as there is no latency.
Is it the right time to question the utility of backing up the twaddle people exchange on Twitface without regard to worth? How many "Me 2"s are you willing to spend hours pulling back from the great crash in the sky before admitting that it is a waste of valuable electricity?
Plenty of data that doesn't need to be accessed...but has to be kept
Most enterprise data isn't accessed after it's created, but a lot of it still has to be kept. That's where tape is ideal. Plus, there are plenty of disk-based LTFS products in the market that help reduce tape latencies and make management easier. At the lowest cost/GB for long-term storage...tape isn't going away anytime soon, especially in markets like media and entertainment, HPC and research where files span GBs and single projects can consume up to a PB.
tape-ism is a worldview. for instance, many people will say that it's not a real backup or archive if it's not offline (usually their justification is that mistake or malice can more easily kill an online "backup".) if you rarely recover from archive, that colors your expectations as well: you are rarely exercising the tape, so may have an unrealistic estimate of the actual, silent failure rate. obviously if you more frequently recover from archive, you'll be pained by tape's latency (probably offsite, but even libraries are slow relative to disk seeks.)
in reality, people who take tape seriously write two copies. once you plug that in - the price, the data rate, the space, and factor in environment-controlled storage, offsite of course, and the fact that tape drives are expensive and don't last very long, and normally need a separate spooling facility. wow, costs do pile up.
it can probably still work well for very large, very sparsely-accessed storage. most people don't bite, though, and online, spinning storage for backup and archive really is the norm. simply being able to verify all your data is a powerful argument.