According to published plans, the all-conquering LTO tape format has no future after LTO 6, which is expected in 2012. This is ridiculous and there must be a secret roadmap for LTOs 7 and 8. The LTO tape format dominates the open systems tape business, is used increasingly by supercomputer customers and is making some inroads …
It's a simple case of economics. The likes of NASA, yourself et al. can rant and rave all they want, but if, in the future, the makers see no profit or even a loss in the business, then why should they spend millions of dollars developing the system?
People often forget businesses are there to make money for themselves and their shareholders, not to act as a charity to hard-done-by multi-billion-dollar science projects.
Dedup and HPC not useful.
Deduplication and HPC/science are not a useful fit. You might have a lot of redundancy in commerce, but there is no reason to imagine that there is any true duplication of data coming from such sources as remote sensing, the LHC, or other large data sources. It is simply the nature of the beast.
Some HPC-related data is the result of deterministic simulation runs, and there comes a point where it is cheaper to re-create the data on demand, on a faster machine, than to store it. But that gap is probably about ten years, so it makes little useful dent in the storage needs.
The nature of science data also requires different access patterns. The big reason you need to store science data is in order to mine it, and often the science is in the application of new search and mining methods to old data. Once you get past the value in the most recent data, the entire data set may see a constant buzz of access. However such access is often very predictable, allowing forward fetching of tapes and caching onto disk. So again, different needs to commerce.
Of course there is no roadmap after 2012
The absence of an LTO roadmap after 2012 just shows us that the LTO user community knows perfectly well that the universe will in fact end in 2012. After all, they are NASA, CERN, and other scientists. They obviously know something that we don't.
Not a charity
So there are a handful of organisations that use supercomputers that generate volumes of data which are logistically difficult to deal with on the prospective LTO6, and the question is what are they meant to do if the LTO roadmap stops at LTO6? Well, in a breaking piece of news, and a real shocker, I hear that the storage supplier market isn't driven by charity designed to meet every requirement. It is, believe it or not, driven by businesses that will seek a return on their investment. There may well be a handful of organisations in need of tape backup solutions which scale into the tens of PB without taking up warehouses full of cartridges. However, such a handful of organisations does not necessarily make up a viable market.
To put (say) a requirement for a gross 100PB into perspective (allowing for redundancy like multiple copies, disaster copies and so on): with 3TB LTO6 cartridges, that's about 33,000 cartridges. Which is a lot, but there are systems out there where individual tape libraries scale to approaching 10,000 cartridges. Most of those 33,000 cartridges will be duplicated and held offline somewhere for archive and safe keeping. It's a bit of a logistical problem of course - archive copies have to be regularly refreshed as there simply isn't a long-term archive storage medium that can be trusted.
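For anyone who wants to redo that sum with their own figures, here is a minimal sketch. It assumes decimal units (1PB = 1000TB), the 3TB per cartridge figure quoted above, and a round 10,000 slots per library; the function names are just illustrative.

```python
PB = 10**15  # decimal petabyte in bytes (assumption: vendors quote decimal units)
TB = 10**12  # decimal terabyte in bytes


def ceil_div(a, b):
    """Integer division rounded up, so a partly filled cartridge still counts."""
    return -(-a // b)


def cartridges_needed(total_bytes, cartridge_bytes):
    """Number of cartridges required to hold total_bytes."""
    return ceil_div(total_bytes, cartridge_bytes)


def libraries_needed(cartridges, slots_per_library):
    """Number of libraries required, given a per-library slot count."""
    return ceil_div(cartridges, slots_per_library)


if __name__ == "__main__":
    # Figures from the comment: 100PB gross on 3TB cartridges, ~10,000 slots per library.
    carts = cartridges_needed(100 * PB, 3 * TB)
    libs = libraries_needed(carts, 10_000)
    print(f"{carts} cartridges across {libs} libraries")
```

With those inputs it comes out at a little over 33,000 cartridges, so four libraries of that size for a single copy before any offline duplicates.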
Of course it's very unlikely that data is processed direct from tape, apart from the initial selection. The days when data was sequentially processed direct from tape are probably gone for the vast majority of organisations - the data has to be brought online to disk first. There is the real problem - as many people struggling with very large data volumes can tell you, it's the transfer of data to and from tape that is often the biggest bottleneck. The throughput of an LTO4 tape is such that it considerably exceeds the throughput capability of any single disk (apart from SSDs). It even stretches the capability of many RAID controllers. Start running several LTO4s flat out at the same time, and it doesn't take many of those to saturate many medium-sized storage arrays. Of course the supercomputer user will have all sorts of highly specialised parallelised and distributed storage systems over their cluster to handle the huge I/O rates. However, that's not a route open to most companies and consequently the market for huge tape drives with many hundreds of MBps each is very limited.
The problem with tape is often not the basic capacity, or even the sequential throughput, it's that issue of moving data between nearline and online storage media. Hence LTO is concentrating on capacity and not throughput.
One more calculation - 100PB per year is an average of about 3GB per second. If we assume that all data is accessed 10 times per year, that's 30GB per second or about 120 LTO6 drives. Double that up for 50% util and we get about 250 drives. If you are an organisation that has this much data, then 250 drives and libraries isn't an impossibly large bill - but coming up with the supercomputer that could deal with the implied peak bandwidth of > 120GBps (assuming moving between disk and tape) is going to be some engineering challenge.
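The same back-of-envelope sum as a short sketch, in case anyone wants to plug in different numbers. The 250MB/s per-drive rate is an assumption (it is the rate implied by 30GB per second needing about 120 drives), as are the decimal units and the function names.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds, ignoring leap years


def sustained_rate_gb_per_s(petabytes_per_year):
    """Average rate in GB/s needed to move the given volume in one year."""
    return petabytes_per_year * 1_000_000 / SECONDS_PER_YEAR  # 1PB = 1,000,000GB


def drives_needed(rate_gb_per_s, drive_mb_per_s, utilisation):
    """Drives required to sustain the rate at a given fractional utilisation."""
    return math.ceil(rate_gb_per_s * 1000 / (drive_mb_per_s * utilisation))


if __name__ == "__main__":
    ingest = sustained_rate_gb_per_s(100)   # 100PB written per year
    access = ingest * 10                    # every byte touched 10 times per year
    print(f"ingest {ingest:.2f} GB/s, access {access:.1f} GB/s")
    print("drives flat out:", drives_needed(access, 250, 1.0))
    print("drives at 50% utilisation:", drives_needed(access, 250, 0.5))
```

Without the rounding to 3GB per second this gives 127 drives flat out and 254 at 50% utilisation, which is in line with the "about 120" and "about 250" above.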
I can see a problem
If capacity doubles (and needs to double) but speed only increases by 20% at each iteration it seems fairly clear that the solution is unworkable in the long term.
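To put rough numbers on that, a small sketch of how long one end-to-end pass of a cartridge takes if capacity doubles and speed rises 20% per generation. The starting point of 800GB at 120MB/s (LTO4 native figures) and the assumption that the same factors apply to every later generation are mine, purely for illustration.

```python
def full_pass_hours(capacity_gb, rate_mb_per_s):
    """Hours to stream a whole cartridge once at the given sustained rate."""
    return capacity_gb * 1000 / rate_mb_per_s / 3600


def project_generations(capacity_gb, rate_mb_per_s, generations,
                        capacity_factor=2.0, speed_factor=1.2):
    """Return (generation, capacity_gb, rate_mb_per_s, hours) for each step.

    capacity_factor and speed_factor are the per-generation multipliers
    from the comment (doubling capacity, 20% more speed).
    """
    rows = []
    for gen in range(generations + 1):
        rows.append((gen, capacity_gb, rate_mb_per_s,
                     full_pass_hours(capacity_gb, rate_mb_per_s)))
        capacity_gb *= capacity_factor
        rate_mb_per_s *= speed_factor
    return rows


if __name__ == "__main__":
    for gen, cap, rate, hours in project_generations(800, 120, 4):
        print(f"+{gen} gen: {cap:8.0f} GB at {rate:6.1f} MB/s -> {hours:5.1f} h per pass")
```

Each generation takes 2/1.2, or about 1.67 times, as long to fill as the one before, so four generations on a full pass takes nearly eight times as long.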
The problem is already there with disk - if you double the bit density in both directions (along the track and across the tracks) that quadruples the capacity of a given area of disk. However, if you use the same form factor and spin speed then sequential access rate only doubles.
In consequence, one of today's 1TB disk drives takes several hours to read all the data from end-to-end. Ten years ago that operation could be done in, perhaps, 30 minutes. Anybody who has had to rebuild a RAID set with very large disks can tell you all about how long that takes - perhaps 24 hours on some arrays.
In the past tapes have improved on this a little as more and more heads have been added to work in parallel (not a practical thing to do with disks). However, there are practical limits on the number of heads on tapes as well, and consequently whilst they have crammed more tracks into later generations of LTO they just do more passes over the cartridge.
Whatever anybody tells you, disks are really sequential access devices. What they have is (compared to tape) very short seek times to a given bit of data, but ultimately they just read and write data in one sequential stream.
Lee has a good point.
Capacity doubling and bandwidth marginally increasing with each iteration.
The drives can be distributed to help overcome this.
LTO and other Enterprise tape technologies will continue to be made and developed while there is still a market. If so, and given it's still technically possible, there will be progress beyond 2012 towards LTO 7, 8, 9 and beyond.
LTO and other Enterprise tape has made tremendous progress over recent years and it will be interesting to see if this continues.
Steven Jones also makes a good point. Media servers are often the bottleneck and not the write speeds of the tape drive arrays themselves.
It's interesting to watch the progress of the SSD price/performance and capacity curve.
Quiescent tape energy is still better though!
MLC flash as replacement for tape?
I think if the $/GB of 4-bit per cell or higher MLC flash can be brought down quickly enough, it will be very competitive with tape in the near term (2012 timeframe). Poor MLC flash write endurance won't be as much of an issue when used for WORM or tape-like applications - after all, only 200 full overwrites durability is quoted on LTO4 today (and IRL it can be much less, with excessive shoeshining, leader pins snapping, etc.). And flash of course has similar power requirements, form factors, etc. to tape (plus tape drive), and would likely have similar if not better long-term archival reliability. So perhaps the LTO consortium is looking at flash removable drives instead of tape as a long-term replacement?
The Demise of Tape - Long Predicted, Never a Reality
The over $1.5 billion tape market is anything but a charitable research project for the handful of tape manufacturers who invest in moving development of tape media and tape libraries forward into the future. Tape done right is profitable for manufacturers who know where to innovate and how to provide value to customers. Tape done right saves customers enormous amounts of money in power, cooling, data center space and IT administration time.
Over the past couple of years, the midrange tape library market shrank much less than the total economy, and the large tape library market actually grew. When users look at the differences in tape today with LTO4 vs. the technology from 10 years ago, they find an over 700% increase in tape media reliability and intelligent tape systems that proactively identify faulty or overused media. The issues of shoeshining and broken leaders are now little more than bad memories from the past. Tape systems, from manufacturers that invest in tape R&D, are faster than virtually any disk system on the market (streaming at nearly 30GB/sec); they are as intelligent as most disk arrays, with component failover, proactive faulty media identification and data verification; and they are much more affordable on a cost/GB basis. The average small Enterprise customer can achieve not just a lower initial acquisition cost/GB for tape but also 20-25x cost savings each year in power and cooling for that storage.
Tape and disk technologies take at least 2 years to double in capacity. Often longer. As long as customers continue to see their stored data requirements growing at average rates of 2x a year and are required by compliance regulations to store that data longer, there will be no better media than tape – and – continued expansion of the LTO roadmap will be needed by IT, vertical markets and the government.
RAIT is already a reality in top end backup solutions. It solves bandwidth problems fairly nicely and will for quite a while.
The only annoying thing about the LTO roadmap is the forced read-only compatibility with 2 generations back and nothing at all 3 generations, even when technically possible to do so.
IMO SSD will completely eat the sub-1TB disk market in less than 3 years and eclipse capacities greater than 2TB shortly afterwards - but nothing matches the _proven_ endurance factor of tapes for the moment. (We test samples of our backups regularly to ensure the tapes are readable, do you?)
Sales droids rave on about how disk makes tape redundant, and how the d2d backup systems can fit in our existing server rooms. I love to watch them backpedal when I ask the questions "What happens in a fire?" "What happens when someone types 'rm -rf /' ?" and "What happens when some script kiddie deliberately destroys your disk backups?" - I've seen all three cases occur and it's a lot harder to trash a set of backup tapes tucked away securely in a data safe.
(Part of a decent data safe's specification is to survive a 10 metre drop mid-fire. Most disk drives would be mechanically damaged by such an event. Tapes don't care)