6 posts • joined 19 Sep 2009
I work for a tape manufacturer, and I found sitting in the AWS sessions at re:Invent a bit bemusing. For such a huge company with so much technical capability, the marketing message was way off. They aimed their tape obsolescence message at small backup users, presumably those whose backup applications don’t know how to write to disk or file systems. That is a very old, very small, obsolete market. What they didn’t address was the storage their own services run on, the competition from other providers, or end users building their own storage at lower cost points than AWS pricing can achieve. I think AWS will take over the consumer market. But big data sets, the enterprise, government? No way.
I agree with the point that is often overlooked in the AWS messaging, and in the Cloud story in general: Clouds still need physical storage, and that won't be SSD. It will be spinning rust and tape. The majority of data in 10 years will still be on spinning rust and tape, regardless of the marketing name we have for the solution.
Cheaper to move the compute to the data - than the data to the compute
I moderated a panel at SC11 (the high performance computing conference) last fall. It involved research, national lab and engineering development customers. The topic was whether Cloud archive is practical for HPC data (read: 100s of TBs to PBs or even EBs of data). After 2.5 hours of active discussion, the entire room concluded that Cloud storage for large data sets is not practical, either economically or for speed. The cost of bandwidth, the availability of sufficient bandwidth, restrictions on data access, and the proximity of the compute to the data and metadata were all strong reasons for keeping archive data local to compute. These are organizations that keep and actively reference large quantities of data in their archives. Nothing can beat large on-premises tape systems for large file archive. For smaller data sets, measured in GBs or TBs and not referenced on a regular basis, Cloud archive seems very compelling.
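To put rough numbers on the bandwidth point, here is a minimal back-of-the-envelope sketch in Python. It was not part of the panel discussion. The function name `transfer_days`, the 10 Gbit/s link speed and the `efficiency` parameter are all assumptions chosen for illustration, and it uses decimal units (1 TB = 10^12 bytes).

```python
def transfer_days(data_terabytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to push data_terabytes over a link of link_gbps.

    efficiency is the assumed fraction of the nominal link rate that is
    actually achieved (1.0 means a perfectly saturated link).
    """
    bits = data_terabytes * 1e12 * 8                   # TB -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)    # bits / (bits per second)
    return seconds / 86400                             # seconds -> days


# Hypothetical archive sizes, all moved over an assumed 10 Gbit/s link.
for size_tb in (1, 100, 1000):
    print(f"{size_tb:>5} TB over 10 Gbit/s: {transfer_days(size_tb, 10):.2f} days")
```

Even with the link fully saturated, 1 PB takes a little over nine days to move under these assumptions, and that is before any bandwidth charges. Real links rarely sustain their nominal rate, so treat the result as a lower bound.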
Economics of archive storage very different for small data sets vs medium to large data sets
In the Active Archive Alliance and in a wide variety of tape market research from 2010 and 2011, it is pretty clear that once a customer has upwards of about 80TB of data to archive (not backup, archive data), tape (including the software and hardware needed to move data to tape) is less expensive than disk. Economies of scale tilt the cost significantly further in tape's favor as capacity grows into hundreds of terabytes and into the petabytes. For less than 80TB, use disk for short-term archive.
If the data set is larger than 80TB, or if the data needs to be stored on a medium suited to long-term storage (power efficient, offline, not susceptible to corruption, and so on), then tape is the right medium.
Object stores to tape...Integrated or 3rd Party
In my experience, many of the object stores don't use their integrated interface to move data to tape. Instead, they rely on 3rd party applications to extend the object store to tape. This is where the archive on tape becomes active and directly accessible to the object store. Many vendors resell one of these applications as their way to extend to tape, rather than building the capability directly into the interface of the primary object store. All the products are tested and integrated together, so they work smoothly. It is just a matter of not letting the complete solution cost get too high, so that the value proposition of tape as cost effective on the CapEx side is still realized, in addition to the long-term savings from power efficiency and storage density.
The Demise of Tape - Long Predicted, Never a Reality
The over $1.5 billion tape market is anything but a charitable research project for the handful of tape manufacturers who invest in moving development of tape media and tape libraries forward into the future. Tape done right is profitable for manufacturers who know where to innovate and how to provide value to customers. Tape done right saves customers enormous amounts of money in power, cooling, data center space and IT administration time.
Over the past couple of years, the midrange tape library market shrank much less than the total economy, and the large tape library market actually grew. When users compare tape today, with LTO4, against the technology of 10 years ago, they find an over 700% increase in tape media reliability, and intelligent tape systems that proactively identify faulty or overused media. Shoeshining and broken leaders are now little more than bad memories. Tape systems from manufacturers that invest in tape R&D are faster than virtually any disk system on the market (streaming at nearly 30GB/sec); they are as intelligent as most disk arrays, with component failover, proactive faulty media identification and data verification; and they are much more affordable on a cost/GB basis. The average small Enterprise customer can achieve not just a lower initial acquisition cost per GB for tape but also 20-25x cost savings each year in power and cooling for that storage.
Tape and disk technologies take at least 2 years to double in capacity, and often longer. As long as customers continue to see their stored data requirements growing at average rates of 2x a year, and are required by compliance regulations to store that data longer, there will be no better media than tape, and continued expansion of the LTO roadmap will be needed by IT, vertical markets and the government.
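The gap between those two growth rates compounds quickly. As a rough illustration, the sketch below uses only the two figures quoted above (data doubling every year, media capacity doubling every 2 years at best). The function name `relative_media_count` and its parameters are hypothetical, and the model ignores everything else, such as retention policies, compression and deduplication.

```python
def relative_media_count(years: float,
                         data_growth_per_year: float = 2.0,
                         capacity_doubling_years: float = 2.0) -> float:
    """How many times more cartridges (or drives) are needed after `years`,
    relative to today, if data grows by data_growth_per_year each year and
    per-unit media capacity doubles every capacity_doubling_years."""
    data_factor = data_growth_per_year ** years
    capacity_factor = 2.0 ** (years / capacity_doubling_years)
    return data_factor / capacity_factor


for y in (2, 4, 10):
    print(f"after {y:>2} years: {relative_media_count(y):.0f}x the media count")
```

Under these assumptions the media count doubles every 2 years even as capacity improves, reaching 32x after a decade. That is why cost per GB, power and floor space dominate the choice of medium at scale.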