FLAPE – the next big thing in storage archiving

The analytical wonks at Wikibon have deduced that a combination of flash and tape is better than tape alone or disk and tape for storing archival data. The argument is based on tape being not only cheaper than disk, but actually faster than disk for streaming large files. It also relies on the cost of flash approaching that of …

  1. DJO Silver badge
    Headmaster

    Say wha?

    “This [Flape] combination of technologies when used for long-term archiving can save IT departments as much as 300 per cent of their overall IT budget over the course of 10 years.”

    I spent some time seeing if I could come up with an even stupider way of saying "could save up to 30% of the budget", but failed. Well done, those chaps: good work, but terrible writing skills.

  2. another_vulture

    Disk is a lot faster.

    It is trivially easy to increase the disk file transfer rate. Just stripe the files across multiple disks. In a Petabyte array, you can theoretically increase the speed by a factor of about 200 at almost no incremental cost, since you already have 200 drives, 200 controllers, etc. This also means that you can use cheaper 7200 RPM drives. Faster rotation increases database transaction rates, but striping increases bulk transfer rates. Tape simply loses in this regard.
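
    (For illustration, a back-of-the-envelope sketch of the striping arithmetic in the comment above; the per-device rates are assumed ballpark figures, not taken from the article.)

      # Aggregate streaming rate of a striped ~1 PB array vs a single tape drive,
      # using assumed (typical, unsourced) sustained sequential rates.
      DRIVES = 200                 # roughly 1 PB of 5 TB nearline drives
      MB_S_PER_DRIVE = 150         # assumed sustained rate of a 7200 RPM SATA drive
      MB_S_PER_TAPE_DRIVE = 160    # assumed LTO-6-class native streaming rate

      array_rate = DRIVES * MB_S_PER_DRIVE              # 30,000 MB/s if fully striped
      print(f"striped array : {array_rate / 1000:.0f} GB/s")
      print(f"single tape   : {MB_S_PER_TAPE_DRIVE / 1000:.2f} GB/s")
      print(f"ratio         : {array_rate / MB_S_PER_TAPE_DRIVE:.0f}x")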

    1. DJO Silver badge

      Re: Disk is a lot faster.

      You do realize that tape can be striped as well, totally negating your argument?

      Also, you'd need some backup for the backup if you were spreading a file over 200 drives, and that way madness lies.

      1. another_vulture

        Re: Disk is a lot faster.

        DJO: No, there is no way to cost-effectively "stripe" tape. Striping works by using multiple drives simultaneously ("striping" by adding heads to one drive is a different discussion). But tape drives are very expensive. The whole point of tape is to use a small set of drives to handle the entire set of tapes, and the logistics of the tape-handler robots would get very ugly very fast. The tape library usually has multiple drives to accommodate multiple simultaneous requests, not to do striping. But that is not the metric the article uses. When you look at multiple simultaneous requests, the controller-per-disk scheme is overwhelmingly superior.

        The contemplated disk scheme is a very conservative RAID 1/0, and is still massively cheaper than tape. We can easily go to two separate RAID 1/0 arrays for backup and still be cheaper. Where is this madness of which you speak?

        1. DJO Silver badge

          Re: Disk is a lot faster.

          "The contemplated disk scheme is a very conservative RAID 1/0, and is still massivly cheaper than tape."

          Absolutely, if you have a few hundred tapes. However if you have a few hundred thousand tapes the economics change.

          It does happen: I used to work for a well known telecommunications company in one of their many data centres, and we had well over 1/4 million tapes spread between on- and off-site stores. There were also about 20 cassette-loaded drives (12 tapes per cassette) and 7 sodding great robotic libraries (1 stand-alone and 2 interlinked clusters of 3), each with lots of drives. Really we were less interested in speed than in data integrity, so we just made do with lots of duplication, but that also helped the speed of data retrieval.

          We never had any problems with "the logistics of the tape-handler robots getting very ugly very fast"; in fact the robotic libraries were a treat to work with: just load and forget (every bloody day). Better than the cassette drives, which ask for the cassette of tapes to be changed every hour or so. Got to have something for the tape monkeys to do.

  3. another_vulture

    $2,400,000 for 1 PB of disk??!

    That's simply silly. I can purchase a 4 TB SATA drive for $134.00, retail, quantity 1. 500 of these yield a redundant 1 PB array for less than $150. Stripe them in sets of (say) eight (sixteen disks in a RAID 1/0 configuration) and only spin up a stripe when I need it. That gives faster access time (limited by spin-up) and faster throughput (limited by stripe width). You can still use the flash for metadata.

    The Register had an article about the Facebook Open Vault specification that is more or less just this:

    http://www.theregister.co.uk/2013/09/24/facebook_on_the_rue_morgue/
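
    (A toy sketch of the layout described above, for illustration only: files land in 16-disk RAID 1/0 sets and only the set holding a requested file is spun up. The drive counts, the hash-based placement and the stripe_set_for()/fetch() helpers are all illustrative, not a real array controller's API.)

      # Toy model of "spin up only the stripe set you need": 500 x 4 TB drives
      # arranged as independent 16-disk RAID 1/0 sets (8-wide stripe, mirrored).
      DRIVES = 500
      SET_SIZE = 16                      # sixteen disks per set
      SETS = DRIVES // SET_SIZE          # 31 sets; the odd 4 drives are assumed hot spares

      spun_up = set()                    # which stripe sets are currently spinning

      def stripe_set_for(file_id: str) -> int:
          """Place each archived file in one stripe set (hash-based placement here)."""
          return hash(file_id) % SETS

      def fetch(file_id: str) -> str:
          """Wake only the 16 drives holding the file, then read across the stripe."""
          s = stripe_set_for(file_id)
          if s not in spun_up:
              spun_up.add(s)             # stand-in for issuing a spin-up to those 16 drives
          return f"file {file_id} read from stripe set {s} ({SET_SIZE} drives spinning)"

      print(fetch("2013-q3-archive.tar"))
      print(f"{len(spun_up)}/{SETS} stripe sets spinning; the rest stay powered down")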

  4. Anonymous Coward

    @another_vulture

    I think you mean 150 thousand.

    That aside, I think that RAID is a terrible way to do this. Rabin's Information Dispersal Algorithm is a much better bet. The maths behind it lets you split a file into n shares, with any k of them being sufficient to reconstruct the original (with n >= k). The advantages are many, but one of the more interesting ones as it relates to this discussion is that it can blur the lines between cold and hot storage. If you have some number of shares (say 'h') close to k in "hot" storage, then you only need to spin up k - h cold-storage silos to reconstruct the file.

    With a regular storage system, I guess you'd want to cache the full replica once it's been recovered from the cold-storage silos. With IDA, you can dynamically scale this by creating new shares, each 1/k of the original size, and spreading these out over your hot (cache) nodes. This is much more efficient than creating full-sized copies (replicas), both in terms of storage (the entire system takes up n/k times the size of the archived data) and availability (since we can select any k silos for load balancing, or request from more than k and just use the first k to arrive back if you want to minimise latency).

    I actually use this scheme on my home network. I've got a k=6, n=10 system for archival purposes. Six of the ten machines/disks are usually asleep, and I've got 4 always on. So when I want to retrieve something I can choose which two of the sleeping machines to wake up (with Wake-on-LAN) or power up their drives (by a software signal). Once the machines are up, I just have to pull across two shares for each file I want (each 1/6th the size of the full file). I can cache these shares for as long as I think I might want access to the file. Admittedly, I don't have a fancy management interface for doing this (just scp wrapped up in some simple scripts), but it works really well for me, only takes up 10/6 (about 1.7) times the original space (as opposed to RAID, which will probably be 2x) and still tolerates 4 (near-simultaneous) disk failures before the data is lost.

    RAID definitely has a place for things like OS disks and hot data (work files and so on) but I think that IDA definitely wins out for archival and "warm" (not so hot, not so cold) data.
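
    (To make the k-of-n arithmetic above concrete, here is a toy sketch of Rabin-style dispersal over the prime field GF(65537). The split()/combine() names, the field choice, and the lack of padding and integrity checks are all illustrative assumptions; this is not necessarily how the poster's scp scripts work.)

      # Toy k-of-n information dispersal over GF(65537) (prime, so 16-bit data
      # symbols fit). Each share is ~1/k of the input; any k shares rebuild it.
      P = 65537

      def split(symbols, k, n):
          """Disperse a list of ints (< P) into n shares, any k of which suffice."""
          shares = {x: [] for x in range(1, n + 1)}
          for i in range(0, len(symbols), k):
              chunk = symbols[i:i + k]
              coeffs = chunk + [0] * (k - len(chunk))      # zero-pad the final chunk
              for x in range(1, n + 1):
                  acc = 0                                  # Horner evaluation of the
                  for c in reversed(coeffs):               # chunk-as-polynomial at x
                      acc = (acc * x + c) % P
                  shares[x].append(acc)
          return shares

      def combine(subset, k):
          """Rebuild the original symbols from any k shares, given as {x: values}."""
          xs = list(subset)[:k]
          out = []
          for pos in range(len(subset[xs[0]])):
              # Solve the k x k Vandermonde system for the polynomial coefficients.
              rows = [[pow(x, j, P) for j in range(k)] + [subset[x][pos]] for x in xs]
              for col in range(k):
                  inv = pow(rows[col][col], P - 2, P)      # modular inverse of the pivot
                  rows[col] = [v * inv % P for v in rows[col]]
                  for r in range(k):
                      if r != col and rows[r][col]:
                          f = rows[r][col]
                          rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
              out.extend(rows[j][k] for j in range(k))
          return out

      data = [ord(c) for c in "archival data"]
      shares = split(data, k=6, n=10)                      # 10 shares, each ~1/6 the size
      some_six = {x: shares[x] for x in (1, 3, 4, 7, 8, 10)}
      print("".join(map(chr, combine(some_six, k=6)[:len(data)])))   # archival data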

    1. another_vulture

      Re: @another_vulture

      Yes, AC, I meant $150,000. Actually, I was way off: $134,000 will buy 1,000 of these disks, so pay half for the disks and half for the remaining infrastructure.

      Yes, RAID 1/0 is gross overkill for near-line storage. You can use your scheme or any of several others to build a system that is cheaper, faster, more power-efficient, smaller in footprint, and probably better in other ways. This simply makes the article's $2,400,000 number even sillier.
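
      (For what it's worth, a quick sketch of that arithmetic; the drive price comes from the posts above, while the 2x infrastructure multiplier is purely an assumption.)

        # Rough cost check for a mirrored 1 PB near-line pool of 4 TB SATA drives.
        DRIVE_TB, DRIVE_COST = 4, 134      # $134 per 4 TB drive, retail, quantity 1
        RAW_TB = 1 * 1000 * 2              # RAID 1/0 mirroring doubles the raw capacity

        drives = RAW_TB // DRIVE_TB        # 500 drives
        disk_cost = drives * DRIVE_COST    # $67,000
        system_cost = disk_cost * 2        # assume as much again for chassis, power, controllers

        print(f"{drives} drives, ${disk_cost:,} in disks, ~${system_cost:,} all-in")
        print(f"vs the article's $2,400,000: roughly {2_400_000 / system_cost:.0f}x more")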

      1. DJO Silver badge

        Re: @another_vulture

        It is rather generous, but I suspect it includes controllers, cases, power supplies, cabling, replacements, an area within a building to stick it all, air conditioning, electricity and labour spread over a 10-year timescale. When you include all the costs over several years it is not a hugely excessive estimate.

        Considering just the purchase cost of the hardware is rather naive.

  5. Steven Jones

    "El Reg wants to know: Could a disk read/write head work on more than 1 track at a time? Wouldn’t that increase disk I/O bandwidth?"

    In theory, yes, but in practice (with modern high-density drives) it's not practicable. It's certainly not possible to put two completely independent head mechanisms on a disk due to vibration and air-flow issues. The other option, to have a single head assembly with multiple read/write elements, comes right up against the problem that the tracks are simply far too close together on disk to be written simultaneously. I suppose you might conceivably have some sort of staggered arrangement whereby the "parallel" tracks are written a little distance apart, but I suspect problems would remain over heat dissipation, the size and weight of the head, error recovery and so on.

    The reason that reading/writing multiple tracks on tape is practical is that they are much wider apart (and they don't have to be moved fast).

    nb. I seem to recall back in the days of fixed-head disks there were some that could work in parallel, although that may be my imagination.

    1. another_vulture

      Disks do have multiple heads

      A modern disk has one head per platter surface, so a modern high-capacity disk may have up to eight heads. However, since they share a servo they cannot be dynamically aligned to the tracks on each surface simultaneously at the new extreme track density ("shingled tracks") now coming into vogue. Earlier, it would have been possible, but it was not cost-effective because a bunch of relatively expensive read/write electronics is shared between all heads and would need to be duplicated, and the speed of the SATA interface would need to be (at least) quadrupled.

      It's also unnecessary, since an array of disks accesses multiple heads simultaneously.

  6. Anonymous Coward

    What a bunch of crap

    First question when you see anything from Wikibon is to ask "Who paid them for this opinion piece?"

    Second question: what are the web-scale companies doing for long-term data retention? As odd as it may sound, the Facebooks of the world are shifting away from tape and using optical storage.

    Tape formats require migration every 4-6 years (minimum), and with petabyte-scale retention tape is not practical at all (unless you are the tape vendor, in which case it's payday!). You don't have that problem with next-gen optical storage. Just ask Facebook.

  7. Anonymous Coward

    Also keep in mind that this is about large, sequential reads. Once you start to dive into smaller files, multiple accesses, and so on, the performance will drop on the flape side. To be fair, that was mentioned; I just didn't see any of it in the comments.

  8. another_vulture

    Disk cost

    DJO: no, the ridiculous $2,400,000 did not include those additional costs. Those costs are on a different line in the matrix in the graphic, and they are also very high relative to the same line item for tape. The only way you can possibly get to these numbers is to buy very fast, very small disks, and those are completely inappropriate for near-line storage.

    With respect to scaling: one post mentions that tape cost does not rise as quickly as disk cost as the archive size increases. But this article is specifically about a 1 PB store.
