So what exactly sits behind Google’s Nearline storage service?

How is Google’s retrieval service for non-essential data, Nearline, with its three-second retrieval latency, viable at the same cost as Amazon’s Glacier, which uses tape with a 3-5 hour retrieval latency? Tape is cheap – but slow – and Google can't be using the stuff for Nearline. We think Nearline uses either Blu-ray …

  1. Peter Galbavy

    3 seconds sounds awfully close to the spin-up time of a disk using minimal power and AC.

    1. Anonymous Coward

      MAID plus analytics

      Yes.

      Note also the very limited retrieval bandwidth. Conclusion: it's MAID with highly restricted power and I/O provisioning such that only a fraction of the disks in a given rack can be powered up at any time, and only enough bandwidth is provisioned to satisfy that fraction of the disks.

      And then there's almost certainly a smart / predictive software layer that is distributing objects across the storage to maximize parallel retrieval of an object when it's requested and to minimize request contention.
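The power-capped MAID idea above can be pictured with a minimal sketch. Only a fixed number of disks in a rack may spin at once, so a read either hits a spinning disk (fast) or forces a spin-up (slow), and objects are spread over distinct disks so their chunks can be fetched in parallel. The names `MaidRack`, `max_spinning` and `place_object`, and the least-recently-used spin-down policy, are all assumptions for illustration, not anything Google has described.

```python
import zlib
from collections import OrderedDict


class MaidRack:
    """Toy MAID rack: at most `max_spinning` disks powered up at once (hypothetical model)."""

    def __init__(self, num_disks, max_spinning):
        if not 0 < max_spinning <= num_disks:
            raise ValueError("max_spinning must be between 1 and num_disks")
        self.num_disks = num_disks
        self.max_spinning = max_spinning
        # Spinning disks, ordered from least to most recently used.
        self.spinning = OrderedDict()
        self.spin_ups = 0

    def read(self, disk_id):
        """Serve a read; return True if a spin-up was needed (the slow path)."""
        if not 0 <= disk_id < self.num_disks:
            raise IndexError("no such disk")
        if disk_id in self.spinning:
            self.spinning.move_to_end(disk_id)  # mark as most recently used
            return False
        if len(self.spinning) >= self.max_spinning:
            # Power budget exhausted: spin down the least recently used disk.
            self.spinning.popitem(last=False)
        self.spinning[disk_id] = True
        self.spin_ups += 1
        return True


def place_object(object_id, num_chunks, num_disks):
    """Spread an object's chunks over distinct disks so they can be read in parallel."""
    if num_chunks > num_disks:
        raise ValueError("more chunks than disks")
    start = zlib.crc32(object_id.encode()) % num_disks
    return [(start + i) % num_disks for i in range(num_chunks)]
```

With a budget of two spinning disks, a third distinct disk can only be read by spinning another one down, which is the kind of contention a smarter placement layer would try to avoid.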

  2. Bob H

    You could deal with RAID and resilience by writing the data in parallel to RAID-0 with extra CRC and to Blu-ray. The RAID-0 gives you the speed you need to meet the moderate retrieval times, and the Blu-ray gives you (more) assurance that you can recover the data after a failure.
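As a rough sketch of that suggestion: stripe the data round-robin across disks with a CRC32 on every chunk, and fall back to a slower archival copy (standing in for the Blu-ray) if any chunk fails its check. The function names, the in-memory lists standing in for disks, and the `archive_copy` argument are hypothetical.

```python
import zlib


def stripe_with_crc(data, num_disks, chunk_size):
    """Split `data` into chunks, round-robin across disks, each chunk tagged with a CRC32."""
    disks = [[] for _ in range(num_disks)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        disks[index % num_disks].append((chunk, zlib.crc32(chunk)))
    return disks


def read_striped(disks, archive_copy):
    """Reassemble the data; if any chunk fails its CRC, fall back to the archive copy."""
    num_chunks = sum(len(d) for d in disks)
    out = []
    for index in range(num_chunks):
        # Chunk i was written to disk i % n, at position i // n on that disk.
        chunk, crc = disks[index % len(disks)][index // len(disks)]
        if zlib.crc32(chunk) != crc:
            # RAID-0 has no redundancy, so a bad chunk means going to the slow copy.
            return archive_copy
        out.append(chunk)
    return b"".join(out)
```

The CRC only detects damage; recovery depends entirely on the second copy, which is why the archive is the thing that provides the assurance here.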

  3. Jim O'Reilly

    Google uses their scrap drives

    This is my hypothesis.

    Google cycles hardware at a much faster rate than most corporations. Hardware has a typical 4-year life. What do you do with the old stuff? Its failure rates are starting to increase, right? Meanwhile the market for old computers is slow, so selling it isn't too good an idea.

    The answer is to use older gear for bulk cold storage where it is rarely powered on. This extends drive life and reduces power drastically. And the nice thing is there are no installation costs!

    The economics are compelling. There is no acquisition cost since the gear is depreciated.

    The big question is what happens when they run out of old gear!

    1. Snowy Silver badge

      Re: Google uses their scrap drives

      Considering they are always buying more new gear, I'm not sure they will run out of old gear.

    2. Anonymous Coward

      Nope Re: Google uses their scrap drives

      I can tell you definitively that Google is not using old drives (and neither is Amazon). The costs associated with failure rates far outweigh any benefits.

    3. DeepStorage

      Old gear won't do

      While the "shift the old servers to the DR site" model works for SMBs that have a fixed IT staff, and therefore fixed operational costs, it doesn't work at scale. Just the extra data center space, power and cooling for servers with the 1TB hard drives of 2011 vs the 8TB hard drives of today would make buying new kit worthwhile. Add in that those higher failure rates mean more guys in the data center doing break/fix, and occasionally breaking something else, and old gear isn't cost effective.

      The space/power/cooling is why they turn over gear faster than average in the first place.
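The 1TB vs 8TB point is easy to check with back-of-the-envelope arithmetic. The sketch below counts drives only; the 1 PB capacity is a hypothetical figure picked for illustration, and redundancy is ignored.

```python
import math


def drives_needed(capacity_tb, drive_tb):
    """Number of drives required to hold `capacity_tb` of raw data (no redundancy)."""
    return math.ceil(capacity_tb / drive_tb)


capacity_tb = 1000  # 1 PB of raw data: a hypothetical figure for illustration
old = drives_needed(capacity_tb, 1)  # 1TB drives of 2011
new = drives_needed(capacity_tb, 8)  # 8TB drives
print(old, new, old / new)  # 1000 125 8.0
```

Eight times as many drives means eight times as many slots to house, power and cool, before the higher failure rate of old drives is even counted.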

  4. El Limerino

    Glacier is tapeless...

    What makes you think Glacier is tape-based? Back in 2012, former AWS employees confirmed it's high-capacity, low-RPM hard drives that are spun down when not in use.

    Similarly, GOOG says its storage is disk-based, and not a different disk hardware stack.

    1. Anonymous Coward

      Re: Glacier is tapeless...

      This. Everybody has an elaborate "how could they possibly do that?" theory about Glacier, but the reality is a lot simpler. However, anybody who actually knows isn't saying (because they are under a strict confidentiality agreement).

  5. SImon Hobson Silver badge

    Could well be using the HGST drives

    Using either of the shingled drives needs filesystem support. While the Seagate drive will "work" without it, it hits certain performance issues (like stopping everything for a while as it shuffles data) if you don't have an overlying file system that understands it.

    Since Google have the brains to have their own filesystem anyway, there's just nothing at all to say they aren't using either of the shingled drives for this. Actually, for the way their storage works, the shingled drives are probably a good fit.


Biting the hand that feeds IT © 1998–2019