Mmm, yes. 11-nines data durability? Mmmm, that sounds good. Except it's virtually meaningless

What do data durability numbers mean? Azure brags 12 and even 16 nines durability, while Amazon S3, Google Cloud Platform and Backblaze tout 11 nines. What does this mean? Data durability is a fancy way of promising you'll keep someone's data intact, and not allow the bits and bytes to degrade through media decay, drive loss, …

  1. Anonymous Coward

    smoke and mirrors

    And if/when one of those low probability events happens, will I lose *one* file, or will I lose *all* of my files on that node/instance/unicorn/<whatever cloudy instance thingy name>?

    n-nines makes some sense for uptime discussions because you can relate it to a quantity of downtime. Your hosting provider claims a "5 nines guarantee" but they were down for an hour yesterday = they failed, read your SLA to see what compensation you are entitled to. Your cloud storage provider claims some made-up number of nines and you lost data = what exactly (besides you're screwed if you have no other backup)?
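To make the uptime half of that comparison concrete, here is a minimal sketch of how a number of nines maps to permitted downtime. It assumes the guarantee is measured over a 365-day year; the function name `allowed_downtime_minutes` is made up for illustration, and a real SLA may define the measurement period differently.

```python
def allowed_downtime_minutes(nines: int, period_days: float = 365.0) -> float:
    """Minutes of downtime permitted per period at the given number of nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 5 nines -> 0.00001 of the period
    return period_days * 24 * 60 * unavailability


# Assumption: the guarantee is measured over a 365-day year.
for n in (3, 4, 5, 6):
    print(f"{n} nines: {allowed_downtime_minutes(n):.3f} minutes/year")
```

Under that assumption, five nines allows a little over five minutes a year, so a one-hour outage blows the budget many times over. There is no equally simple conversion for a durability figure.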

    1. Pascal Monett Silver badge

      Re: smoke and mirrors

      There is practically nobody today that can actually prove 6 nines uptime - especially not The Cloud - so 12 nines is just horseshit, and let's just forget that mention of 16.

      1. Anonymous Coward

        Re: smoke and mirrors

        The Cloud is 11 nines reliable - it's just that the guy plugging the cables into the router is only about 3 nines to the wind on Friday after lunch. So they can claim it - the complete unavailability of notional terabytes is the subcontractor's fault, not theirs.

      2. Anonymous Coward

        Re: smoke and mirrors

        This is data durability, not data availability. So the data is still there but you can't get to it for a while. In theory.

      3. SPGoetze

        Re: smoke and mirrors

        We're talking DURABILITY here, not AVAILABILITY!

  2. Phil W

    Statistics Vs Reality

    Statistics for probable data loss are all very well but they don't take account of the most important rule of data loss/recovery.

    That is, if the statistics indicate that you may lose a file once every 8 years then it will happen at the worst possible time and/or the file will be the worst one you could possibly lose. AKA Sod's law.

    1. Sir Loin Of Beef

      Re: Statistics Vs Reality

      Sod's Law?

      As in: 'Sod off, I've got my own problems'?

      1. jake Silver badge

        Re: Statistics Vs Reality

        Sod and Murphy were cousins. Back-room whispers suggest they may have both been fathered by Finagle, but I think that's just a vicious rumo(u)r.

      2. FrankAlphaXII Silver badge
        Facepalm

        Re: Statistics Vs Reality

        Probably more like: Poor sod, you've got a lawsuit headed your way.

        And you only had one backup with one cloud provider and nothing else? Well bless your heart (Southeastern USAin for "You're a moron")

    2. Alan Brown Silver badge

      Re: Statistics Vs Reality

      The other reality is that the way the cloud data stores are generally structured (freeish to store, expensive to read) you'll be bankrupt long before you hit a read error.

  3. JeffyPoooh Silver badge
    Pint

    This sort of thing is explained here:

    'The Black Swan: The Impact of the Highly Improbable'

    by Nassim Nicholas Taleb

    If you haven't read it, then it's probable that you really should. It explores and explains all this sort of thing. If you haven't hoisted the concepts aboard yet, then it's likely that you'll be quite confused by attempting to think about such improbabilities.

  4. Frederic Bloggs

    Pratchett's Law

    We all know (especially when trying to restore from "backup") that that 1 million to 1 chance happens 9 times out of 10...

  5. Ken Moorhouse Silver badge

    Referential Integrity from a database point of view...

    Is different to the way we, as humans, keep track of "data linkages".

    We might associate events relative to other events, rather than to actual dates, and refer to them in this informal way, which may initially be ok but, in the long run, can cause events to go walkies, date-wise.

    To pick a fairly obscure example, take Edward III. Instead of referring to the absolute years he was on the throne (1327 onwards), people would refer to the year in terms of Regnal Years (1327 was his first Regnal Year), so documents would quote this, rather than the absolute year. Fine for most kings and queens of England, but Edward III had two regnal years, one for his English monarchy, the other for his time as the French monarch (1340 was his first French Regnal Year). Not a problem if the context is included in any reference to a document, but if that context is not provided, the error can be substantial.

    So what? Those that think they've got Digital Bit Rot should think themselves to be relatively lucky.

  6. JohnFen Silver badge

    Vendor data

    It sounds like people are taking vendor performance promises far too seriously here. Surely we haven't forgotten what those are worth, have we?

  7. redpawn Silver badge

    Do they specify the retrieval time?

    It could take 11 zeros' worth of days to retrieve your data from its perfect data storage vault.

  8. bobsmith2016
    Headmaster

    "acts of God"

    Being pedantic here, but please don't use "acts of God".

    Gives undue succour to people who believe in nonsense. Natural Disasters, or a similar synonym is perfectly good.

    1. Frumious Bandersnatch Silver badge

      Re: "acts of God"

      I like to use "acts of Gods". Also, whenever anyone says "For Gods' sakes!", I compliment them on their catholicism (small "c").

      1. Alan Brown Silver badge

        Re: "acts of God"

        "For Gods' sakes!"

        s/Gods/ghods/

        and I shouldn't need to explain that :)

        I've been known to invoke Zeus, Odin and Vishnu in the same curse and more latterly to simply wish that someone's Youtube videos buffer for 1,000 years.

    2. eldakka Silver badge

      Re: "acts of God"

      > Being pedantic here, but please don't use "acts of God".

      > Gives undue succour to people who believe in nonsense. Natural Disasters, or a similar synonym is perfectly good.

      "acts of God" in this context is actually a legal term, not a religious one, that used to, if not still currently does, appear in contracts, especially things like insurance contracts.

    3. Lord Elpuss Silver badge

      Re: "acts of God"

      Dear bobsmith2016

      With the greatest respect, ODFO.

      People believe what they want to believe - and that is their right. It's not your business, or anybody else's, to dictate what YOU believe and/or to disparage their faith. By the same token, you can believe/disbelieve whatever you like - just don't go around imposing it on others.

      From a proof perspective, believers can't prove what they believe in (at least - not in a way that you would accept), and YOU can't prove the opposite. You're no more 'right' than they are.

      PS 'Act of God' is these days a generic - like 'Hoover' or 'Google'. It's used in legal, formal and semi-formal contexts as a synonym for 'Something over which we have no control', and implies no particular 'belief'.

  9. Frumious Bandersnatch Silver badge

    Umm...

    Sigma(σ)?

    https://en.wikipedia.org/wiki/Standard_deviation

    I think that some people can understand this, even if marketing (and the article author, apparently) cannot. Or decide not to.

    1. MJB7

      Re: Sigma(σ)?

      The standard deviation is a useful and well-defined concept for a normal distribution. However, we are not dealing with normal distributions here - more like Poisson (with a *very* low probability). The result is that standard deviations are not particularly meaningful (although, to be fair, 11-nines isn't either).
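A rough sketch of why sigma says so little here, assuming losses follow a Poisson distribution with rate λ = (number of objects) × (per-object annual loss probability). The figures of one million objects and a 1e-11 loss probability are illustrative assumptions, not vendor data.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k loss events when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_at_least_one(lam: float) -> float:
    """1 - e^(-lam), computed via expm1 so tiny rates do not round to zero."""
    return -math.expm1(-lam)


# Assumed figures for illustration only.
objects = 1_000_000
annual_loss_prob = 1e-11
lam = objects * annual_loss_prob

mean = lam
std_dev = math.sqrt(lam)  # for a Poisson distribution, variance equals the mean

print(f"expected losses per year: {mean:.3g}")
print(f"standard deviation:       {std_dev:.3g}")
print(f"P(at least one loss):     {prob_at_least_one(lam):.3g}")
```

With a rate this small the standard deviation is hundreds of times larger than the mean, and nearly all of the probability sits on exactly zero losses, so "mean plus or minus a few sigma" does not describe anything useful.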

  10. Frumious Bandersnatch Silver badge

    "Checksumming is another"

    FTA: another "ways to lengthen the data durability time".

    And how do checksums "lengthen the data durability time?" I'm not just being pedantic. I'm an advisor to Amber Rudd and we'd really like to learn more about the hashtags.

    1. SPGoetze

      Re: "Checksumming is another"

      With checksums you can find different types of "silent" data corruption, like

      - bit rot

      - lost writes

      - torn writes

      - misdirected writes

      Torn/misdirected writes are more of a spinning disk thing; the rest could theoretically also appear in solid-state devices.

      When you determine that the checksum doesn't match the data (either when you access/read the data, or on a periodic schedule), you reconstruct the correct data from parity/erasure code/mirror or whatever your protection scheme might be, thereby protecting the data from "silent" corruption (not failed drives) and increasing durability...

      Hope that helps.

      (I teach storage...)
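The detect-then-repair loop described above can be sketched as follows. This is a toy model under stated assumptions: two in-memory mirrors stand in for drives, SHA-256 stands in for whatever checksum a real array uses, and the names `checksum` and `scrub` are hypothetical.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum stored alongside the data when it was first written."""
    return hashlib.sha256(block).hexdigest()


def scrub(replicas: list[bytearray], expected: str) -> int:
    """Overwrite any replica whose checksum mismatches; return how many were repaired."""
    good = next(
        (bytes(r) for r in replicas if checksum(bytes(r)) == expected), None
    )
    if good is None:
        raise ValueError("no intact replica left; the data is lost")
    repaired = 0
    for r in replicas:
        if checksum(bytes(r)) != expected:
            r[:] = good  # rebuild the bad copy from a verified good one
            repaired += 1
    return repaired


original = b"quarterly accounts"
expected = checksum(original)
replicas = [bytearray(original), bytearray(original)]

replicas[1][0] ^= 0x01  # simulate bit rot: flip one bit in the second mirror
print("replicas repaired:", scrub(replicas, expected))
```

Without the checksum, the two mirrors would simply disagree and nothing would say which one to trust; that is the sense in which checksumming adds durability rather than just detection.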

  11. Rob D. Bronze badge
    Facepalm

    An object by any other name

    Tell the salesperson you have about 1,000 billion objects defined (each object is a collection of eight tightly coupled binary indicators representing a range of numeric states which your systems will process in groups of varying sizes) and ask them to run the maths on how many objects the vendor will lose every year.

    When they confirm that they will lose one or two of these objects at least every year even with their precious 11-nines durability, argue about the reliability of the service and begin the negotiation on the price for the terabyte of storage being requested.
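Running the maths yourself, as a sketch: this assumes "11 nines" means each object independently has a 1e-11 chance of being lost in a given year, which is one reading of the vendor figure and not something the vendors necessarily spell out. The function name `expected_annual_losses` is made up for illustration.

```python
import math


def expected_annual_losses(objects: int, nines: int) -> float:
    """Expected number of objects lost per year at the given durability."""
    return objects * 10.0 ** (-nines)


# Assumption: 11 nines is a per-object, per-year figure and losses are independent.
objects = 10 ** 12  # "1,000 billion" one-byte objects, i.e. one terabyte
losses = expected_annual_losses(objects, 11)
p_no_loss = math.exp(-losses)  # Poisson approximation for "nothing lost this year"

print(f"expected objects lost per year: {losses:.1f}")
print(f"chance of losing none at all:   {p_no_loss:.2e}")
```

On that reading the expected loss comes out at about ten bytes a year rather than one or two, which only strengthens the negotiating position.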

    1. Allan George Dyer Silver badge

      Re: An object by any other name

      So they set their pricing per object, and your beancounters insist on your entire data store being one object. What could possibly go wrong?

    2. veti Silver badge

      Re: An object by any other name

      This is what I was thinking. You don't have to lose a whole "object" to be screwed; the corruption of a single byte can do the job.

      And since "objects" are commonly highly interdependent, the corruption of a single object could quite easily render your entire backup useless.

  12. Androgynous Cow Herd

    heard from Salesdroids

    one claimed nine Fives of availability.

    1. jake Silver badge

      Re: heard from Salesdroids

      A salesdroid that was telling the truth?

      Anybody check the thermostat in hell recently?

    2. Allan George Dyer Silver badge
      Coat

      Re: heard from Salesdroids

      This isn't an urban legend: the moment when the salesdroid told his colleague, and the fix, were recorded for posterity.

  13. frobnicate
    Facepalm

    Oh, man

    > Erasure coding is one such method. Reed-Solomon coding is another.

    Reed-Solomon coding *is* erasure coding.

    > The second way is to store multiple copies of the data across multiple locations.

    And so is N-way replication too! They are all erasure codes.
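A toy illustration of the point that both schemes survive "erasures" (blocks known to be missing). It assumes equal-length blocks and uses single XOR parity, which is the simplest parity code and not Reed-Solomon itself; Reed-Solomon generalises the idea to tolerate several missing blocks. All function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_parity(data: List[bytes]) -> List[bytes]:
    """k data blocks -> k + 1 stored blocks; survives any one erasure."""
    return data + [xor_blocks(data)]


def recover_parity(stored: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing block (None) and return the data blocks."""
    missing = [i for i, block in enumerate(stored) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity cannot repair more than one erasure")
    blocks = list(stored)
    if missing:
        survivors = [b for b in blocks if b is not None]
        blocks[missing[0]] = xor_blocks(survivors)
    return blocks[:-1]  # drop the parity block


def recover_replicated(copies: List[Optional[bytes]]) -> bytes:
    """N-way replication: any single surviving copy is enough."""
    for copy in copies:
        if copy is not None:
            return copy
    raise ValueError("all copies erased")


data = [b"abcd", b"efgh", b"ijkl"]
stored = encode_parity(data)
stored[1] = None  # one device is lost
print(recover_parity(stored) == data)
print(recover_replicated([None, None, b"abcd"]))
```

The trade-off is overhead versus tolerance: three-way replication stores 3x the data to survive two erasures, while the parity scheme here stores 4/3x to survive one.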

  14. tiggity Silver badge

    data access

    With all the cloudy outages, your data may be (reasonably) intact, but not much consolation when you & your customers cannot access it...

  15. David Nash Silver badge

    Uptime is understandable, but why would they ever lose small amounts of data? The provider should have proper backup and redundancy in place which means they never lose anything, ever (subject to a certain amount of downtime), or they lose everything when they go out of business. How is there any in-between? Sounds like pure marketing to me.

    1. Killfalcon Bronze badge

      Not sure. I assumed, to be honest, that they're talking about how often you'll need to go to that backup, or possibly the chance that the backups will have an error.

  16. Claptrap314 Bronze badge

    Reliability numbers

    First, 6+ 9s of reliability is a very real thing in certain environments. In fact, one group at Google has to make a point of occasionally taking their service down as the only practical way to ensure that G's systems don't get built assuming that the service is never down. (To my knowledge, that team had 0 unintentional downtime during my 16 months at G.)

    For long term storage? I would pick the group with the _lowest_ claimed number & go with them as the group least infected by fantasy thinking.

    Storage resilience is like weather forecasting--does 20% chance of rain mean 100% for 20% of the region, or 20% chance for 100%? Both are completely correct, but the effects are vastly different. An email can survive a lot of damage with an excellent chance of still being serviceable. A contract, less so. Alignment data--none at all.

    There are well-understood methods for refreshing data. I can believe eleven or even sixteen nines of resilience per byte at the level of racks. What I cannot believe are claims that those numbers are sustained when the time comes to replace the racks.

    Failover systems will fail. Dispersed assets will be independently compromised. Individual bad actors will compromise systems. And businesses fail. Sixteen nines has to ignore all of that.


Biting the hand that feeds IT © 1998–2019