Free Riak database acts like depressed teenager to assure data reliability

Basho's NoSQL Riak database has been given an upgrade that makes it question its own integrity at all times – a tedious trait in people, but a handy one for assuring data preservation in massively distributed information stores. The "active anti-entropy" feature in Riak 1.3, which was released on Thursday, means the open …

COMMENTS

This topic is closed for new posts.
  1. Magani
    Happy

    "...database acts like depressed teenager..."

    Just what we need. A database that replies to queries in monosyllables, uses the floor as a storage area, refuses to operate before midday and is unable to carry out its own garbage collection.

    Sometimes they do turn out OK in the end, though.

    1. proto-robbie
      Childcatcher

      Re: "...database acts like depressed teenager..."

      Teenagers with OCD. Just the sort of bods we need in IT. Send them over to networks and security.

  2. Anonymous Coward
    Anonymous Coward

    Addressing the symptoms, but not the cause?

    A database that needs an agent bolted on to fix "divergent, missing, or corrupted" data doesn't sound like a database I'd want to rely on in the first place.

    It implies the failures don't matter in the period before the agent gets round to fixing them.

    And it implies the agent will have a 100% success rate in fixing things.

    But maybe I'm more fussy about my data than their users?

    1. Anonymous Coward
      Anonymous Coward

      Re: Addressing the symptoms, but not the cause?

      Long time since I played a db geek on TV, let alone on a terminal ... but as it's a large-scale distributed database, it isn't unreasonable that at a given instant it may not be 100% consistent, since multiple replicas are involved, some of which may be unreachable at that time, near-simultaneous updates may occur, etc. And since it's expressly a NoSQL approach, the lack of full ACID behaviour is the tradeoff being accepted to achieve higher performance. There are plenty of applications for "good enough" integrity (for sufficient values of "good enough"). Glibly put: if your bank uses a NoSQL store for your account, tell them to sod off; if they use it to store records of the last time they nagged you with a new pointless product, then also tell them to sod off (but for the intrusion, not the software engineering :-) ).

      But I do agree that it would be rather more reassuring if all candidate problems were being journalled and repaired ASAP rather than what reads as cron-ing the fix task. No idea how common this approach is, whether it's the lesser evil for preserving performance, etc - hoping for some proper db geek comments...

    2. jtuple
      Big Brother

      Re: Addressing the symptoms, but not the cause?

      (Note: I work at Basho (makers of Riak), and was the lead developer on this new Active Anti-Entropy feature.)

      Riak is a database. You put data in, you get data out. There are no inherent inconsistencies in Riak, nor any built-in symptom that needs to be repaired.

      Yes, Riak is an eventually consistent database, rather than a strongly consistent database. The AP option rather than the CP option from CAP. But, this has nothing to do with durability or data safety. The choice of AP vs CP determines the types of applications that a given database can support. Certain applications can tolerate eventual consistency and can use Riak, others can't and should use another database. Or wait until Riak supports strong consistency in addition to eventual consistency later this year.

      The new Active Anti-Entropy (AAE) feature has nothing to do with eventual vs strong consistency. It's a self-healing feature designed to address hardware failure and other scenarios outside the control of the database. It's similar to RAID, but built into your database.

      Riak is a fault-tolerant, replicated database. When you write data, it's stored to multiple machines (3 by default). Thus, you can lose machines and still have readable replicas. If a machine fails, you're going to end up with missing data (replace failed hardware w/ new, empty hardware) or divergent data (replace + restore from recent, but not 100% current backup). When you read an object, the request will be sent to all replicas, returning the non-missing/non-divergent (ie. correct) data. Riak, however, notices that one of the replicas is missing or divergent and asynchronously repairs it from the data on the other replicas.
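The read-repair behaviour described above can be sketched roughly as follows. This is an illustration only, not Riak's code: the function name `read_with_repair` is hypothetical, each replica is modelled as a plain dictionary, and a simple integer version counter stands in for whatever versioning the real database uses. The repair is also done inline here, whereas the comment describes it as asynchronous.

```python
from typing import Dict, List, Optional, Tuple

# (version counter, value). The integer version is an assumption made for
# this sketch; it is a stand-in for the database's real versioning scheme.
Versioned = Tuple[int, str]


def read_with_repair(replicas: List[Dict[str, Versioned]], key: str) -> Optional[str]:
    """Read `key` from every replica, return the newest value, and
    overwrite any replica that is missing the key or holds an older copy."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None  # no replica has the key at all
    newest = max(found, key=lambda item: item[0])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the missing or divergent copy
    return newest[1]


# Three replicas: one current, one restored from a stale backup, one empty
# (replaced hardware).
replicas = [{"user:1": (2, "new")}, {"user:1": (1, "old")}, {}]
value = read_with_repair(replicas, "user:1")
```

After the call, all three dictionaries hold the same `(2, "new")` entry, which is the point made in the comment: a read is what triggers the repair.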

      The issue with this approach is that data is only repaired on reads. The new AAE feature augments this approach to provide a lightweight background process that's constantly verifying replicas and repairing things as necessary. Thus, even cold data that is never read is verified and repaired. The aim is to make sure all data is repaired before any other nodes fail. Sure, multiple failures could happen before everything is repaired and you're toast. This is no different than any other database (log replay/recovery) / RAID. There's always a chance additional disks will fail while you're in the process of rebuilding your RAID array. It's just statistically rare enough that we can all still sleep at night.

      In any case, AAE isn't designed to solve an inherent problem with Riak. It's designed to help regenerate your data when hard drives, nodes, etc fail. It's also designed to detect silent data corruption (faulty hard drive / controller), an issue that affects all databases.

      In short, AAE is similar to the protection provided by triple-mirrored ZFS, but at the node rather than hard drive level. ZFS maintains a hash tree for all data stored in the filesystem. On every read, the replicas are verified against the hash stored elsewhere on disk. If there's a mismatch, the bad replica is repaired from replicas on the other disks. As was famously discussed around the time ZFS appeared on the scene, this helps protect against silent data corruption / bit rot (eg. when your hard drive, disk controller, etc corrupt data w/o any indication). The problem is that this verification only happens when data is read, thus cold data is never checked. The solution: a cron job that runs 'zpool scrub' periodically to verify all data on disk. Riak has similarly always done a verify/repair check on every read. The new AAE feature adds a smart, lightweight 'zpool scrub' equivalent that continuously verifies all data all the time.
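To make the hash-tree idea concrete, here is a minimal sketch of comparing two replicas by hashes rather than by walking every object. It illustrates the general technique only and is not taken from ZFS or Riak: the two-level tree, the `SEGMENTS` constant, and the function names are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Set, Tuple

SEGMENTS = 8  # hypothetical number of leaf segments in the tree


def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def segment_of(key: str) -> int:
    """Assign a key to a leaf segment based on a hash of the key."""
    return int(_digest(key.encode()), 16) % SEGMENTS


def build_tree(store: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (root hash, leaf hashes) for a two-level hash tree."""
    buckets: List[List[str]] = [[] for _ in range(SEGMENTS)]
    for key in sorted(store):
        buckets[segment_of(key)].append(repr((key, store[key])))
    leaves = [_digest("\n".join(bucket).encode()) for bucket in buckets]
    return _digest("".join(leaves).encode()), leaves


def differing_keys(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Find keys whose values differ, checking only segments whose hashes differ."""
    root_a, leaves_a = build_tree(a)
    root_b, leaves_b = build_tree(b)
    if root_a == root_b:
        return set()  # identical replicas: one comparison is enough
    bad = {i for i in range(SEGMENTS) if leaves_a[i] != leaves_b[i]}
    return {
        key
        for key in set(a) | set(b)
        if segment_of(key) in bad and a.get(key) != b.get(key)
    }


good = {"a": "1", "b": "2", "c": "3"}
damaged = {"a": "1", "b": "corrupt"}  # one corrupted value, one missing key
to_repair = differing_keys(good, damaged)
```

When the replicas agree, only the root hashes are compared. When they do not, only the mismatched segments need a key-by-key look, which is what lets a background check stay lightweight even over cold data.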

      Riak is designed to be an operations-friendly, fault-tolerant database. A database designed to easily scale-out as needed, that can tolerate multiple node failures and network partitions (eg. switch failures / split brain scenarios). Adding built-in self-healing / protection against silent data corruption from faulty hardware was a logical next step. Of course, Riak is a harder database to develop for, requiring eventually-consistent tolerant algorithms/designs. But, that's the great part about the "new database" or so-called NoSQL (*cringe*) movement: there are different products for different use cases. As a co-worker of mine used to say, these days databases are like D&D classes: use the right one for the quest; sometimes you need a rogue, other times a mage is best.

      (Big Brother, cause AAE is watching your data...)

      1. Anonymous Coward
        Thumb Up

        Re: Addressing the symptoms, but not the cause?

        Yay! a DB Geek by return of post...thanks for the detailed follow-up. I hadn't thought of undetected corruption in the lower stack/hardware as a factor; certainly nixes journalling likely problems, etc.

  3. Destroy All Monsters
    Paris Hilton

    Listening to Linkin Park?

    So this is all about how to patch up things after the CAP theorem struck and made things go haywire?

    Any numbers on how much of such a database is in an inconsistent / non-replicated state? The whole approach sounds bizarre and wild, something I might have come up with in my earlier years...
