Facebook: Why our 'next-gen' comms ditched MySQL

About a year ago, when Facebook set out to build its email-meets-chat-meets-everything-else messaging system, the company knew its infrastructure couldn't run the thing. "[The Facebook infrastructure] wasn't really ready to handle a bunch of different forms of messaging and have it happen in real time," says Joel Seligstein, a …

COMMENTS

This topic is closed for new posts.
  1. Anonymous Coward
    Joke

    Wow

    "Facebook was juggling about 15 billion on-site messages a month (about 14 TB) and 120 billion chats messages (11 TB) "

    That's a whole lotta Farmville updates!

    1. Elmer Phud

      Farmville?

      Farmville? (spits feathers and tries not to puke)

      Nah, it's my mates sending me YouTube stuff -- the standard Zynga chat tends to be monosyllabic bollocks.

    2. Anonymous Coward
      WTF?

      Farmville for Dummies - They got that right.

      Amazon are selling "Farmville for Dummies"

      http://www.amazon.com/FarmVille-Dummies-Angela-Morales/dp/1118016963/

      You couldn't make it up.

  2. Do Not Fold Spindle Mutilate
    FAIL

    Big fail for companies selling database software

    hi,

    What is actually being said here is that software from Oracle, IBM and the other commercial vendors is so overpriced that Facebook would rather create its own. Big fail for Oracle. Where are those super sales people taking the executives out to lunch and golf?

  3. Stephen Channell
    Unhappy

    Depressing

    I know that Facebook is big-big-big yardy-yardy-ya, but this whole article makes Facebook look like one big post-grad project.

    Sitting in a cold Computer Science Lab, with no money, little kit and an unmovable demo deadline... this would be genuinely very-very impressive… the “effort to minimize data loss” clearly looked at complex scenarios in great detail... but…

    The data volumes aren’t CERN LHC… but LHC gives a clue to the alternatives…

    Facebook is not poor, and they are not pushing human knowledge into the unknown… they could have simply stopped off at Oracle with a shopping list and splashed the cash…

  4. Anton Ivanov
    Thumb Up

    Unsurprising

    Putting large unspecified blobs of data into SQL is a performance killer for any SQL, MySQL included. SQL is storage for _STRUCTURED_ data, not for blobs.

    So Facebook's decision is not particularly surprising. There are technical solutions to this (isolate the structure, put it in SQL, put the blob elsewhere), but none of them is likely to scale to Facebook's size.
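    To illustrate the split described above (structured fields in SQL, the blob held elsewhere), here is a minimal sketch. The blob_store object with put()/get(), the table layout and the sqlite3-style connection are all assumptions made for illustration, not anything Facebook actually runs.

    ```python
    # Sketch: isolate the structure, put it in SQL, keep the blob in a separate store.
    # Assumes `db` behaves like a sqlite3 connection (execute returns a cursor) and
    # `blob_store` is a hypothetical key/value store exposing put() and get().
    import hashlib

    def save_message(db, blob_store, sender_id, recipient_id, body: bytes):
        # Content-addressed key, so identical bodies are stored only once.
        blob_key = hashlib.sha1(body).hexdigest()
        blob_store.put(blob_key, body)  # the large payload stays out of the SQL tables
        db.execute(
            "INSERT INTO messages (sender_id, recipient_id, blob_key, size_bytes) "
            "VALUES (?, ?, ?, ?)",
            (sender_id, recipient_id, blob_key, len(body)),  # only small, structured fields
        )

    def load_message(db, blob_store, message_id):
        row = db.execute(
            "SELECT blob_key FROM messages WHERE id = ?", (message_id,)
        ).fetchone()
        return blob_store.get(row[0]) if row else None
    ```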

  5. kissingthecarpet
    Linux

    Don't know enough about Oracle but

    maybe they've got the same or similar reasons as Google had for creating "BigTable". I assume you're familiar with a product Oracle sells that does the same thing at the same speed. I don't think CERN is a valid comparison, unless the particles are sending each other emails.

    I don't care for Facebook much either, but it does sound like they've got some very bright guys working there. Larry Whatshisname doesn't need more cash in his greedy maw either.

  6. Anonymous Coward
    WTF?

    You've got the bit about making this up right...

    http://www.amazon.com/DougS-Farmville-Stratigies-Tricks-Helpfull/dp/1450227201/ref=pd_bxgy_b_img_b

    1. Anonymous Coward
      Thumb Up

      lol

      from the review: "still the book is good for begineers"

      begineers, I like that. I think you'll find mate, I'm not a beginner, I'm a begineer.

      1. mccp
        Thumb Up

        Title schmitle

        +1 for Begineer, I'm going to start using that one.

        Begineer = newly qualified software engineer (usually highly trained in AI and other useless stuff).

        It goes with "Vidiot" which is a term that we use to describe self-appointed experts who talk complete bollocks about anything to do with video signals or compression.

  7. Rogerborg
    Grenade

    Billions of dollarpounds changing hands

    Among talking-head gobshites spouting marketspeak without any inkling of what they're on about.

    And behind the curtain, maybe a dozen actual code monkeys on salary, wrestling with the cold, cruel realities of mmap() and fsync(). Rise up, brethren! Seize the means of production and overthrow your corporate mouthwhore overlords!

    1. Renato
      Grenade

      Billions of dollarpounds changing hands

      > cruel realities of mmap() and fsync()

      What mmap() and fsync()?

      AFAIK, there are no such functions in Java/PHP.

      Oh yes, maybe that's why they need a lot of iron.

  8. pan2008
    Thumb Up

    lessons learnt

    At least they realised MySQL is not up to scratch. I am not familiar with the technology they will be using, but I hope they won't make the same mistake and end up needing another 65,000 servers because of the wrong technology choices.

  9. ben edwards

    Multiple copies of same message

    Am I the only one confused about why they're sending a clone of a message to the recipients? Cloning a 200 KB message to 6 people, along with the backend to shard it out, takes way more resources than just letting those 6 people reference the original. If a reader "deletes" it, they'd simply be removing their link to the post, while still letting the 5 other readers get it...

    1. Eek

      Probably because most messages average 1k in size

      and it ain't worth it if you want to quickly write and retrieve it.

      With this sort of system, normalising data ain't worth the effort. Hard disk space is cheaper than CPU time.
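      A toy contrast of the two layouts being argued about here, with every name invented for illustration (this is not Facebook's actual schema): one shared copy plus per-recipient references, versus a private copy per mailbox.

      ```python
      # Normalised: the body is written once; each mailbox stores only a reference.
      bodies = {}          # msg_id -> body
      mailboxes_ref = {}   # user_id -> list of msg_ids

      def send_referenced(msg_id, body, recipients):
          bodies[msg_id] = body  # one write, however many recipients
          for user in recipients:
              mailboxes_ref.setdefault(user, []).append(msg_id)

      def read_referenced(user):
          # The read path needs an extra lookup per message; in a sharded system
          # the body may live on a different machine from the mailbox.
          return [bodies[m] for m in mailboxes_ref.get(user, [])]

      # Denormalised: each recipient's mailbox holds its own copy of the (typically ~1 KB) body.
      mailboxes_copy = {}  # user_id -> list of message bodies

      def send_copied(msg_id, body, recipients):
          for user in recipients:
              mailboxes_copy.setdefault(user, []).append(body)  # N writes, N copies on disk

      def read_copied(user):
          # Reads (and deletes) touch only this user's shard: disk space traded
          # for simpler, faster reads, which is the point made above.
          return mailboxes_copy.get(user, [])
      ```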

  10. Liam 8

    @Do Not Fold Spindle Mutilate

    The commercial Postgres house EnterpriseDB (http://www.enterprisedb.com/) created the GridSQL project (http://www.enterprisedb.com/community/projects/gridsql.do), which does a fairly good job of solving the large-cluster issue that Microsoft & Oracle completely ignore.

  11. Paul Johnson (vldbsolutions.com)
    WTF?

    A Pint To Anyone...

    ...who can demo a similar solution in Oracle.

    Don't make me laugh...

    1. Matt Bryant Silver badge
      Boffin

      RE: A Pint To Anyone...

      To be fair to Snoreacle, just because an app like HBase is better for one particular and very specialised task, it does not mean it is better for the much more common commercial tasks that Oracle DB is suited to. And those commercial uses probably add up to far, far more licence income than Oracle would have made if it had just built HBase. I'm sure Larry's response would just be one big "meh".

  12. Anonymous Coward
    Anonymous Coward

    Weird units here

    "Even before the new messaging system was rolled out, Facebook was juggling about 15 billion on-site messages a month"

    Useless measurement here; how about some real-world measurements, like roughly 6,000 messages (transactions) per second, globally?

    Most of those are just a few bytes (less than a DB block, 4 KB, anyway), so it's a lot, but nothing exceptional.

    Transaction tests managed to squeeze 60,000 transactions per second out of an Oracle server running on a single PC back in 2007. Not the same thing, and lab tests are a different beast from the real world, but it gives a sense of the scale the hardware can cope with.
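    For what it's worth, the per-second figure follows directly from the article's numbers. A back-of-the-envelope check, treating a TB as 10^12 bytes and a month as 30 days (both assumptions):

    ```python
    # Rough rates and sizes from "15 billion messages (~14 TB) and 120 billion chats (~11 TB) a month".
    SECONDS_PER_MONTH = 30 * 24 * 3600            # ~2.59 million

    messages_per_sec = 15e9 / SECONDS_PER_MONTH   # ~5,800/s, i.e. the "6,000" quoted above
    chats_per_sec = 120e9 / SECONDS_PER_MONTH     # ~46,000/s

    avg_message_size = 14e12 / 15e9               # ~930 bytes, well under a 4 KB block
    avg_chat_size = 11e12 / 120e9                 # ~90 bytes

    print(f"{messages_per_sec:,.0f} msg/s, {chats_per_sec:,.0f} chats/s")
    print(f"avg message {avg_message_size:.0f} B, avg chat {avg_chat_size:.0f} B")
    ```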

    1. OziWan

      Weird units here

      Large-scale computing just does not work like this. There are so many examples of people who thought they could apply 'enterprise-ready' solutions to distributed web-based apps and failed - even with basic web sites (see the france.fr debacle for an excellent example).

      I mean you no ill will, but I have been working in this area for many years, and you really do have to think in a different manner than simply transactions per second.

      Some simple examples (some mentioned already in the article; rough numbers are sketched after this list). Think in terms of the MTBF of your kit when you are running a farm that contains as few as 1,000 machines. Hardware failures become normal events that you have to engineer for. The idea that you can accept four hours of downtime per two years on your database server and still hit your 99.9% SLA does not apply.

      What about backing up 100s of TB of data? A distributed model can eliminate the need for backups.

      Latency - these database backends also need to return your profile page, your unread message count and so on. Cached values need to be recalculated.

      Creeping death. One node in a cluster goes down, increasing the load on the other nodes and causing the whole lot to go down.

      Etc., Etc.
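      To put rough numbers on the "failures become normal" point above: a sketch using assumed figures (1,000 machines, a three-year MTBF per machine, 99.9% availability per box), none of which come from the article.

      ```python
      # How often a 1,000-machine farm loses a box, and why per-box SLA thinking breaks down.
      machines = 1_000
      mtbf_years = 3.0                                   # assumed per-machine MTBF

      failures_per_day = machines / (mtbf_years * 365)
      print(f"~{failures_per_day:.1f} machine failures per day")  # roughly one a day

      # If a request has to touch several "99.9% available" boxes in series,
      # the availabilities multiply.
      single_node = 0.999
      for nodes_touched in (1, 5, 20):
          combined = single_node ** nodes_touched
          downtime_hours = (1 - combined) * 24 * 365
          print(f"{nodes_touched:2d} boxes in series: {combined:.3%} available, "
                f"~{downtime_hours:.0f} hours down per year")
      ```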

This topic is closed for new posts.
