IBM reveals secrets of Watson’s Jeopardy triumph

IBM has explained the principles behind how its Watson machine bested the world’s finest Jeopardy players, even if it can’t handle Siri. In a lecture at the University of California at Berkeley, IBM research scientist Eric Brown outlined the history of the project, and provided some details about how Watson was able to sort …

COMMENTS

This topic is closed for new posts.
  1. Destroy All Monsters Silver badge
    Paris Hilton

    Jeopardy is bizarre..

    "Its largest airport was named for a World War II hero; its second largest, for a World War II battle"

    That is not really the answer to the question

    "What is Toronto?"

    The real question would be

    "What is interesting trivia about the airports of Toronto?"

    and that's what the candidate, including HAL, I mean, Watson, should output.

    If one is just interested in finding the correct US city that matches the factoid, one would just say "The answer is: Toronto", which makes this an open-ended version of "Who wants to be a Millionaire".

    So how is the utterance "What is Toronto" even considered a correct solution?

    Inquiring Paris is inquiring.

    1. Ian Michael Gumby
      WTF?

      Huh?

      "Its largest airport was named for a World War II hero; its second largest, for a World War II battle"

      That's the question, the correct answer is "What is Chicago?" (O'Hare ORD) and (Midway MDW).

      The whole point to the game show is to state a meaningless factoid in a specific category.

      Not sure what you're going on about with your question...

      1. fatchap
        Headmaster

        Have you ever watched Jeopardy? The point is that Alex gives you the answer and you have to give him the question.

        The point is the answer to the question "what is Chicago?" is not "Its largest airport was named for a World War II hero; its second largest, for a World War II battle"

        If the jeopardy question was "A city whose largest airport was named for a World War II hero; its second largest, for a World War II battle" then the question might be "What is Chicago?".

        1. Destroy All Monsters Silver badge
          WTF?

          That's my point!

          Then why must the candidate wrap it into a meaningless question structure....

          Why not just say "The answer is: Chicago".

          I thought the idea was that it was hard for Watson to find a response that must be the original question. Well no, it's just cheap window dressing, basically adding a question mark to the end of the match.

          Here's one that would be interesting:

          HISTORICAL EVENTS:

          "He managed to defuse the situation by promising to remove Jupiter missiles from a third country, but to do so unofficially."

          Is the correct response "What is John F. Kennedy" or "What is the Cuban Missile Crisis"?

          NO! The correct response is "How did John F. Kennedy defuse the Cuban Missile Crisis?"

          Maybe the public can no longer grok the idea of a "question"...

          1. Ian Michael Gumby
            Boffin

            @Destroy all Monsters...

            Historical Events:

            "He managed to defuse the situation by promising to remove Jupiter missiles from a third country, but to do so unofficially."

            The correct answer would be "What is the Cuban Missile Crisis?"

            You gain the 'correctness' by the context of the category. "Historical Events".

            Were the category "Famous US Presidents", then you would have "Who is JFK?" for your answer.

            Your 'correct' answer wouldn't be "How did JFK defuse the Cuban Missile Crisis" because it isn't an event, now is it? It's a matter of why.

            As it's been pointed out, sometimes the framework of the show makes it difficult to answer the question properly.

      2. Ralph B
        FAIL

        @Ian Michael Gumby

        Jeopardy's USP is the A&Q format: The contestants have to find a "question" that matches the "answer" given by the quizmaster i.e. "5,280" ==> "How many feet in a mile?" Note that the question here is correctly formed for the answer.

        Now, "Its largest airport was named for a World War II hero; its second largest, for a World War II battle" =/=> "What is Chicago?" The question is NOT formed correctly for the answer. That is what Destroy All Monsters is "going on about".

        When asked "What is Chicago?" no-one _could_ ever give that answer. To match the question's form that answer would have to be something like "A city whose largest airport was named for a World War II hero; whose second largest, for a World War II battle".

    2. Nick 6
      Happy

      Its bizarreness is the challenge

      Toronto isn't the correct solution.

      And yes the Jeopardy format means some of the 'questions' which the contestants have to get end up being very clunky. I'm not a fan of the show at all.

      But when you watch Jeopardy you can understand why it's a very difficult challenge to meet with a computer compared to the ease with which humans can do it. There are some YouTube videos which explain this well and make you realise how impressive the whole thing is.

  2. Anonymous Coward
    Anonymous Coward

    "relatively small size of the data set"

    100 GB of text data is a pretty frickin huge data set if you ask me.

    The size of the most recent Wikipedia text dump is approximately 31.2 GB uncompressed.

  3. Herby

    Then you get real bizarre stuff...

    Like:

    Category: Hotel Chains.

    Answer: The TV show Maverick won this Emmy award the only year it was given.

    Question: What is Best Western.

    Go figure.

    1. Ian Michael Gumby
      Devil

      @Herby

      Actually that is a really good example of a good question.

      Your clues point to a category the Emmys and then your category points to something else. So you have the intersection of two lists.

      What is the intersection of Hotel Chains and Emmy Categories (past and present).
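The "intersection of two lists" idea can be sketched in a few lines of Python. This is only an illustration: the two sets below are made-up, abbreviated placeholders (apart from "Best Western", taken from the example above), the function name `candidate_answers` is hypothetical, and nothing here is meant to describe how Watson actually works.

```python
# Hypothetical, abbreviated lists for illustration only; not real data.
hotel_chains = {"Best Western", "Hilton", "Marriott", "Holiday Inn"}
award_categories = {"Best Western", "Best Comedy", "Best Drama"}


def candidate_answers(category_items, clue_items):
    """Return the items that satisfy both the category and the clue."""
    # Set intersection keeps only names that appear in both collections.
    return sorted(category_items & clue_items)


print(candidate_answers(hotel_chains, award_categories))
```

With these placeholder lists, only one name appears in both sets, so the category and the clue together pin down a single candidate.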

      1. The Infamous Grouse
        Terminator

        Categories

        I'm still fascinated by the whole Watson phenomenon. I remember noticing, while watching the earlier Jeopardy games, that for those answers Watson got wrong it seemed to be ignoring information in the category in favour of information in the clue. Reading this article suggests that this was by design, which at first seems like a major oversight.

        But then given examples like Best Western, in which a lateral interpretation of both clue and category is required to resolve the question, perhaps it was one of the best choices the programmers could have made. Rather than risk Watson becoming 'confused' at categories with multiple interpretations, better to throw that information away in favour of the actual clue which is usually where most of the usefully crunchable data are to be found.

        Unfortunately this choice meant that while Watson probably would have aced the Best Western question, it screwed up on Chicago.

        I wonder if there's a middle ground on this? Use the category data for the first question and, if it leads to ambiguity and/or a wrong answer, discard it for future questions from the same category?
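One way to picture that middle ground is the rough Python sketch below. It assumes a hypothetical `CategoryTrust` helper that remembers which categories have already led to a wrong answer; it handles only the "wrong answer" case, not ambiguity, and it is a thought experiment rather than anything IBM has described.

```python
class CategoryTrust:
    """Hypothetical helper: stop using a category's hint after it misleads once."""

    def __init__(self):
        # Categories whose hints have already produced a wrong answer.
        self.distrusted = set()

    def use_category(self, category):
        # Use the category hint unless it has already failed us.
        return category not in self.distrusted

    def record_result(self, category, used_category, correct):
        # Only blame the category if its hint was actually used.
        if used_category and not correct:
            self.distrusted.add(category)


trust = CategoryTrust()
used = trust.use_category("US CITIES")          # first clue: hint is used
trust.record_result("US CITIES", used, correct=False)
print(trust.use_category("US CITIES"))          # later clues in the same category
```

After one wrong answer in a category, the helper reports that the hint should be ignored for the rest of that category, while other categories are unaffected.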

  4. Anonymous Coward
    Anonymous Coward

    Only 100GB per question.

    This to me says that it's really only a toy; that the algorithms aren't very good but rely on massive data to make up for it. This is not to slag its achievements, just that, well, what human has 100GB data in his or her head? Meaning that we're still better at distilling general purpose information even if it takes us years to assimilate lots of info; we are horribly bottlenecked, but aren't limited to binary processing.

    I don't mind too much machines overtaking us in specialised fields. We are a humanity full of outliers that vastly surpass most of the rest to the point that if you're good enough in your specialisation you end up with only a few peers in the entire world (and probably have more than a few autistic traits, but I digress). The real question is, now that we have it, what do we do with it?

    Just to speculate on the RoTM theme some more: Suppose we do outsource our decision making to mechanical autists like Watson here--assuming that'll happen which in reality doesn't stand much of a chance as our politicians aren't very good now but far too attached to power and votes to let go voluntarily. How do we at the same time prevent dystopia? How is that different from politicians riding roughshod over our rights while we try and prevent dystopia? It's a different environment with different rules but overall it's a similar problem.

    1. ducatis'r us
      Happy

      my brain's bigger than yours then!

      Well if your brain's only got 100Gb I'm surprised you are reading this!

      http://www.scientificamerican.com/article.cfm?id=what-is-the-memory-capacity
