If you've been watching the first two days of the Jeopardy! game show pitting two of humanity's trivia champs against IBM's Watson question-and-answer machine, you probably had a sinking feeling – mixed with a sense of awe. All-time champ Ken Jennings put up a good fight, and Brad Rutter stole away a few questions, but the …
what is Jeopardy! ?
This hoopla would make more sense if IBM had chosen a subject known outside of the US.
Jeopardy was probably the better choice
As it's very tricky... The tricky wording of the clues is something computers just can't handle.
Building a Who Wants to Be a Millionaire bot, by comparison, should be relatively easy due to the limited number of possible answers.
everyone knows Jeopardy!
It's one of the central themes to White Men Can't Jump!
Watson vs Gloria would be a better contest.
Congrats to IBM
Now let's force Watson to play a game it hasn't been exclusively prepared for, to see how it does. This is no different than the supercomputer built exclusively to beat Garry Kasparov at chess.
Wake us up when you figure out REAL artificial intelligence. Personally, I don't think for one second it will ever happen since the ability to reason for any machine is only as good as the programmer setting about to create the rules for how to reason.
Congrats to you, Captain...
...for demonstrating in one brief post how little you know about Watson, computer chess, AI, programming, and human intelligence. Quite a feat!
Watson is dramatically different from chess-playing computers. It has to figure out the meaning of an ambiguous and convoluted natural language sentence, then come up with a context-relevant answer. This is *very* hard, and Watson does it better than the two best humans ever to play the game.
By contrast, computers play chess largely by brute force. Sure, sophisticated algorithms guide and prune the tree search, but the way computers play chess bears no relation to the way humans do, and consequently sheds no light on human intelligence.
As for your comment that "the ability to reason for any machine is only as good as the programmer setting about to create the rules for how to reason", that is just flat out wrong. It is a complete non sequitur. Programmers have been writing expert systems for decades that can reason about domains that their programmers know nothing about, and do it better than human experts. You're confusing the rules with the outcome of the rules. It would make as much sense to say "the ability of a robot to weld joints on a car is only as good as the programmer setting out to create rules for how to weld" (and by the way, most programmers are crap welders). This is obviously and demonstrably absurd, but somehow, when the skill under threat is reasoning instead of welding, otherwise-intelligent people go all moony.
As for your reference to "REAL artificial intelligence": what would you define as "REAL"? I'm guessing that it's along the lines of "whatever computers can't do yet"...
find me 10 examples...
of humans that can do that.
here's a hint... what we call 'learning', computers call 'programming'
AI is inevitable, although I'd suggest it takes more than one second to realise it.
It's relatively trivial to design a piece of code which, using say back-error propagation, will modify its own answer space based on its experience (you say learning, I say programming!), and when such a piece of code is 'trained' it will have mapped its inputs to its outputs ITSELF, irrespective of the insight of the programmer.
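To illustrate the point, here's a minimal toy sketch (nothing to do with Watson itself) of exactly that kind of self-programming code: a tiny network that learns XOR by back-error propagation, ending up with an input-to-output mapping the programmer never wrote down.

```python
# Toy back-error-propagation sketch (illustrative only, not Watson's code):
# a small 2-8-1 network trains itself to compute XOR from examples.
import numpy as np

rng = np.random.default_rng(0)

# Training data: the four XOR cases and their target outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Random starting weights and zero biases for a 2-8-1 network
W1 = rng.normal(size=(2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(size=(8, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(20000):
    # Forward pass: compute the network's current answers
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the output error back to every weight
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out).flatten())
```

The programmer supplies only the learning rule; the mapping from inputs to outputs is discovered by the code itself from its experience of the examples.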
"Watson does it better than the two best humans ever to play the game."
No, it doesn't. Wake me when Watson beats a human champ *with internet access* --- because that is essentially what the system has...
The Stoker clue, for example, comes up in the top Google results if you just type in parts of the query, so the only thing Watson has to do is figure out "what is an author?" etc.
The achievement is obvious, and the real-world applicability too --- but how many strategic decisions would you let it make? Evaluate proposals, yes; but similarly, would you let an accountant devise a business? No. Indeed, "now go away middle management, or I'll replace you with a very short shell script" (for a query to Watson).
@ AC 12:44 GMT
First, don't be such a pussy and gladly show your face next time.
Second, in what way does Watson actually "figure out the meaning of an ambiguous and convoluted natural language sentence, then come up with a context-relevant answer" better than Brad Rutter and Ken Jennings? I understand that Watson was able to defeat them, but your nonsense about being better at listening and speaking than two human beings is off the idiotic scale. You're such a grammar nazi that you should try reading your own drivel next time.
Third, the limits of any system are defined by the very people that create the system. That's not an opinion I happen to find nice and believable. It's a fact. And while I understand that programmers have options available to account for unknowns, that doesn't change the fact that imperfect people create imperfect systems.
BTW, bravo on the welding example. That actually makes sense, though you fail to understand why. The ability of the programmer to weld is not in question, but their ability to translate welding techniques into code would be questionable. Reasoning is required for the translation and there's nothing absurd about the reasoning required.
If you aren't familiar with the SciFi versions of AI that permeate books and movies, that's what real AI would be in my mind. A system that can reason for itself. That's not programming, that's learning and there's a difference.
It's the timing, and the fix was in
While it is amazing that the machine can answer questions that are phrased obscurely, I think that the win comes down to the machine's ability to press the answer button at the correct time.
From what I understand, the key to winning is getting the opportunity to answer the question. If a player buzzes in too early, they are locked out for half a second or so.
The computer is always going to be quicker to push a button or flip a switch. I'd like to see the results if they had a person from IBM pressing the buzzer, then let the computer answer.
Let's not forget that this was just a long commercial for IBM that was edited for time, and IBM had home court advantage. I have read that Watson crashed many times during taping.
Let's see it win under normal time constraints, and while "hearing" the questions, rather than getting the exact text of the question.
The buzzer is not the issue
1: You still need to have the right answer to win, no matter your buzzer speed
2: Humans can start to push the buzzer a little early, before it clicks - in fact this is what the "best" players do, getting into the rhythm such that the button is about to cross its contacts as the buzz-in time becomes available - whereas Watson cannot begin to push the button until the question is completed.
3: Given the known lightning fast buzzer reflexes of one of the flesh contestants, Watson was not considered by the producers to have an advantage in this area at all.
An advantage or otherwise was never really the point
Exactly. And even if Watson did have an advantage when it comes to button pressing speed, it would be cancelled out by the fact that it doesn't receive the text input until Trebek finishes reading the question.
All that time while Trebek is reciting, the human players are reading the whole question from the screen. They're already formulating possible answers, determining how confident they are, weighing up the risk of buzzing in against holding back. Watson doesn't get to begin to do any of those things until Trebek finishes speaking and the text is fed in. Watson is fast but the time taken for its computations is not zero. In many cases one of the human's switches will be well on its way to hitting the contacts before Watson has even begun to make sense of the question.
In fact there was strong evidence of this during a couple of the last day's rounds, especially the one about computer keys which, ironically, Watson flubbed badly. In many cases Watson's potential answers appeared on screen a distinct fraction of a second after one of the other contestants buzzed, and quite a few of the answers it had come up with were wildly off base. It actually played so badly in a couple of those early rounds I thought one of the opponents had sneaked into the server room and yanked a few boards.
But we're missing the big picture here. Jeopardy is just a canvas on which this experiment has been played out. You could tweak the rules to benefit the machine or the humans, or you could tweak Watson's algorithms to make it better and better at Jeopardy and witness the law of diminishing returns with each iteration.
But it's not about building a computer that can play TV quizzes, fun though it's been to watch. It's about building a machine that can process 'human-formatted' information in a way that machines have never been able to do before.
And that's an amazing achievement. One which opens up almost limitless possibilities once implemented in other fields.
Imagine a Watson-like system freed of the limits imposed by the game show format. One that could be fed dozens of related queries at once, not just a four- or five-line quiz question. One that didn't have an artificial time limit placed on its ability to calculate the best answer, but could cogitate on it for hours or days. One that wasn't limited to its built-in database but could use the best answers it had already come up with to go out onto the internet and search for more information, refine its answers further and use those to search deeper and so on. The possibilities are staggering.
I cannot understand how anyone, especially anyone reading a technology website, can not be blown away at just what a stunning achievement Watson is. Ever since a teacher brought a ZX80 into my school three decades ago and I marvelled at this little programmable box, I have been following advances in the world of computing and been ever more amazed at every turn. And Watson is by far the single most impressive thing I have yet seen, by a massive margin.
Kudos, IBM. With Watson you have entertained me and you have awed me and I think you may genuinely have shown me a glimpse of the future.
"Does that mean that it's Game Over for humans, that robots will keep us as pets?" Etzioni says no. We say: "Middle managers, get used to the dog food."
Who is the dog food ? -- Watson
So it's all about trivia, is it?
Why the fsck did I waste years at university studying for a BSc, and thousands on textbooks, when I should have been watching TV and Woman's Weakness?
The reality of AI is that the goals of challenging human intelligence are far too lofty.
Last night I spent 5 minutes chasing down and killing a fly. That little bastard has only a few neurons to rub together yet outperforms any robot in existence on navigation, threat avoidance, aerodynamics and survival.
The AI people would do far better to start by trying to emulate a beetle before they try to match humans.
My understanding is the next goal for the Watson Project is to turn the "Normal Human Language" processing ability into a medical diagnostics machine.
As someone who works in tech support, I could see this thing doing my job one day, when it's perfected a bit more.
Obvious joke alert
There already are humans kept as pets by machines - they're called "iPhone owners"!
I'm here all week, unfortunately.
Watson vs. a 10 year-old + internet?
OK, so a computer plus the massed brains of IBM beats merely the massed brain of 1 person. Fair enough. However what would be a more interesting comparison would be to pitch this (presumably standalone) machine against a reasonably net-savvy child with a search engine.
I've never seen Jeopardy (is it an "only in america" thing?), but it seems to revolve around the contestants extracting a subject and some keywords from a "clue" and then solving the subject from the greatest correlation of keywords. In which case it needs a large knowledge base, some experience of how to extract the salient elements of the question and a strategy on deciding your confidence level / how much to bet. That's where the future lies.
Presumably tv-watching nerds already play along on their computers while watching the show. I would be surprised if they didn't wipe the floor with "ordinary" players - even the best ones.
What exactly would that prove?
The goal here is to try and make a computer, as much as possible, understand plain human language. A 10 year old already has a reasonable grasp of the same.
Only that the requirements to win are a 10 y/o's grasp of English and access to a large body of information.
Given that yer average tabloid has a "reading age" target of 10-12, then a computer with those abilities should be able to answer most questions that a tabloid reader would be likely to pose. Though whether a computer would be able to simulate the hysterical moral outrage that tabloids have made their own is a more interesting issue..
Q & As
Humanity's humiliating defeat: questions, answers and graphs here http://www.j-archive.com/showgame.php?game_id=3577
Where is our Mycroft in our hour of need? :)
I'd rather have Mike than Watson
Personally, I'd rather have MYCROFT than Watson.
Mine's the one with "FREE LUNA! TANSTAAFL!" on the back.
Brace yourself for the torrent of commentators, here and elsewhere saying "oh, it's only a general knowledge quiz, obviously computers will be good at that..."
Like they said about chess, in fact.
AI has been a much harder road than anyone imagined, and Watson is just a step. But the multiple competing algorithms, the just-deep-enough linguistics, and the relatively huge knowledge base are real forward moves.
If I was a mad genius, I would now be looking at a Watsonesque approach to systems that can infer stories about our inner lives from knowledge of our actions. The knowledgebase for that is a bitch, though.
Not entirely accurate
Considering there were two humans and one machine it is a bit unfair to measure humanity against the machine and only factor in the score of one human. If you combine their scores and increase the final jeopardy wager of Ken Jennings to 19K then things get a lot closer (although the machine still wins).
The Skynet Funding Bill is passed.
The system goes on-line August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.
That's the right one:
But I miss you so much till then
I met someone who looks a lot like you
She does the things you do
But she is an IBM.
Tech not ready....
Despite its amazing win... I don't think the tech is quite ready to start replacing people as researchers and call centre workers... There were some subjects where it clearly had no clue (Name the Decade, keys on your computer) and others where the question really had to be worded just right (it got the Day 2/Game 1 Final Jeopardy question from the US cities category wrong; however, I heard that when they re-fed the question later with the words "US city" included, it got it right).
I confess that I'm intrigued by the wagering Watson did on Daily Doubles and Final Jeopardy!, which were quite a source of amusement. I'd like to know how he came up with those amounts.
My guess is that he calculated them according to some formula based on his confidence level for the category and his current score.
a bit of whimsy?
I too was intrigued by the oddball amounts. I suspect there was a bit of whimsy employed by the programmers in that algorithm. Like: the number has to end in 7.
But I'd be pleased if someone with actual knowledge spilled the beans.
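For what it's worth, the kind of rule being guessed at here is easy to sketch. This is purely hypothetical (not IBM's actual wagering algorithm): scale the bet by confidence in the category, then bolt on the speculated "ends in 7" whimsy.

```python
# Purely hypothetical wager rule of the kind speculated above --
# NOT IBM's actual algorithm. Bet a slice of the current score
# proportional to confidence, then force the amount to end in 7.
def daily_double_wager(score, confidence, max_fraction=0.5):
    """Return a Daily Double wager from score and category confidence."""
    base = int(score * max_fraction * confidence)
    base = max(base, 5)            # respect the minimum wager
    return base - (base % 10) + 7  # whimsy: always end in 7

print(daily_double_wager(score=20000, confidence=0.8))  # 8007
```

With a $20,000 score and 80% confidence this bets $8,007, which is at least the right flavour of oddball number.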
Watson is quite probably related to IBM's old TREC-QA systems. For TREC-8 (in 1999), they used a combination of shallow parsing to get search keywords from the question and to extract typed entities (names, roles, lengths, ages) from the document corpus. That system filtered the search results to find answers appropriate to the question - a question starting "Who ...?" clearly wants a name in response, "How far ...?" will be a distance, etc. It was one of the more effective attempts, but 'understood' far less than competing systems. (Microsoft took that to an extreme in a later year and produced a QA system which searched Google for the answers on the basis that someone somewhere will have used the exact phrasing, then looked in the corpus to find them, understanding absolutely nothing)
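The shallow question-typing described there can be sketched in a few lines. This is my reconstruction of the general idea, not the actual TREC-8 code: map the question's opening words to the entity type the answer should be, then filter candidates by that type.

```python
# Sketch of shallow question typing (a reconstruction of the general
# TREC-QA idea, not IBM's code): the question's lead words determine
# what entity type the answer must be.
QUESTION_TYPES = {
    "who": "PERSON",
    "where": "LOCATION",
    "when": "DATE",
    "how far": "DISTANCE",
    "how old": "AGE",
    "how many": "NUMBER",
}

def expected_answer_type(question):
    """Return the entity type the question's opening words call for."""
    q = question.lower().lstrip()
    # Try longer openers first so "how far" beats a bare "how" match
    for prefix in sorted(QUESTION_TYPES, key=len, reverse=True):
        if q.startswith(prefix):
            return QUESTION_TYPES[prefix]
    return "UNKNOWN"

print(expected_answer_type("Who wrote Dracula?"))    # PERSON
print(expected_answer_type("How far is the Moon?"))  # DISTANCE
```

Candidate answers extracted from the corpus that don't match the expected type get discarded, which is why such a system can work while 'understanding' almost nothing.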
Adapting such a QA system to Jeopardy is an interesting summer project. It's different since you don't get given the question - but you do get given a category and questions often start with "this <type of answer> ..." or similar. And then there's the betting strategy involved with Daily Doubles.
You can see where Watson has had a lot of problems - categories that involve puns in the answer ("'church' and 'state'"), categories with a specific connection it can't resolve ("Actors who direct", "Also on your computer keys"), groups of years ("Name the decade"). Ironically, Watson could probably have answered the much harder "The year the first modern crossword puzzle was published" better than the looser "The first modern crossword puzzle is published & Oreo cookies are introduced" (to identify a decade). Those can be fixed, of course, but then you're just building a machine to play Jeopardy slightly better, not actually solving anything.
I've read this article and the other one and I still have no idea what Jeopardy is. Some links in the article might have been useful. I googled it and found a show from 2004? Is that what you're talking about? The show that was on tv 7 years ago?
The gameshow is called Jeopardy! (note the exclamation point) and originated in the '60s. Alex Trebek has been its MC since 1984. Visit http://en.wikipedia.org/wiki/Jeopardy! for more information.