A simulated talking head that can apparently express human emotion has been created by engineers in Cambridge, England. Think Red Dwarf meets, er, Hollyoaks. The computer system called "Zoe" can generate voice and facial expressions from typed text and - more importantly from a commercial perspective - can potentially be used as …
Will certainly be interesting to see what games do with this
Was playing Mass Effect 2 recently and was struck by how "forced" all the facial expressions looked. Very disappointing, given that the fully pre-programmed nature meant they really could have done better than "smile/neutral/grimace".
So something that lets them cheat would help a lot for making believable characters.
Although nothing can save them from such terrible dialogue...
No Max Headroom version?
Did I just see the word 'thesp' used to describe a Hollyoaks actor?
Hmm, we have come to expect some semblance of accuracy from the Reg
I see no need for this.
Or, in the vernacular, "do not want".
Re: I see no need for this.
Speak for yourself. I want a Hollyphone.
EDIT: That's a Norman Lovett Hollyphone, for preference, but a Hattie Hayridge one would be cool too.
..her eyes are focused just above your head - the urge to look up is very strong in this one... it's like Skype with the un-initiated.
This reminds me of "Box" from Star Cops?
I think Star Cops is still very technologically accurate. Apart from the space bits ;)
Star Cops had a digital assistant called "Box". Apparently the original idea for Box was that it would use the voice of the owner, although that was dumped during filming.
So the original Box idea was that it should ring someone and basically impersonate its owner.
Gee, it's come around again. Anybody remember the 'headcasting' application that once shipped with the Matrox G550 graphics card? This seems to be the great-grandson of that technology, 12 years on. It didn't catch on then, and I'd be surprised if it does this time around.
If this can be developed, then a user could, for example, text the message 'I’m going to be late' and ask it to set the emotion to 'frustrated'. Their friend would then receive a 'face message' that looked like the sender, repeating the message in a frustrated way.
That sounds just... awful.
It is, ... absolutely awful
I do wonder if it would not be simpler to, you know, insert a camera into the phone, and, like, image or film the actual expression of the sender rather than simulate it, but perhaps that technology must first be developed
Can the software please add a sarcastic expression to the above statement?
It is of course handy if you want to send an insincere emotion, but that would of course be unethical, so nobody would do that
^ Same emotion again, please
Re: It is, ... absolutely awful
On BBC News this morning they showed "her" in angry mode shouting "You're late! Where on earth are you?" in a way that I found really quite unnerving. I suspect anyone feeling a little harassed would not want one of these.
Douglas Adams thought the idea ludicrous 30 years ago, and this version is no better
This sounds oddly reminiscent of a Douglas Adams interview I once read, that was given around *30* years ago:-
“[MIT] were showing me some research they were doing on video telephones. They reckoned that everybody has a number of people they regularly speak to on the telephone, so at your telephone you could have a small computer, storing video pictures of those people. When somebody rang you, a phonetic program would find the right picture and move the mouth in time with the words.
“They were very pleased with this [but] if you look at that logically you’ll see that this is not increasing communication — it is actually decreasing it.
“If you talk to somebody on the telephone your attention is concentrated on what they are saying. When you talk to somebody face to face or even on a television screen you get the message partly from their gestures and the expression on their face. But if you are seeing a picture which is not giving you any additional information the two impressions are totally contradictory.
“If someone rings up to say ‘Oh God, I’ve just gone bankrupt’ or ‘My wife’s run off’ and you have this bright, smiling picture with the lips moving in an utterly grotesque way, it is not actually helping you to understand what the person is saying.
“The whole project is ludicrous and self-defeating but I couldn’t get the researchers at MIT to understand that.”
Now, this present-day equivalent is- in theory- attempting to match the mood of the person to the head. But in practice, it's still closer to Adams' ludicrous example above. It's applying a generic, pre-packed, pre-defined expression to the face that will convey none of the subtleties of *your* real expression and how that conveys your *actual* emotions. In short, it tells you nothing more than a single word or phrase covering your mood would.
In this way, despite its superficial improvement over MIT's early-80s example, it has *exactly* the same problem- it actually *decreases* communication by distracting from the content with misleading visual content.
Re: It is, ... absolutely awful
"On BBC News this morning they showed "her" in angry mode shouting "You're late! Where on earth are you?" in a way that I found really quite unnerving. I suspect anyone feeling a little harassed would not want one of these."
And anyway, isn't shouting at you that way the privilege of the missus?
Re: Douglas Adams thought the idea ludicrous 30 years ago, and this version is no better
That's only in that particular use case. Where I think this might be useful is in human-computer interaction, for example, giving Siri a face or making characters in computer games more realistic and variable.
Re: Douglas Adams thought the idea ludicrous 30 years ago, and this version is no better
Wasn't there an Iain M Banks novel that had a message being sent as a transcript of the sender's mind-state so that the recipient could question them for all the details that were necessary? I remember the feeling of distaste when the temporary mind was destroyed after the message had been delivered...
Couldn't they have used better photos? I mean - the colour is really... drab.
What happened to usefulness of this project?
It's dead, Dave.
What? What about the cool factor of having Siri with a face? What about that?
It's dead. It's all dead, Dave.
Doesn't anyone want to scan their head and send it to all their mates so emoticons come with their own face? That's a great idea!
It's dead. It's all dead, Dave...
This seems utterly pointless. We have plenty of messaging options for things like this already.
If I need to let someone know I'm late using my mobile phone I can:
1. Send a plain text SMS message
2. Send a Picture and text MMS message
3. Send a short video as an MMS message
4. Make a voice only phone call to them
5. Make a video call to them
Not to mention the myriad of social media and e-mail options.
The thing is, when was the last time you did anything other than use a text only SMS or a voice only call for a message like "I'm going to be late"?
If it was for a longer conversation than that, this technology would essentially be replacing human interaction with digital avatars.
Researcher mistake #1 ...
... being drawn on the question "but what is it for?"
The answer they gave, an animated message avatar, was presumably an off-the-top-of-the-head remark which has been rightly ridiculed. You're better off at one of the extreme ends of the groovy-boring spectrum - either: "Because it's cool, the use of this is limited only by your imagination" or: "to determine whether emotional facial modelling could be achieved in a smartphone-sized application"
I always liked the answer, "this is the research department, ask the chaps over there in development".
Once you have deflected the question the reporters and commenters will do your work for you: Richard12 probably nailed it in the first comment --- animating NPCs in video gaming.
Re: Researcher mistake #1 ...
it's basically a very complicated and expensive smiley?
My thoughts exactly -- it could be developed so that if you send a text with a smiley, the face changes expression accordingly.
Though a slipped finger could prove embarrassing if not careful when expressing your "sincerest and deepest sympathy on the death of your cat ;-p".
What is it for
1) NPCs in games
2) Siri with a face - customize your digital assistant with the face of your favourite actor/musician/politician :-)
3) Kiosks with a friendly face - for everything from buying a soda to subscribing cable TV or searching for tourist sights
4) Automated contact centers - with the new HTML extensions, soon every browser will be able to make video calls, and the "Click to Call" or chat will have you talking to Zoe
Automated call centers with a friendly face
Re: What is it for
#2 I don't want Siri to have a face, I want them to license HAL or GLaDOS. Make me happy already, you Apple bastards. Same goes out to Google: get either demented AI as the face of Google Now and you can creep through my most intimate of dox to your heart's content.
The other way round
> “Present day human-computer interaction still revolves around typing at a keyboard or moving and pointing with a mouse.” said the University of Cambridge's professor Roberto Cipolla. “For a lot of people, that makes computers difficult and frustrating to use. [..]"
For me that is precisely one of the advantages of computers... Voice communication is just too ambiguous.
And the idea of a photo-realistic computer-controlled face image is just too weird. Especially if it looks like some real person. Think of the interesting possibilities of misusing it. A video of me apparently saying stuff I would never agree with, or worse...
Nuke this invention from orbit before it is too late.
"The team is now looking for guinea pigs to test the system"
I'm not sure that many guinea pigs have 'phones......
My chinchilla Jeff might be interested, though. I'll ask him when I get home (although he does have a record of attempting to consume consumer electronics).
Reminds me a bit of this:
But sort of the other way around, mapping other faces onto your face in real time.
Still very robotic
It doesn't seem a million miles away from the old Ananova virtual newscaster lady from about ten years ago with a little dash of the nVidia bimbo-with-slider-adjustable-face thrown in. Still very stilted, but a big improvement on any performance in Hollyoaks.
The sentinel on the Moon was detecting that 99% of Earth's data transmissions were total crap and then this tipped it to 100%. <Upvotes P.Lee then runs away>
Had to go looking for a video
What if someone hacks into my avatar and uses it to make video calls that people think are from me?
That means I can make rude calls myself and then deny everything!
In fact, this isn't me now! As far as you know...
I'm disappointed, El Reg commentards.
24 posts and no mention of how Cambridge Engineers give good head.
This is an SOS distress call from the mining ship Red Dwarf.
The crew are dead, killed by a radiation leak. The only survivors are Dave Lister, who was in suspended animation during the disaster, and his pregnant cat, who was safely sealed in the hold. Revived three million years later, Lister's only companions are a life form who evolved from his cat, and Arnold Rimmer, a hologram simulation of one of the dead crew. I am Holly, the ship's computer, with an IQ of 6000, the same IQ as 6000 PE teachers.
a user could, for example, text the message 'I’m going to be late' and ask it to set the emotion to 'frustrated'
So much simpler than texting the message "I'm frustrated because I’m going to be late", or "I’m going to be late. How frustrating!"**. Or even, perish the thought, phoning and sounding frustrated.
Also, I'm a bit concerned that the face might express, say, sexual frustration, rather than mild annoyance at being late.
** OK, Shakespeare, that's enough emoting.
What do you mean you didn't know?
My phone's talking head rang your phone's talking head and set up the appointment.....
Re: What do you mean you didn't know?
That's O.K., we'll just send our android avatars to the meeting instead.
I just hope they don't elope.
I want queeg!
Let's have a funky gym instructor type.
It has a six in it, but it's not 6,000
I assume sooner or later they will come up with 'looks sick' and 'sounds honest' so we can book a day off and fiddle benefits.
I have watched Red Dwarf of course...
...but I have only ever watched the last three minutes of Hollyoaks - I'm sure other fans of Channel Four News will know the feeling!
Paris because I'm pretty sure she is an extra in Hollyoaks!
RE: Very disconcerting...
"..her eyes are focused just above your head"
The male version (for women's phones) will be gazing down.
Voice is crap
Considering the quality of some text to speech systems where the voices sound very natural, "Zoe"'s is very robotic.
Holly's OK, but I like Mike
Having Holly would be OK, but I'd rather have Mycroft/Michelle/Adam Selene/Simon Jester for my phone.
If for no other reason than being able to order a "delivery of rice" to certain people and/or locations....
Upvote for reference to "The Moon is a Harsh Mistress"
Forget about the graphics
I got a completely different impression of what this research was about when viewing the BBC video clip versus reading this Reg article.
Everyone seems to have latched on to using this technology to make a text message spoken, which is pretty pointless.
The first commentator was bang on the money. The interesting thing about this piece of research is that it is taking written text and then, using a form of markup, producing a face that reads that text with realistic facial expressions and voice tonality. Sure, plenty of characters in video games do this already (while having more realistically rendered faces), but with those characters the facial movements and voice were recorded from an actor in a motion capture studio. In this case the facial movements and voice tonality are being procedurally generated. Do this realistically enough and you won't have to get an actor to pre-record everything you want a character to potentially say.
So this research is basically looking to improve the 'speech output' side of computer interactions/games. Good luck on solving the speech input side though. We'll probably be using dialogue wheels for input for a while yet.
Now now now that would be ca ca ca cool
Re: Max Headroom
"Now now now that would be ca ca ca cool^W^W".
There. Fixed that for you. You are quite welcome. No charge.