Speech recognition has been a technology coming of age for an age. It got a shot in the arm recently with the launch of the iPhone 4S, where the S stands for Siri, the speech recognition company Apple bought. Siri may be trendy, but the most mature technology is on the PC and comes from the company Nuance bought. Nuance Dragon …
Does it normally drop in price over its lifetime? Or is the previous version available cheaper?
As things stand I don't think I could justify this.
The professional version is, but I have bought the home version for £40 on an email offer from Nuance without headphones on a download recently. We have never needed more than the home version to be honest. It is extremely effective for long tedious sessions of technical writing, especially if you have a clue about what you are going to say.
I find though, that your mind has been trained to produce text at the cadence at which you type. So using it for a completely creative work, when you are thinking about the text as you go along results in the same pace of development whether talking or typing.
I do rate it though, and will never go back to not having it.
Re: "Expensive" If what one wants first and foremost is accurate dictation/voice command......
................rather than the full-song-with-choruses integrations provided by the premium version then the "Home" edition is fine and reasonably good value for money. It also occurs to me that as tablets get more powerful it might be a useful facility on an x86 tablet for knocking off a (relatively) large amount of text input when on the move without the painful experience of trying to do that scale of input via a virtual keyboard. Something like Sammy's 11 inch "Slate" series or the Surface Pro should cope perfectly well with a program like this. A combination of voice and a modern stylus, for example, could be very effective indeed. Being able to choose how one interacts with a piece of kit (as the kit is now getting way more powerful than it was) whether it's voice, touch, stylus or keyboard/mouse or any permutation thereof that suits may produce some very interesting developments in the way we actually use our devices.
I'm very interested in this. I'm writing a book at the moment and sometimes my old RSI war wound comes back to haunt me. This might be a solution.
Can anyone advise how punctuation works? Do I have to say "comma" or will it insert one if I leave a small pause in my speech? Full stops?
Is it contextually aware? For example, if I say "there" will it use the correct version (there, their). What about to, too, and two? Is it able to determine the correct version to use depending on the context of the sentence?
Does it know how apostrophes work?
"This is John's phone"
Just curious about how much I have to explicitly tell it, and how much it can work out for itself. For example, if I were dictating to a human, the human could work it all out for him/herself.
Thanks for the review of this interesting product.
I use the predecessor (11.5) to help with my incipient RSI as well.
Punctuation you have to spell out. It'll likely have a good crack at some commas in lists say.
Yes it's contextually aware. That's the whole point of the software and where it gets its reliability from. It also knows how apostrophes work.
The important thing - which this "review" completely missed - is that the program is adaptive. This is also the big difference between it and cloud versions (as though anyone cares where it runs!). When you use it it will make mistakes. You'll be tempted just to go and correct the errors using your keyboard. Don't. Instead you need to invest the time to correct mistakes using DNS itself. Then it will learn what you meant which increases the reliability hugely.
I've been using DNS 8 for what must be getting on about 10 years now, on various machines and I'm replying to you using it. These days I had to run it inside VMware because for some magic reason 32-bit Windows software won't run on 64-bit Windows.
I couldn't manage without it.
The article says that you should use high-quality microphone, my experience is that £60 headphones aren't very good whereas a £15 rubbish headset with a little treat [that should be battery] booster box works considerably better. A quiet room is essential, and I have found that it is a lot easier to dictate while looking away from the screen because otherwise you are reading your own words appearing and there's some kind of feedback loop which screws you up -- to tip, that.
To answer your questions, there is a mode where it apparently can guess the punctuation based on your hesitations, but I never tried it. I tend to explicitly speak the punctuation like this:, ', ",;, -- these are all spoken ("comma", "apostrophe", etc). When you first get going you have a strong and conscious temptation to say "hello comma Jim" in normal conversation. Quite disconcerting, it goes away fairly soon.
It can certainly make a good guess at what you intend by context, although most regularly for me confuses 'to', 'too' and '2'.
Some examples in context (direct and uncorrected)
"just the two of us"
"this is John's phone"
Harmony and confusion is inevitable [that should have been homonym confusion is inevitable], and you really must carefully proof read any e-mails you send out, the weirdest and sometimes slightly offensive stuff can appear.
The claim of 99% accuracy, or whatever it is, is always going to be complete rubbish in less [unless] it's the most ideal text spoken most clearly under the most perfect conditions.
Anyway, it's a bloody miracle for anyone with RSI.
It's even possible to program with it, albeit painfully slowly. It's certainly not designed for it.
The above was almost completely done with Dragon, with a few corrections by hand (other than the ones I've marked).
Please could you say which is the " £60 headphones [and] £15 rubbish headset with a little treat [that should be battery] booster box"?
Posh headphones were Andrea noise cancelling jobbies, ANC-750. Was so bad the company assumed it was a fault and sent a 2nd pair - same result.
Cheapo ones which work much better (though not fantastic) are labelled Genius, no model number (though they have an L on one side and an R on the other).
The battery booster box came with the Andreas and gives a little extra juice to the 'phones in case the USB port doesn't provide enough. Takes a pair of AAAs.
Incidentally, when I gave the 'John's phone, Peter's cat' example above, the apostrophes were inserted automatically. You do have to say them explicitly if you want them around something 'like this' (you'd say "open single quote like this close single quote"). Also the use of two in 'just the two of us' worked because it recognises the word in context (I think it uses the two preceding words to guide its choice of the third, or maybe one before and after. Something to do with hidden Markov models or summat).
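To make the "two preceding words guide the third" idea concrete, here is a toy sketch in Python. It is not Dragon's actual algorithm, which nobody in this thread has seen. It simply counts which of to/too/two followed a given pair of words in a tiny made-up corpus and picks the most frequent. The corpus, the function names and the "return None if never seen" rule are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical toy training text; a real recogniser would learn from far more data.
CORPUS = [
    "just the two of us",
    "the two of them went home",
    "i want to go home",
    "we want to go too",
    "that is too much",
    "it is too late to go",
]

HOMOPHONES = {"to", "too", "two"}


def build_trigram_counts(sentences):
    """Count (w1, w2, w3) word triples, padding each sentence start with <s>."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>", "<s>"] + sentence.split()
        for i in range(2, len(words)):
            counts[(words[i - 2], words[i - 1], words[i])] += 1
    return counts


def choose_homophone(prev2, prev1, counts, candidates=HOMOPHONES):
    """Pick the candidate seen most often after (prev2, prev1); None if never seen."""
    scored = {c: counts[(prev2, prev1, c)] for c in candidates}
    # sorted() makes tie-breaking deterministic (alphabetical order).
    best = max(sorted(scored), key=scored.get)
    return best if scored[best] > 0 else None


COUNTS = build_trigram_counts(CORPUS)
print(choose_homophone("just", "the", COUNTS))  # prints: two
print(choose_homophone("i", "want", COUNTS))    # prints: to
```

With no matching context the sketch gives up and returns None, which is roughly where a real product would fall back on acoustics or on what it has learned from your corrections.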
It does do all of those, and for us dyslexics it's magnificent.
I've been using Dragon 10 for a while now and may upgrade to 12 with a bluetooth headset.
My daughter teaches dyslexic students and uses version 11 which she is very pleased with.
Is that from this company?
£40 for Home
£90 for Premium
Has it ever improved on the broad Scots accents ?
I usually make less mistakes typing than DNS does translating even though I generally speak clearly and try to enunciate / articulate correctly. (I have tried giving a little tiny pause between words as well.)
DNS for me is one of those Silver Bullet products that never managed to make the grade, although I understand that for some people it can/might work as announced.
"I usually make less mistakes"
You mean "fewer mistakes".
There's an app for that
Ubuntu speech input on Google Play....
Nuance == it's all about the bottom line.
Shame Nuance decided to shaft the people using a Python-based extension which is a million miles better than the so-called Visual Basic scripting that comes with the more pricey version. Nuance never stated that they'd be supporting the extension (known as NatLink) but it's hardly fair to "unintentionally" provide support for something for 10+ years (while people spend hundreds of hours of their free time improving it), only to stomp out vital parts of the API with the release of version 12.
If you want good speech recognition, a great scripting language and tons of useful features not included with DNS I'd say go buy the cheaper version of 11.5 - I personally have no intention of touching version 12, or future versions, until Nuance reinstate NatLink. Or failing that, provide a scripting language that's not ancient and, at best, useless.
It's been around since Windows 95...
...and I hope it got better.
Back then, a two-stroke bike with open muffler screeching by my window would cause a mess on my dictated text. And my ears.
Really, no kidding.
It has been Bill Gates's dream since forever (make a computer more user-friendly with an easier UI), and you wonder why they (at MS) never got a shot at implementing this in any Windows version since then. IBM even tried to do this back then too, and they dropped it rather inconspicuously.
I would take this version with a pinch of salt, like all previous versions. And try this with a USB-powered microphone-earphone (where the system recognizes the mic-earphone as a completely separate sound card), because these tend to sample your voice at very high rates, like 11kHz or 22kHz, which is way more than the 3kHz sampling rate allotted for cell phones.
I repeat, all those figures were important back then, they ought to be now.
And try it in a really, really quiet room.
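On the sampling-rate point, a quick sketch of the arithmetic in Python: audio sampled at a given rate can only represent frequencies up to half that rate (the Nyquist limit). The 11 kHz and 22 kHz entries are the nominal figures from the post above. The 8 kHz telephone entry is my own assumption about conventional narrowband telephony, not a number from the post, and the labels are made up for illustration.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency (Hz) representable at the given sample rate."""
    return sample_rate_hz / 2


# Nominal rates; the telephone figure is an assumption, see the note above.
RATES_HZ = {
    "telephone (assumed 8 kHz)": 8000,
    "headset at 11 kHz": 11000,
    "headset at 22 kHz": 22000,
}

for label, rate in RATES_HZ.items():
    print(f"{label}: captures up to {nyquist_limit_hz(rate):.0f} Hz")
```

So a headset sampling at 22 kHz keeps speech detail up to about 11 kHz, which includes much of the hiss and sibilance that separates sounds like "s" and "f".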
Re: It's been around since Windows 95...
Actually, going by the people using IBM's ViaVoice it's still going great - even several Windows OSes later. It's a shame no one can buy a new copy of it as IBM exited the personal SR market, selling ViaVoice to Nuance. (And yes, you can guess what Nuance did once they owned ViaVoice... :( ).
[With a good, noise cancelling microphone, the background noise is less of an issue these days]
When you're an OS X user, you forget that Windows users have to pay extra for things like this...
I am not surprised you posted that as an AC.
Re: Trouble is...
Arrant assclown tosh. Dragon has to be purchased separately for Mac just as it does for PC. If the inbuilt voice recog in OS X was enterprise-quality dictation software, god knows Apple would shout about it, but it isn't, and they don't.
Only game in town?
While I have no idea which is better, Windows does provide built-in speech recognition and a speech recognition API. It's far from perfect, but so long as you get the position of the mouse grid memorized, it works just fine. Doesn't seem to have ever been updated from XP though, but it still works a lot better than Hey You! Pikachu ever did. [In regards to the claim that Dragon is the "only game in town."]
Re: Only game in town?
Microsoft Windows has had speech recognition as standard since Vista; on XP it was only with the Tablet Edition or included with some Microsoft Office editions, and with some third-party products - anyway, forget that. And the Windows 7 version is quite sophisticated for elements on screen that you can verbally "click" - although it works better with Microsoft applications, such as Internet Explorer, than with third-party products - and I didn't find a scripting mechanism.
Does anyone, including Nuance, know an easy way to use Dragon with Ubuntu? (many people do not trust Microsoft after the Vista con)
Re: Windows only?
"Platypus is an open source shim that will allow Dragon NaturallySpeaking running under wine to work with any linux x11 application."
There's more stuff like that described there, I don't really know what any of it means but there is stuff.
I found this review really hard to read, with the author constantly referring to the software as "DNS". I had to constantly remind myself that he was talking about speech recognition software, and not what your average techie would mean by that abbreviation.
The best just got better
I recently took advantage of a special offer to upgrade my Version 10 Preferred to the new V12 Premium, and I am very, very happy I did. DNS just keeps getting better and better. As the effects of my cerebral palsy continue to degrade my motor skills, the continued improvements in Dragon make it ever more valuable. The accuracy out of the box on the new version is amazing, although it still struggles with "two/to/too". To have it recognise both my Kiwi accent and learn all the Hindi and Panjabi words I use is very impressive. For anyone with issues that make typing difficult, I definitely think DNS Premium is worth the money. There is no reasonable comparison possible with the built-in speech recognition in Vista mentioned above, that's for sure, and Dragon's excellence means I have no inclination to see if Win 7's SR is any better than its predecessor's.
Tricks for Windows built-in speech may apply
Perhaps you're ahead of me here, but I found that for Windows' own speech recognition, I needed to train and then use pronunciations of "a" and "the" that rhyme with "hay" and "bee", which isn't what I normally say, and to invent a word pronounced "perry-odd" but spelled "period", whereas saying "period" made a full stop. So you could try sounding the letter w in "two", and make more use of the word "also".
I have a theory that as electronic speech recognition grows, pronunciation, accents and dialect will converge towards what a device will actually recognise, and finally all of us will talk like Siri. Or else like the Indian or Chinese equivalent, since that's respectively where call centres and our IT manufacturers are located.
Siri and OS X's speech recognition use servers running software developed by...
Nuance's free iOS Dragon Dictate app (http://itunes.apple.com/us/app/dragon-dictation/id341446764?mt=8) predates both the OS X dictation and Siri features by a year or two, so may well have been something of a testbed for their cloud-based speech recognition software.
Mac users therefore get the dubious joy of paying £200 or so for an offline version of the software with a nice GUI. The underlying engine is the same.
Dragon Dictating is so GOOD, I actually go and buy it
Living next to the big DVD consolidator called China, I usually pick up a few DVDs each time I visit NanNing, GuangXi Province.
However, the one software package I always buy in original sealed packaging is Nuance Dragon Dictating; mind you, I get it at trade prices.
Recommended - as is their Android App.