(Very) Basic Training
Thanks for the informative comments by Lunis, Chuck, and others on the thread regarding the pros and cons of doing additional training.
After having read and reread the postings and the book excerpt, I have a couple of very basic questions and comments:
01. Since microphone and hardware configuration come up repeatedly in discussions about training for accuracy, I am curious to know what level of sound quality experienced users typically expect to achieve after running the Audio Setup Wizard.
I am currently using an Andrea NCS-700 headset microphone through a Buddy USB G6 pod on a Toshiba Satellite A215 with 3 GB DDRAM and AMD Turion 64 x2 at 2GHz each, running Vista SP1 with Ready Boost activated.
With this set-up, I typically get a 16 or 17, and DNS will accept a 15, without flashing the warning screen that sound quality is unacceptable and will adversely affect voice recognition performance.
Do experienced users get in the high teens or even low 20s?
And if so, is that attributable to higher-end hardware, superior diction/enunciation, better input volume level, etc.?
For instance, when I finish the Volume Check, I find it works best to have the indicator bar about in the middle of the grid.
Is there any evidence to suggest that adjusting the microphone position or speaker's voice volume to register somewhere higher or lower than the midpoint of the scale makes any difference in ultimate sound quality and/or voice recognition accuracy?
Interestingly, when I am reading the Quality Check text, if I speak some random word substitutions, I can actually increase the quality score by a few points, as compared to when I read the words exactly verbatim.
This may be just a curiosity, but it does raise for me the question of how important the sound quality number really is, if the number can be increased by purposely mismatching the spoken sound bite with the displayed text.
It may be that the actual number is less important than whether DNS deems the overall sound quality to be acceptable or not (i.e., is this really a pass-fail grading system?)
02. What criteria should be used to determine whether to merely train a word (or, as recommended, a phrase) vs. also adding it to the vocabulary?
That is, is it better to just train a word/phrase by using the Correct function on the fly or the Train function under Words, or is it better to train and then add the word/phrase to the vocabulary using the Vocabulary Editor and/or the Add Words to Vocabulary from Documents function?
Specifically, does sequentially adding misrecognized words (with their correct pronunciations attached) to the vocabulary increase speed and recognition accuracy in the future, or does it just clutter up the vocabulary with matched strings of sounds/text that may or may not be used frequently enough to justify their inclusion in the vocabulary?
My inclination is to add to the vocabulary only those "pet phrases" that will likely be used often enough to justify their inclusion in the "short list" of things that I am likely to say on a repeated basis in the future.
And to do just the training-only functions on mis-recognized words/phrases that I am not likely to use again.
Theoretically, by adding frequently-used terminology to the vocabulary (and by deleting unused or troublesome entries from the vocabulary), I should be able to "shape" my user vocabulary to represent more of what I am likely to say, and less of what I am unlikely to say in the future.
It is my understanding that the Medical Large vocabulary contains a static 160,000 entries.
So, it would seem that the faster that I can add words and phrases to my user vocabulary, the faster the unused entries will be bumped into the inactive vocabulary archive, which should shorten search-times and improve speed/accuracy.
However, if this assumption is wrong, I would value the guidance of more experienced users on how to build a more functional vocabulary and voice file, since it seems that a lot of time and effort can go into training that is of little benefit at best and harmful at worst.
Thanks for any insights that anyone can provide.
Mark



PS
I am running DNS v9.5.
Sorry to have neglected to include that piece of info, since it may well affect what to expect from training.
Mark
I'm no expert but I've been around a while...
Re: 01...As you guess, it's more of a binary than a continuous system. I get 18 with the Sennheiser ME3, which is considered one of the best microphones available for SR.
What you are probably doing when you substitute random text is pausing more between utterances, which seems to increase the score. Speaking louder than the level you trained in the initial phase also seems to increase the score. I don't think I've ever produced a value larger than 21-22.
When I say words or phrases that aren't in the vocabulary, I add them to the editor, once I've gotten the correct text, by saying "make that a phrase" -- seems to work okay.
You can try training if the recognizer constantly gets it wrong but it's in the vocabulary, but I think this is not an especially productive exercise. Finesse in adding phrases or words is probably the most efficient way to get the correct text out, especially by using the written/spoken form template.
To tell you the truth, I wouldn't worry too much about a global strategy -- users who try to overthink the product tend to drive recognition into the ground. Just attend to the details and let the product do its thing. It seems to be the smarter, more technical types who are more prone to overthinking, which is of course how they usually get ahead.
HTH,
Bruce
Thanks
I agree with and will take to heart your comments about not over-thinking this VR process.
I just want to avoid having to re-do my user files because of corruption that could sneak in through counter-productive training or vocabulary additions.
I always regret not reviewing my comments before posting them, and that's the case with my "overthink" rant. In daylight, I've reviewed your questions and they seem reasonable and not indicative of overthinking -- you are asking the questions most of us asked.
But I will re-iterate that training is not a productive way to increase accuracy. More important is to dictate relatively long stretches of speech so that DNS' algorithms can have more power via deeper analysis. Similarly, when you correct an error, select one or two words on either/both side(s) of the error in order to correct in context, i.e., try to select a "natural" phrase containing the error -- that's how DNS learns best.
Please don't hesitate to post more queries just because of one crank.
Bruce
No regrets
Bruce,
I did not take your comments to be a rant, in the least.
In fact, you helped me to "get back to work" on the project that I am theoretically using DNS to expedite, instead of getting sidetracked by the "nuances" and neatness of the voice recognition software, itself.
Sometimes, another's perspective helps to remind me of the distinction between the task and the tool, the end and the means, etc.
When I have completed my project, then I can reward myself with some more tweaking time, indulging my curiosity.
And thanks for your follow-up comments and suggestions.
Mark
Over Thinking it
We suspect you are using an ANC 700 headset
microphone rather than an NCS-700, which we've never heard of. We consider this
microphone to be on the lower end of the best microphones but it's certainly
good enough to use with NaturallySpeaking.
We don't think you'll appreciate any advantages from utilizing ReadyBoost because
NaturallySpeaking loads almost entirely into memory.
An Audio Setup Wizard score between 16 and 19 is typical for the ANC 700
microphone. The ANC 700’s Active Noise Cancellation lowers the overall score a
little bit but has no effect on the quality and the thing you need to
understand is the Audio Setup Wizard score has very little basis in reality so
forget about it. We have seen over-the-counter microphones score as high as 28
but that doesn't make them anywhere near as good as the ANC 700. What counts is
how many errors you get per 100 words and that's all that counts (aside from amenities
such as pricing, comfort, noise cancellation etc.). If you'd like to see how
your ANC 700 microphone compares to other microphones check out our Microphone
Comparison Matrix.
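For anyone who wants to put a number on "errors per 100 words", here is a minimal sketch in Python. It assumes you have the text you intended to dictate and the text the recognizer actually produced, and it counts word-level substitutions, insertions, and deletions with a standard edit-distance calculation. The function names (`word_errors`, `errors_per_100_words`) and the sample sentences are made up for illustration; none of this comes from NaturallySpeaking itself.

```python
def word_errors(reference: str, recognized: str) -> int:
    """Count word-level errors (substitutions + insertions + deletions)
    between the intended text and the recognized text, using the
    classic Levenshtein edit distance computed over words."""
    ref = reference.lower().split()
    hyp = recognized.lower().split()
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word dropped by the recognizer
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def errors_per_100_words(reference: str, recognized: str) -> float:
    """Scale the error count to a rate per 100 intended words."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference text is empty")
    return 100.0 * word_errors(reference, recognized) / n_words


if __name__ == "__main__":
    intended = "please plug the usb pod into the laptop right now"
    recognized = "please plug the usb pot into the laptop right now"
    print(errors_per_100_words(intended, recognized))
```

Running the same passage through two microphones and comparing the two rates is one rough way to compare them, provided the passage is long enough (a few hundred words) for the figure to be meaningful.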
You can also artificially raise your Audio Setup Wizard score by speaking in a
falsetto but it won’t improve your accuracy. Think of the Audio Setup Wizard score
as a pass or fail test.
The criteria as to whether you should train a specific word or add it to a
phrase depend on your usage. Only you can really answer this question. For
example, we found no advantage in training the word “pod” because
NaturallySpeaking simply didn't handle it well but we almost always use this
word with another word which is “USB”. When we added “USB Pod” to our personal
vocabulary that phrase became bulletproof. It's much faster to combine a word
with a phrase and add that phrase to your vocabulary, especially if you're
using KnowBrainer 2007, than to train it, but if the word is very common and can appear in too
many phrases, it would only make sense to resort to training. Just use your
best judgment. The DNS Medical vocabulary probably contains about 165,000 words
but we think you might be over complicating the issue if you think you're going
to add lots of words and phrases that will bump out some of the unused words. You
will need to add 10,000 words to your vocabulary before you can bump out the 1st
word. There are simply too many words and it's unrealistic to expect a
significant performance gain. This is one of the reasons why KnowBrainer
includes a 50,000 word vocabulary option. However, you would probably be
surprised at how many missing words we had to add back into the vocabulary. You're
working way too hard on this. We recommend letting NaturallySpeaking take care
of itself and when you run into an obvious vocabulary problem, deal with it via
the Vocabulary Editor as a phrase, training or possibly utilizing the Written
Form/Spoken Form. In some cases, even the special Properties button may apply.
Lunis Orcutt - Developer of KnowBrainer & Host of the
Http://www.TheMicrophoneStore.com
A Nuance Gold Certified Endorsed Dragon NaturallySpeaking Vendor/Trainer
ALWAYS Ask If Your Speech Recognition Vendor Is Nuance Certified
Overthinking
Lunis,
You are correct, the mic is an Andrea ANC 700 (I'm not sure where the NCS came from...)
Anyway, I will use it, along with your suggestions on not worrying about sound quality scores, continuing to add phrases via the Vocabulary Editor, and not sweating whether or not I can realistically affect the size/shape of my user vocabulary.
Thanks for your prompt response and link to your microphone comparison grid.
Mark
PS: In participating in these discussions, I am not sure whether it is preferable to find a related forum topic and post a question onto that thread (even though the last entry might be a year or more old), or to start a new forum topic.
Just want to be respectful of the experienced users' time and effort that goes into monitoring and responding to such queries.
Thanks again.
I'd suggest making a new topic. Nobody REALLY remembers the old topic from a year ago. As I found when I tried it recently, no one wants to reply, and there are too many offshoot tangents unrelated to the question. Just start fresh; the Internet is supposed to be unlimited anyway.
Thanks
Thanks, will do.
I just did not want to make it appear that I thought my questions should preempt ongoing threads.
Mark