getting the words rather than the punctuation

How do you convince Dragon NaturallySpeaking to insert a word like "asterisk" instead of the punctuation mark? Or for example I often use the phrase "At one point" and it comes out "At 1."

Mike

mossey wrote:

How do you convince Dragon NaturallySpeaking to insert a word like "asterisk" instead of the punctuation mark?

In the Vocabulary Editor find the '*/asterisk' line and replace the Spoken form with something like "star symbol" or "asterisk symbol" and Add (You will need to then delete the original entry).

The word asterisk (not associated with the symbol) is already in the vocabulary.

mossey wrote:

I often use the phrase "At one point" and it comes out "At 1."

Add the whole phrase "at one point" to the Vocabulary.

Graham Hendry

www.itspeaking.co.uk

Thanks, Graham. I was hoping there was some general way to tell it to use the word rather than the symbol. After all, I may use any number of words such as "open quote" "left paren" etc. and want them to come out as words, and I don't want to define vocabulary for each (or change the existing vocabulary.. it would be a memory strain).

Likewise a more general solution to "At one point" is some way of telling it explicitly to use "one" and "point".. so I don't have to add special phrases every time the issue comes up.

Thanks very much for your help.

Mike

Perhaps a brief discussion of how DNS interprets your dictation with regard to symbols/words (apologies if this is "teaching granny ...").

The application hears your utterance (utterance = a string of phonemes between pauses) and then attempts to transcribe these as words or symbols (./point, "/open quote, etc.). In very many cases it has to decide between true homophones and an even greater number of near-homophones. This is done by reference to a probability matrix of pairs of "words" (a symbol is a "word" in the Vocabulary) and word triplets (bi-grams and tri-grams). The base Language Model originally installed by Dragon contains the probability of the occurrence of pairs and triplets determined from linguistic studies. You can refine the probabilities to represent your own writing style by analysing your typical documents and/or by correcting mis-recognitions in context (the bi-grams and tri-grams again).

To take the example of "point". If I say "one two point three four" as a single utterance (or if I have Formatting Options checked to allow pauses in numbers) the very strong probability is that I want "12.34" transcribed. If I say "I wish to argue against one point in your proposal" (again as a single utterance) the context is such that "point" will be transcribed spelled out.
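The "point" example can be pictured with a toy lookup. This is only a rough sketch in Python: the counts are invented for illustration, the function name is hypothetical, and nothing here is taken from Dragon's actual Language Model. It only shows the idea that the neighbouring words decide between "." and "point".

```python
from collections import Counter

# Invented counts of how often each written form appears between a given
# left and right neighbour. These numbers are assumptions, not Dragon data.
CONTEXT_COUNTS = {
    ("two", "three"): Counter({".": 40, "point": 1}),   # "one two point three four"
    ("one", "in"): Counter({"point": 30, ".": 3}),      # "one point in your proposal"
}

def pick_written_form(left, right, candidates):
    """Return the candidate seen most often between left and right."""
    counts = CONTEXT_COUNTS.get((left, right), Counter())
    return max(candidates, key=lambda form: counts[form])

decimal_choice = pick_written_form("two", "three", [".", "point"])
prose_choice = pick_written_form("one", "in", [".", "point"])
print(decimal_choice, prose_choice)
```

With no surrounding words there is nothing to look up, which is the sense in which a homophone cannot be resolved without context.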

Put very simply - there is no way Dragon can distinguish between homophones without context information. For symbols such as "left paren" it becomes even more difficult if the user wishes to be able to have the same spoken form for both the words and the symbol in the same document. The answer is to use a spoken form for the symbol that you would not use spelled out (e.g. open-bracket, right-bracket) and possibly delete the alternatives from the vocabulary.

Certainly this involves a little initial "brain strain" but if you are using the two forms frequently enough for it to be a problem then you should quickly adapt.

If you don't add frequently used (and initially mis-recognised) phrases to the Vocabulary you are ignoring the single most important method of increasing accuracy in DNS.
Adding a phrase is very simple. When you have it correctly in your document just select it and say "Make that a Phrase". The phrase will be inserted into the Written Form of the Vocabulary and you just need to say "Click Add". [You can add up to 127 characters to the written and spoken form in this way]

To obtain high accuracy with NaturallySpeaking:

1. Speak in phrases or sentences and enunciate clearly.
2. Analyse typical documents for your writing style.
3. Add frequently used phrases to the Vocabulary (and back up your custom word/phrase file regularly using GetWords/PutWords)
4. When Dragon makes a mistake think why and develop strategies to avoid it (like re-naming symbols).

Graham Hendry

www.itspeaking.co.uk

What a hip granny you are! There ought to be a way to make this a "sticky" post so that people can find it without much effort.

Bruce

admin's picture

BruceCyr wrote:

What a hip granny you are! There ought to be a way to make this a "sticky" post so that people can find it without much effort.

Bruce

It has been added to the FAQ for Dragon.

As it is Hogmanay there is the possibility that Grandfather Graham might be identified as Granny Graham. Ignoring, that is, the long white beard. :-) And I won't be sucking eggs at midnight!

Graham

This explanation is OK, but it is not technically correct. The process is more complex than this and involves a step by step conversion of the analog input to digital; comparison to the Acoustic Model to produce a binary phonetic equivalent of words, phrases, and pauses based on the speaker's pronunciations (training & Vocabulary) and the settings in the DNS Options dialog for pause timing (i.e., how pauses are detected); the results are then analyzed using the Language Model for context. The language model consists of a complex set of Hidden Markov Models based on the precalculated probability coefficients derived from common usage. There are several models and a number of calculation algorithms used to do this, but essentially each branch of the bigram/trigram HMM trees is searched for the most likely match. Whether a bigram or trigram model is used is based on the length of each utterance preceding a pause. If an utterance consists of fewer than 5 words, the bigram model is used. If 5 words or more precede a pause, the trigram model is used. These models work by analyzing the right and left contexts for each three word (bigram) or five word (trigram) combination. Next the 9 best matches in order of probability are calculated and the match with the highest probability is selected for display (transcription).
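The last two steps of that description (choosing a model by utterance length, then keeping the 9 best matches and displaying the top one) can be sketched in a few lines of Python. This is an illustration only: the function names and the probabilities are made up, and the real search over the HMM trees is not modelled at all.

```python
import heapq

def choose_model(utterance_words):
    """Fewer than 5 words before the pause: bigram; otherwise trigram."""
    return "bigram" if len(utterance_words) < 5 else "trigram"

def n_best(scored_hypotheses, n=9):
    """Return up to n (text, probability) pairs, highest probability first."""
    return heapq.nlargest(n, scored_hypotheses, key=lambda pair: pair[1])

# Invented hypotheses and scores for one short utterance.
hypotheses = [("at one point", 0.41), ("at 1.", 0.47), ("at won point", 0.02)]

model = choose_model("at one point".split())
ranked = n_best(hypotheses)
transcription = ranked[0][0]   # the top-ranked match is what gets displayed
print(model, transcription)
```

In this made-up case the short utterance falls under the bigram rule and the wrong candidate narrowly wins, which is the situation the original question describes.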

Therefore:

1. What you train and how you train affects accuracy.

Training individual words, even in the Vocabulary Editor, has no impact on accuracy because all the words contained in the active vocabulary (160,000 words loaded at runtime) and/or the Background vocabulary (what is stored in C:\Documents and Settings\All Users\Application Data\ScanSoft\NaturallySpeaking8\Data\Enx\enx_aus_general_large) have a fixed probability that does not change regardless of training. Changing the probability coefficients of these words would result in accuracy degradation on a scale that would render DNS useless. Suffice it to say that this is necessary for the proper recognition of words. Remember that the training script makes reference to "...words by themselves AND in the context of other words." What this means is that there is a distinction between the handling of individual words that can be modified only by training words in context.

Training words in context alters the context probability coefficients because this is where DNS distinguishes individual words by the context in which they are used, and this is what the AND means in the script phrase above. So if I just say "an" and get "and" it is because in the vocabulary and in common speech "and" usually occurs before "an". Training these two words individually will not change this probability. But training the phrase "He had an awful cold." will increase the probability that "an" will be correctly used in this context. This is why analyzing documents increases the accuracy of words used in context based on how the user writes/dictates.
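As a rough picture of why context training helps, here is a toy pair counter in Python. It is an assumption-laden sketch: the class and method names are invented, and the simple relative-frequency estimate merely stands in for whatever DNS really computes.

```python
from collections import Counter

class ContextCounts:
    """Toy word-pair statistics built from analysed text (hypothetical)."""

    def __init__(self):
        self.pairs = Counter()    # (word, next_word) -> count
        self.firsts = Counter()   # word -> times it starts a pair

    def analyse(self, text):
        words = text.lower().replace(".", "").split()
        for first, second in zip(words, words[1:]):
            self.pairs[(first, second)] += 1
            self.firsts[first] += 1

    def probability(self, first, second):
        total = self.firsts[first]
        return self.pairs[(first, second)] / total if total else 0.0

model = ContextCounts()
model.analyse("He had and then left")
before = model.probability("had", "an")
model.analyse("He had an awful cold.")
after = model.probability("had", "an")
print(before, after)
```

Seeing "an" after "had" in an analysed phrase raises the estimate for that pair, whereas nothing in this sketch changes when a single word is trained on its own.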

It is important to remember that how we say words and phrases during dictation is unique and distinctly different from how we say them when we train them. For example, when you dictate the phrase "Training individual words, even in the Vocabulary Editor, has no IMPACT on accuracy..." and it is transcribed as "Training individual words, even in the Vocabulary Editor, has no IMPORT on accuracy...", and you say "Correct has no IMPORT on accuracy", we all pronounce that phrase differently when taken out of the context of the entire sentence. In addition, we virtually never say the same thing in the same way twice anyway. Therefore, when correcting this phrase to change "IMPORT" to "IMPACT", if the playback in the Spell dialog is clear and correct, retraining the phrase alters the acoustic equivalent and, done too often, can result in eventual corruption of the words in that context. Multiply this by hundreds of corrections per day and it is easy to see why some user files eventually become significantly corrupted.

On the other hand, if there is no playback, the user has to make a choice. Did they say it correctly and clearly or not. If in doubt, then train it. No training results in the continued link between the original spoken utterance and the correction. If this is, or was, incorrect, then the correction will be associated with an incorrect acoustical utterance. Training in this context does not hurt because it will only show the one phrase, not the corrected and incorrect words or phrases, and training will create a correct, even if slightly different out of context pronunciation, acoustical/word or phrase link. In addition, no training will result in a correction that will not be saved for later use by the Acoustic Optimizer. Only words/phrases trained during correction or during additional training are recorded for update by the Acoustic Optimizer. Remember that the key word is Acoustic, not Language relative to the Acoustic Optimizer.

2. General training should be done once.

It is only necessary to perform the general training one time. If accuracy, which should be tested before performing any additional training, is not at least 97 to 99%, performing additional training is not likely to improve it and, in many cases where the accuracy is below 90% after initial training, will likely degrade it. The reason for this is that the hardware and/or user speech are the primary cause. In short, check your pronunciation and listen to yourself speak via playback (we sound significantly different when listening to a recording of our speech than we do otherwise), and check your hardware. In most cases it is the combination of microphone, system performance and soundcard/input device that is at fault. You should never have to perform more than two general trainings, and a second training is only viable on rare occasions when you have changed the way you speak or your hardware changes, AND when your initial training has produced high initial accuracy. Beyond this, proper correction and enunciation will produce the best results in improving accuracy, and/or in maintaining it.

Keep in mind that performing initial training several times before using DNS is a waste of time. The reason is that the general training does not set the Acoustic Model based on words; it listens for and sets the user's pronunciation for each of the 40 or so base phonemes in the given DNS Language version. This is why the content of the training script does not matter as long as it takes approximately 5 minutes or more to read. The Adaption process then matches these to the Active Vocabulary pronunciations to form the user's Acoustical Model (i.e., how the user pronounces the phonetic representations that make up each vocabulary word).

3. The original question.

First, be careful how you deal with punctuation that you want spelled out. Adding anything to the Vocabulary that conflicts with current word forms, including punctuation, can cause sporadic misrecognitions. This includes using the Alternate form or the spoken form.

Second, in most cases you should be able to get the desired result by analyzing documents for your writing style in which you use such phrases frequently. One of the more experienced users who monitors this group uses a technique that is very creative. He creates documents using his list of most frequently used/misrecognized words/phrases in the context of a written document and used many times, then repeatedly has DNS analyze that document.

Third, add commonly used phrases to the Vocabulary. For example, if you want "At one point" to be displayed that way, try entering it first as At one point with no special spoken form or Alternate form. If that works, use it that way. If not, try entering it as it is generally recognized (i.e., the Written form as At 1., the Spoken form as At one point, and the Alternate form as At one point). This should work, but the rule of thumb is add only what you need, do it the simplest way, and avoid conflict with other vocabulary entries.

Lastly, Spoken forms work best when they are entered as quasi phonetic or phonetic equivalents. For example, if you want AKG displayed, set AKG as the Written form, Ae-Kay-Gee as the Spoken form, and AKG as the Alternate form (just to be sure it gets displayed the way you want).
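To keep the three forms straight, it can help to think of each Vocabulary entry as a small record. The Python below is a hypothetical model for illustration only (the class, field and function names are invented, and this is not how DNS stores its vocabulary). It also flags the kind of conflict warned about above, where two entries share the same spoken form.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class VocabEntry:
    written: str                      # what appears in the document
    spoken: Optional[str] = None      # what you say; defaults to the written form
    alternate: Optional[str] = None   # Alternate form, if any

def spoken_conflicts(entries):
    """Return spoken forms claimed by entries with different written forms."""
    seen = {}
    conflicts = set()
    for entry in entries:
        key = (entry.spoken or entry.written).lower()
        if key in seen and seen[key] != entry.written:
            conflicts.add(key)
        seen.setdefault(key, entry.written)
    return sorted(conflicts)

original = [
    VocabEntry("*", spoken="asterisk"),
    VocabEntry("asterisk"),
    VocabEntry("AKG", spoken="Ae-Kay-Gee", alternate="AKG"),
    VocabEntry("At one point"),
]
renamed = [VocabEntry("*", spoken="asterisk symbol")] + original[1:]

print(spoken_conflicts(original))
print(spoken_conflicts(renamed))
```

Giving the symbol its own spoken form, as suggested earlier in the thread, removes the clash in this toy check without touching the plain word.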

Another final rule of thumb. Don't blame DNS until or unless you are absolutely sure that the problem is not a result of PEBCAC (Problem Exists Between Computer and Chair) ;-). That is said tongue-in-cheek and not intended to offend. It is a simple technical support joke that recognizes that most problems are solved at home.

Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS with Lernout & Hauspie (L&H)

Chuck Runquist wrote:

This explanation is OK, but it is not technically correct.

I must apologise if my lack of technical correctness caused any misunderstanding of the complex processes involved in transcription using NaturallySpeaking. I was attempting to curb my usual tendency to go into too much detail without answering the question. :-)

In defence of my lack of detailed knowledge of the internal workings of NaturallySpeaking I must say that it is not for want of trying. Having been involved, as a VAR, since the launch of DNS 0.9 (and before that with Dragon Dictate) the ONLY detail has come from Chuck in posts on this and other forums. In spite of many requests this sort of information has not been given by Dragon Systems, L&H and ScanSoft (the jury is still out for Nuance).

Once again Chuck thanks for more gems to add to my personal understanding of the workings of the Dragon beast. Come midnight I'll raise a beaker to you in thanks and anticipation of many more (sorry it's only a virtual Laphroaig on your side of the pond).
Graham

IT Speaking wrote:

I must apologise if my lack of technical correctness caused any misunderstanding of the complex processes involved in transcription using NaturallySpeaking. I was attempting to curb my usual tendency to go into too much detail without answering the question. :-)

Graham,

No need to apologize. Your explanation was adequate and very close. The reason that I added my two cents is that DNS 8 is a major revision. Technically, it is equivalent to going from DNS 3 to DNS 4. New features were added, and those that have been around since version 4 were enhanced and improved to the degree that it is more important than before to understand them more clearly. DNS 8 actually minimizes the propensity for user mistakes to cause more serious problems as in past versions, but must be understood more clearly in order to take full advantage of the changes.

I simply filled in the gaps. You have nothing you need to apologize for.

IT Speaking wrote:

In defence of my lack of detailed knowledge of the internal workings of NaturallySpeaking I must say that it is not for want of trying. Having been involved, as a VAR, since the launch of DNS 0.9 (and before that with Dragon Dictate) the ONLY detail has come from Chuck in posts on this and other forums. In spite of many requests this sort of information has not been given by Dragon Systems, L&H and ScanSoft (the jury is still out for Nuance).

Once again Chuck thanks for more gems to add to my personal understanding of the workings of the Dragon beast. Come midnight I'll raise a beaker to you in thanks and anticipation of many more (sorry it's only a virtual Laphroaig on your side of the pond).
Graham

Thank you for the kind words. My two cents is given because the assumptions made by Dragon Systems, L&H, and ScanSoft, now Nuance, about what users, particularly new users, need to know are in need of updating.

Your two cents is equally valuable and important to this process. My goal is to help those with more understanding and knowledge of DNS, like yourself, become more effective in helping others. I can see by your original post that we are all getting better at the SR game. It seems to be working.

Chuck

Is there some place you can send me, a newbie, to learn step-by-step how to train on phrases? And I guess on words. For instance, I don't know what the "Alternate form" refers to. Apparently, it's a term of art.

Another confusion: when I get the training dialogue box, I don't know what "written form" vs. "spoken form" refers to. Say that I just said "Ayn Rand" and what appeared was "Ein rant." I select it, go to correction, and want to train it. The written form comes up as "Ein rant." Do I change this to "Ayn Rand"? Do I type in "Ayn Rand"?

I haven't read the whole manual, but skimming it didn't result in my finding this.

Thanks, in advance,
Harry

Quote:

How do you convince Dragon NaturallySpeaking to insert a word like "asterisk" instead of the punctuation mark? Or for example I often use the phrase "At one point" and it comes out "At 1."

Mike

The trick I use whenever I want to write the punctuation word(s) instead of the punctuation symbol(s) is to simply dictate the plural of the punctuation I want written i.e. periods, commas, question marks, colons -- and then I say backspace to remove the letter "s" -- not sophisticated, but it works.

Hope this helps,

-Coop
