DNS SDK custom application - Dragon Engine stops recognizing
We have developed a medical application that uses the Dragon client SDK for speech recognition and transcription. So far we have had a painful experience with the lack of Nuance/ScanSoft support and documentation. The Dragon part of our application has several issues; the biggest one currently is this:
The user launches the application, which internally initializes the Dragon engine and several Dragon controls, including the microphone control. We set the microphone to on, and the user begins speaking. At some point, sometimes right at the beginning, the engine goes deaf. The user is speaking, but no speech recognition is happening, and the corresponding Dragon edit control is not receiving the recognized words. No matter how long the user waits, the words he/she has spoken are not processed by the engine, even though the microphone control shows it is in an 'on' state.
We have supplied these users with machines that are very capable of running NaturallySpeaking. We are using Dragon 9 - Client SDK.
Please help!
As an aside, I was so happy to find an active forum discussing these issues with Dragon. Keep it up!
Eric



Thanks Eric!
Regards,
Skip
Eric,
First, all of the speech recognition forums, or at least 99% of them, are end user forums. It is very unlikely that you will find anyone with enough knowledge of the SDK to assist you, except possibly myself.
Second, although you say that you have speech-enabled your application, are you using a run-time engine or a full working copy of DNS to support your application? This is important because if you're using a run-time engine the answer is different than if you're using a full working copy of DNS.
If you're using a full working copy of DNS, speech enabling your application is a bit of overkill. I say this because it is very likely that your code is conflicting with the DNS QuickStart resulting in a double load of natspeak.exe. This will cause your link to the recognizer (MREC .dll) to lose focus.
If you're using a licensed runtime engine (i.e. DNS without the interface), then something in your code is causing the link to the recognizer to stop working.
Unfortunately, without more detail on how you have speech-enabled your application using the SDK, it is next to impossible to assess the nature of the problems that you're having. One thing that would help is a copy of your Dragon log. However, you shouldn't attempt to post all of the Dragon log, just the section that deals with initialization and any sections that deal with the recognizer itself.
In addition, if you purchased a full copy of the DNS SDK 9.0 client edition, you should also know that in order to get proper support you must purchase a support contract. DNS technical support is not going to be much help because they have no specific knowledge of the SDK. Only those who provide technical support for the SDK itself would be able to assist you with this problem.
Again, I'm only guessing at your situation and I don't have any knowledge of your code so I can't offer you any more than this without more details.
Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS
If you hear the sound of hoofbeats, think horses not zebras.
Law of Parsimony (Occam's razor)
Chuck,
Thank you for your response. I put up this post hoping for someone with Dragon SDK knowledge. I guess I lucked out!
Currently, customers that use our application use the full working copy of DNS. Our application simply uses interop assemblies with the Dragon controls to assemble a UI that fits into our application. Our custom application lets users speak and collects the speech no matter which application is active.
I have gleaned some more information about this problem in the last couple of days. Previously, we reported to DNS support (which, by the way, was the full SDK support that we purchased along with the DNS SDK) that the call to SnapshotSave was taking an extra long time. Support responded that a more recent version optimized the engine and should help. After waiting weeks for Dragon to get us an updated version (we couldn't auto-update because we have an OEM version), the new version didn't help at all. We have since multi-threaded our application so that it isn't as noticeable to the user that SnapshotSave is taking so long. However, I realized in the last couple of days that the current problem is due to the engine being busy with SnapshotSave while the user is attempting to dictate. It doesn't seem like the engine can do both a SnapshotSave and voice recognition at the same time. As soon as the call to SnapshotSave finishes, the microphone control switches to 'off' (we may be doing this in our code; I'm currently checking), and all speech prior to the completion of SnapshotSave is lost.
I hope this makes sense; it's incredibly difficult to describe. It is entirely possible that our use of the Dragon engine is not ideal. However, I have verified that when our custom application is loaded and DNS is running, there are not multiple instances of natspeak running.
I can include an outline of our application and the process of its interface with DNS, if it would help.
Thanks again for your help and for the help you are offering to all of the DNS users out there!
Eric
Eric,
I seem to recall that I saw this problem once before and that the solution was relatively simple. If I remember correctly, it did not have anything to do with the modules that you were using, but with the ability to locate them properly. I would have to go back through my notes and see if I can find this. However, your offer to provide the outline is greatly appreciated and would be most helpful. Sometimes seeing something triggers my memory about how the SDK works under certain conditions. If you can send that, or post it, I'll take a look at it and see if we can find a solution. Regardless, this is one case where I think the solution is simpler than it looks.
Chuck Runquist
Former Dragon NaturallySpeaking SDK & Senior Technical Solutions PM for DNS
If computers get too powerful, we can organize them into a committee - that will do them in. - Bradley's Bromide
Chuck,
Thanks for your reply. I sure hope this has an easy solution!
Here's some more background on our application. We use the DNS SDK ActiveX controls to let radiologists use voice recognition from within our product. We use the ActiveX controls via interop DLLs in our C# project. I understand that C# is not officially supported by the DNS SDK, so if this is the problem we will have to go in a different direction.
At any rate, here are the steps we go through in launching our application.
1) Upon the form loading we initialize (not register) the following controls:
a. Dragon Engine control.
b. Dragon Microphone control.
c. Dragon Dictation Edit control.
2) We check the list of speakers in the dragon engine against the authenticated user, if found we set the engine's speaker property to the user.
3) Next we register the Microphone control by calling its .Register method.
4) We set 2 compatibility modules in the engine via the following calls:
a. .set_CompatibilityModule(DNSTools.DgnCompatibilityModuleConstants.dgncompmoduleEditControlSupport, this.Handle.ToInt32(), false);
b. .set_CompatibilityModule(DNSTools.DgnCompatibilityModuleConstants.dgncompmoduleNatText, this.Handle.ToInt32(), false);
5) We set the EngineUI property of the engine to dgnenguiTrayIcon.
6) We call the engine's .Register method.
7) We call the .Register method on the Dragon Dictation Edit control, passing in the handle to the textbox we want to use.
8) We set the hWndActivate property of the Dragon Dictation Edit control to 0, so we can capture all audio regardless of whether our application has focus or not.
9) We turn the microphone control on.
...The user dictates into our application and when they are done they click our Save button which does the following:
10) The GUI thread spawns a background thread that does the following:
a. Immediately calls the SnapshotSave method of the Dragon Dictation Edit control, passing in both an indexed file name and a wave file name. Both of these paths are on the local (client's) hard disk. It is this call which can take anywhere from 30 to 90 seconds depending on the machine.
b. We read the indexed file and the wave file from disk and store them in a custom object. This object is written back to disk and a completely different thread reads this object and sends it to our server.
11) Our main GUI thread hides the form. If all goes well on the background thread then the form is closed.
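For readers following the thread, steps 10a and 10b can be sketched roughly as below. This is only an illustration under stated assumptions: the `snapshotSave` delegate is a hypothetical stand-in for the edit control's SnapshotSave call, `DictationPackage` and `SaveStep` are invented names, the file names are placeholders, and the sketch uses the modern .NET `Task` API rather than whatever threading the original application used.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical container for the two artifacts read back in step 10b.
public sealed class DictationPackage
{
    public byte[] IndexedDocument { get; }
    public byte[] Audio { get; }

    public DictationPackage(byte[] indexedDocument, byte[] audio)
    {
        IndexedDocument = indexedDocument;
        Audio = audio;
    }
}

public static class SaveStep
{
    // snapshotSave is a hypothetical stand-in for the control's SnapshotSave
    // call; it receives the indexed-file path and the wave-file path.
    public static Task<DictationPackage> RunAsync(
        Action<string, string> snapshotSave, string workDir)
    {
        return Task.Run(() =>
        {
            // Placeholder file names, not the ones the real application uses.
            string indexedPath = Path.Combine(workDir, "dictation.idx");
            string wavePath = Path.Combine(workDir, "dictation.wav");

            // Step 10a: the slow call, kept off the GUI thread.
            snapshotSave(indexedPath, wavePath);

            // Step 10b: read both files back into one object.
            return new DictationPackage(
                File.ReadAllBytes(indexedPath),
                File.ReadAllBytes(wavePath));
        });
    }
}

public static class SaveStepDemo
{
    public static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);

        // Fake save that just writes two small files.
        Task<DictationPackage> job = SaveStep.RunAsync((idx, wav) =>
        {
            File.WriteAllText(idx, "text");
            File.WriteAllBytes(wav, new byte[] { 1, 2, 3 });
        }, dir);

        DictationPackage package = job.Result; // blocking is for the demo only
        Console.WriteLine(package.IndexedDocument.Length + " / " + package.Audio.Length);
        Directory.Delete(dir, true);
    }
}
```

Returning the task gives the caller something to hold on to, so other parts of the application can find out when the save has really finished instead of assuming it has.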
It is at this point that, if the user immediately launches our application again, the microphone button looks like it is on and the engine looks like it is recognizing the user's voice, but after 10 - 30 seconds the microphone button switches to 'off' and all of the speech prior to that point is lost. On this second launch, we carry out the same steps as listed above.
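If the earlier guess in this thread is right, and the engine cannot run a SnapshotSave and recognize speech at the same time, one possible workaround is to have a relaunch wait for the pending save before turning the microphone on. The sketch below shows only that pattern. `SaveGate` is an invented helper, `Thread.Sleep` simulates the slow save, and nothing here calls the Dragon SDK. Whether this cures the symptom has not been confirmed by anyone in the thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: tracks the most recent background save so that a
// relaunch can wait for it before enabling the microphone.
public sealed class SaveGate
{
    private readonly object _lock = new object();
    private Task _pending = Task.CompletedTask;

    // Queues the save on a background thread and remembers it.
    public Task BeginSave(Action save)
    {
        lock (_lock)
        {
            // Chain after any earlier save so two saves never overlap.
            _pending = _pending.ContinueWith(_ => save(), TaskScheduler.Default);
            return _pending;
        }
    }

    // Returns true once no save is running, false if the timeout expires first.
    public bool WaitUntilIdle(TimeSpan timeout)
    {
        Task pending;
        lock (_lock) { pending = _pending; }
        try { return pending.Wait(timeout); }
        catch (AggregateException) { return true; } // the save failed, but it is over
    }
}

public static class SaveGateDemo
{
    public static void Main()
    {
        var gate = new SaveGate();
        gate.BeginSave(() => Thread.Sleep(500)); // simulated slow save

        // Relaunch path: keep the microphone off while the save is running.
        bool idle = gate.WaitUntilIdle(TimeSpan.FromMilliseconds(50));
        Console.WriteLine(idle ? "microphone on" : "still saving, microphone stays off");

        idle = gate.WaitUntilIdle(TimeSpan.FromSeconds(5));
        Console.WriteLine(idle ? "microphone on" : "still saving, microphone stays off");
    }
}
```

In a real form the wait would belong on a background thread, or be replaced by a "saving previous report" indicator, so that the GUI thread is not blocked for the 30 to 90 seconds the save can take.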
I'm trying to spare any cumbersome details here, so please let me know if you have any questions.
Thanks again!
Eric