Language multiplicity in version 9
LANGUAGE MULTIPLICITY IN DRAGON NATURALLYSPEAKING VERSION 9
I have finally installed Dragon NaturallySpeaking, version 9, for the following languages: English US, English UK, Spanish, French, Dutch, and German. English and Spanish were installed from the Spanish CD version; French, Dutch and German were installed from the Dutch CD version. The bottom line is that all of these languages work quite well, and switching between them is relatively quick, though far from ideal. Of course, everything must be evaluated in comparative terms. Compared with the way Windows Vista is designed for language switching, DNS is definitely the better alternative.
Although the final result is satisfactory, the installation process has not been without problems. While the Spanish version installation made it possible to select the desired components for installation, the Dutch package simply went ahead without offering any selection, installing all of the components, including all of the English language varieties, which I don't need at all. In addition, while activation was required after the installation of the Spanish version, none was required after the installation of the Dutch package. There was no registration process either.
The installation of individual languages/modules hasn't been without problems either. Having installed and trained the German module, I eventually had to uninstall it completely because it simply didn't work. Only after a second installation, without any training, did it install and work properly.
Importing previous profiles didn't work either. Each time I tried to import an English-language profile, I was asked to upgrade the profile to the new format and then I was asked to select between Spain's Spanish and Latin American Spanish. As a result, I couldn't import any of my previous English profiles, which, although not a great loss, is still a waste of time because of older vocabulary items I need to incorporate in the new version.
All in all, the architecture of multiple languages does work eventually, but untrained users may give up in the face of so many difficulties. It is still a far cry from what could have been expected, and naturally remote from any desirable architecture of multilingualism. I would like to insist that language multiplicity is not equal to multilingualism, and that DNS version 9 is at best an application allowing language multiplicity.
Itamar Even-Zohar



Itamar,
Multilingualism, as you describe it, would take an entire year of dedicated R&D to accomplish programmatically. The problem is extremely complex from an SR standpoint.
The priority on something like this is so low that you're not going to see it for at least several years. That Nuance has acquiesced to solving part of the problem is significant in and of itself.
Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS
Itamar
Itamar Even-Zohar
itamar@even-zohar.com
Chuck,
Of course you are right; I know it is not a simple matter, although Philips had it years ago in FreeSpeech 2000. DNS still offers better multilinguality (not multilingualism) than Windows Vista. My comments were only meant to say that Ver. 9 is less versatile than Ver. 7 in installation and other respects.
Itamar Even-Zohar
Hello Chuck,
Your answers again raise some questions:
1. Why is this so extremely difficult?
2. Why is this not important enough to tackle?
I personally would think speech recognition in more languages is extremely important, both for many users and for the firm that sells the product. The design for integrating the different language versions should be the basis of the whole product. If it is not already, I think Nuance would be wise to work on this pretty soon.
My take, greetings, Quintijn
Quintijn,
The reason that this is so extremely difficult is that switching between languages requires extensive modification of the MREC engine. It is simply not the same as switching users; in this case you are switching languages. To highlight just one of the problems: when you do something as extensive as switching languages on-the-fly, the required revisions in the code have a tremendous potential for breaking something. I'm not going to get into details, but the first thing that would have to be done would be to completely rewrite the SDK. All application development is based on the SDK; it is written first, then the GUI. The rewrite of the SDK to accommodate switching languages on-the-fly would take at least 6 months to a year and simply is not feasible, which answers your second question, at least in part. Nevertheless, and to reiterate, switching between users is a relatively simple process, switching between the languages (on-the-fly) increases the amount of code, means a rewrite of a vast majority of the SDK, and increases the propensity for something to end up broken across all DNS functionality. Writing SR applications is not like writing a program like Microsoft Word. It is a very, very complex process. There is no more difficult program to write than SR, with the exception of a compiler, such as one for C++.
The other basic reason why this is not important enough to tackle at this time is simply that the demand for this type of integration ($$$) does not justify the R&D costs. Whether users like it or not, companies are in business to make money, not lose it. Having talked with development about this issue, I can say the bottom line is that the amount of work required to develop this aspect of multilingualism, as Itamar has defined it, would not be paid for in sales. If and when the demand increases to the point where this becomes viable, Nuance will make the appropriate investment. At this particular point, this is simply not a money maker. It would cost more to create what Itamar wants than Nuance would gain from the sales. Users have to remember that, from a marketing and sales perspective, this is a gamble. You may be right. There may be increased demand in the future. However, if your bottom line were influenced by the amount of money that you would have to put out vs. the amount of money that you would get back, and the latter were neither guaranteed nor very probable at this point in time based on demand, would you risk your investment? I don't think so. Users are very free with Nuance's money. But if Nuance loses, they don't. It's important to keep this in mind. I was involved in this kind of decision making during my tenure as SDK PM. I may not agree with this kind of decision making in all cases, and in fact I didn't, but sometimes the bean counters are correct in their assessment of cost vs. benefit. I think in this case, at this particular point in time, they're correct, even though I tend to agree with you. We'll just have to wait and see.
The vast majority of users of DNS are not interested in multiple languages or multilingual switching. Nuance was gracious enough to acquiesce to my request on behalf of users like Itamar to make multiple language versions installable on the same system. The reason that this was done is that there is a market for multiple versions (Legal & Medical) installable on the same system. Because the process for installing multiple versions and multiple languages is the same, this was feasible from both a programming perspective and a marketing perspective. In other words, $$$.
Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS
Itamar
Itamar Even-Zohar
itamar@even-zohar.com
Chuck,
Perhaps Nuance are right in assessing that the market for real multilingualism is currently relatively small. However, it would be shortsighted of them not to understand that they can capitalize on their advantage in the field of languages. Microsoft now engages in a full-speed creation of SR for more languages. English US, English UK, French, Spanish, German, Chinese and Japanese are already supported by Windows Vista. I have tested the Spanish and the French modules and found them excellent (indeed, better than the English). However, the trouble is that the way Vista provides for switching between languages is ridiculous. No one can really take it seriously. Therefore, had I been Nuance, I would have advertised my product as 'the only SR product on the market that allows multilingual use'. I would invest in improving this feature.
Markets can be analyzed from two different perspectives: the current state and the possible future states. Nuance is strong enough to analyze future markets. In the European Union alone (as Quintijn would certainly confirm) there is a huge need for multilingualism. If Nuance sold only to the people at the headquarters in Brussels, it would cover the entire investment of making DNS fully multilingual.
Itamar Even-Zohar
Chuck wrote: "Nevertheless, and to reiterate, switching between users is a relatively simple process, switching between the languages (on-the-fly) increases the amount of code, means a rewrite of a vast majority of the SDK"
Hi Chuck,
My experience is that switching between users is as easy with users of different languages as between two English users. So I do not understand that part of your argument.
Also, integrating core NatSpeak across languages would eliminate very annoying errors. One example is the very annoying "choose n" problem, which still exists in the Dutch version 9. It was fixed in the English version 7. (The spell window lacks the Dutch equivalent of "choose n".) That is a shame, and it could be fixed by proper integration of all the language versions.
As to the other issues I completely agree with Itamar.
Greetings, Quintijn
PS most "unimacro" grammars are integrated for multilingual use. Nuance could learn from that!
Quintijn,
Sorry for delaying my response to your post. However, here's the problem. Itamar distinguishes between multilingual and multilanguage as being the difference between being able to dictate in multiple languages at the same time (multilingual) vs. being able to switch between languages (multilanguage).
The latter is easy, and it is what Nuance has done by virtue of correcting the problem with the installer (MSI) files. This allows multiple languages and multiple flavors of DNS to be installed on the same system. In a post in another location, Itamar raises an interesting point when he asks why the Dutch version did not prompt for activation or registration. This is because when you install multiple languages and/or multiple flavors of DNS on the same system, under most conditions a second installation is treated as an update/upgrade, and the serial number and activation of the previous installation are acknowledged by the installer (DNS MSI) files.
However, you have to keep in mind that being able to dictate in several languages using the same user would require multiple vocabularies and multiple language models. While this could be done, it would increase the amount of memory that DNS would have to use in both storing and switching between the various language models and vocabularies to the point of being almost prohibitive for the average user. In order to be able to dictate in French and then switch to English with the same user, both the French and the English vocabularies and language models would have to be stored in memory, and the DNS recognizer (MREC engine) would have to be able to recognize both languages simultaneously. This would be a monster programming task requiring at least triple the amount of RAM currently being used by DNS, as well as slowing the whole process down while DNS attempts to determine which language is being used. In short, there are limitations that would have to be overcome both in terms of the way the software is written and the way the hardware handles it. It is not as easy as you assume and not particularly practical given the amount of work, resources, R&D time, cost, etc. It's just not practical or feasible to even consider doing this at the present time. In addition, there simply is no market for this approach. Yes, it would be nice for people who really need it. However, how many people really need it right now? If Nuance were to engage in this kind of revision to allow this kind of multilingual capability, you would probably end up paying $10,000-$20,000 for it.
Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS
Chuck,
Thanks for the common sense grounding to this issue -- if I had a lick of sense I would have realized that myself!
It does seem to me that the SR makers could accommodate the unique needs of the pure multilingualizers (!) simply by allowing them to load as many languages as desired and switch among them with commands like "Switch to X-speak". It would be too slow to require current systems to distinguish the correct language automatically, yet this option would allow the multilingualizer to switch the recognition context explicitly without much of a processing penalty.
Seems to me this would be a natural extension of the Pro model, although it should allow each additional language pack to be bought at non-Pro prices -- perhaps a multilingual option to be purchased at an intermediate price between the Pro and non-Pro versions with just the language switching command capabilities.
This solution imposes a minimal procedural penalty on the multilingualizer and puts the onus on him/her in terms of equipping a system with sufficient resources -- primarily, I would guess, memory.
Bruce
Bruce,
Now, what you suggest would be feasible and would not require a whole lot of work. Actually, this could be done similarly to switching users.
For example, it is difficult to write a macro to change users on-the-fly because the macro would have to be modified for each user installed, and users are not fixed. That is, they are dynamic (dependent upon each DNS user created). However, it would be practical and feasible to change languages on-the-fly transparently, similarly to the way that you change modes. This would be feasible because it could be written as a simple natural language command. It would be practical because it would only need to detect the languages installed and could be as simple as changing modes. You can currently say (i.e., in DNS 9) "normal mode", "spell mode", etc. With regard to languages, if DNS detects the languages currently installed, you could have a simple command that would (on-the-fly) transparently switch between languages simply by issuing, for example, the command "Spanish". DNS would detect that there is a Spanish user, switch to that user, and conclude with a TTS playback, in that language, confirming that the user is ready.
There would obviously be a pause while the users are being switched, but just like changing modes, you would be able to use whatever language you wanted simply by issuing the command for that language.
Obviously you would have to have some traps, such as telling the user that a specific language/user is unavailable, but this would generally be very practical and doable given the current level of technology (SR software/hardware).
Chuck Runquist
Former DNS SDK & Senior Technical Solutions PM for DNS
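The command-driven switch described in the post above (say the language name, load the matching user, confirm by TTS, trap unavailable languages) might be sketched roughly as follows. This is only an illustration under stated assumptions: the LanguageSwitcher class, the profile names, and the announce callback are all hypothetical stand-ins and do not correspond to any actual DNS or SDK interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class LanguageSwitcher:
    """Hypothetical dispatcher: one spoken language name -> one user profile."""

    # Spoken command (lower-case language name) -> installed user profile name.
    profiles: Dict[str, str]
    # Stand-in for the TTS confirmation; prints by default.
    announce: Callable[[str], None] = print
    active_language: Optional[str] = None
    loaded_profile: Optional[str] = None

    def handle_command(self, spoken: str) -> bool:
        """Switch to the requested language; return False if it is unavailable."""
        language = spoken.strip().lower()
        profile = self.profiles.get(language)
        if profile is None:
            # The "trap": tell the user that this language/user is unavailable.
            self.announce(f"No user is available for {spoken.strip()}.")
            return False
        if language != self.active_language:
            # Placeholder for unloading the current user and loading the new
            # one (the step that causes the pause mentioned in the thread).
            self.loaded_profile = profile
            self.active_language = language
        self.announce(f"{language.capitalize()} is ready.")
        return True


if __name__ == "__main__":
    switcher = LanguageSwitcher(profiles={"english": "user-en", "spanish": "user-es"})
    switcher.handle_command("Spanish")   # switches and confirms
    switcher.handle_command("Italian")   # trapped: no such user installed
```

The point of the sketch is that the command set depends only on which languages are installed, not on individual user names, which is what makes it closer to a mode switch than to a per-user macro.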
This has been an interesting thread to follow. Indeed, it has been fascinating to follow Itamar over the years in advocating for SR multilingualism. Parenthetically, I should note that he uses "multilingualism" to mean, I think, the ability to dictate any language at any time in any application; whereas "language multiplicity" seems to mean the ability to switch under command from dictating in one language to another language.
Basically both Nuance and MS are saying that even if SR manufacturer A had captured the entire SR market for multilingual users, the additional revenue to be gained by manufacturer B by taking that entire sales volume would not be worth the additional effort. In fact, it probably wouldn't amount to the value of one programmer year because of the low rate of multilingual use.
I understand the frustration that comes from being unable to influence a manufacturer who makes something critical for your life, because many of the available mobility assistance products seem to be designed by a lawsuit rather than for the needs of the target population. So maybe the current strategy of reacting to what a manufacturer presents is not the most productive way to pursue this goal.
I have two not especially promising suggestions, but maybe they will spur some more useful contributions.
In the first place, it might be useful to write a proactive article illustrating the workflow of an SR multilingual user. In other words, start with a description of how you work -- to make sure you get all the details -- and then overlay notes about exactly how a multilingual SR product should be designed to facilitate that workflow. The suggestion has at least two practical benefits. On the one hand, it allows the program designer to understand exactly what practical design points to aim for. On the other hand, it may give him intermediate targets that would be economical to implement without necessarily doing a complete rework. Both the manufacturer and the user would benefit from this closer meshing of interests and objectives.
In the second place, it might be worthwhile to promote the use of SR software amongst target populations of potential users. In other words, leverage the demand side of the equation. Also work to increase the general rate of multilingualism. I would note that aside from a brief blip in the post-Sputnik era, foreign language training in the US has been woefully negligent, largely because English has become the lingua franca of international discourse, which is not necessarily as great an advantage to US national interests as it might seem at first glance.
HTH,
Bruce
Itamar
Itamar Even-Zohar
itamar@even-zohar.com
Bruce,
Thank you for a most useful contribution and for your sympathetic approach. I am not sure that multilingualism is not requested by a good many users. No one has really investigated the current and the potential markets. I know that in the European Union alone, millions of Euros are spent on translations to some 7-8 languages, and there are quite a few people engaged there in producing multilingual documents. They may not currently use speech recognition, but they could be persuaded to try it if it offered them a working solution.
DNS "multilinguality" (language multiplicity) is acceptable though far from ideal. I actually like the neat division of labor between "users". This allows easy management of different profiles, even for the same language. I deplore the fact that the MS SR team has decided to eliminate "users" altogether in Windows Vista Speech. As for language switching, Windows Vista provides a grotesque procedure. There is really no nicer word to describe it. No one can seriously accept it as a viable method. As for writing a document about multilingualism, I have actually done that long ago, and I had then an interesting exchange with David Mowatt, who was then the head of the SR team (now replaced by Rob Chambers). It's on my speech website: http://speech.even-zohar.com.
I believe Chuck has been very helpful in convincing Nuance to allow multilinguality (blocked in version 8) to work in version 9, for which I am very grateful. Naturally I wish they would improve it a bit and make the purchase of language modules more accessible (when you buy extra versions, only the additional language module is installed, so most of the stuff on the CD is not even used).
Itamar Even-Zohar
My problem vis-a-vis your situation is that I don't understand why a translator would need more than one language at a time, no matter how many languages s/he might use. Seems to me you would be translating from one language to another, so I don't understand how you would need to dictate into a third language.
Maybe the developers have the same problem.
So I think to motivate progress you need to specify the problem in more detail and try to work with the developers to achieve intermediate steps towards the ultimate end.
Bruce
Itamar
Itamar Even-Zohar
itamar@even-zohar.com
Bruce,
You are absolutely right: a translator does not need more than one language (at a time, at least). I am not a translator and do not claim to represent the needs of translators. I was referring to people who need to create texts in several languages, whether as different documents or within one and the same document. I know various people who write in both French and Spanish, German and Dutch, Dutch, English and French, and so on. I myself need to create texts in English, Spanish, and (less frequently) French. These people would like to have the various languages easily accessible to them.
The DNS architecture for switching between languages is not a bad solution, although it takes time to unload a user and load a different one. It normally takes about 1 minute, which is not at all bad. Of course it IS bad if you need to dictate one passage in English, then dictate a quotation in French, then go back to English. You may end up deciding to type the quotation rather than dictate it, although even if you are a better typist in English, the French accents cannot be typed with the US/UK keyboard, and using the French keyboard (which is AZERTY and has all sorts of peculiarities) is a nightmare. (The solution *I* use for typing French is to use a Spanish keyboard, which is QWERTY and has all accents relatively easily accessible [with the ALT- key]). I can tell you that for non-French people, dictation is the QUICKEST way of producing a French text correctly. Of course it would have been much better to have a more flexible method for switching, but as Chuck pointed out, it is not likely to be implemented in the near future, and it would therefore be futile to request it now.
A different issue is the marketing of the DNS languages, which is a painful business. Right now the marketing system is quite mediaeval: each language is purchasable from some local national agency, so one must order those versions from various different countries. This has some bearing on the way the CDs are prepared. Each is prepared as a stand-alone program, even though when you install them one on top of the other, only a small part of the materials is used, namely the extra language module. Once the engine (kernel) has been installed by the first program you selected, the other programs are not re-installed in full; only the language module is installed. The size of such modules is quite manageable. It is quite ordinary today to have 400-700MB applications purchasable by download (Adobe, Roxio, Sony etc.), and the DNS modules are much smaller than that. It is such a nuisance to order 4-5 different languages from different agencies. I thought that DNS would also do better commercially if, side by side with these full-fledged versions, the extra language modules could be bought from the e-store. One would then buy one major program and then additional modules by download. To do this, the company must also change the structure of the product, which it is not willing to do right now, because of the assumption (which I find unsubstantiated) that there is not enough of a market to justify this.
Itamar Even-Zohar
PS. There is a slight comfort in the fact that if one buys the Dutch package, one gets 4 languages bundled together: Dutch, English (all 5 varieties), French, and German. However, Spanish and Italian must be purchased separately, each in a different country (both are bundled with English, so you get English 3 times if you buy the Dutch, the Spanish, and the Italian package). However, not everybody can handle the Dutch package, which by default runs the installation process in Dutch (even if you choose not to install the Dutch module itself).
Itamar,
I can only speak for the UK, but here it is perfectly possible to purchase all European language versions. There may be delays as the distributor only carries, at most, 1 or 2 copies of each language (an indication of predicted sales) and will only restock from manufacturing on demand. All versions with the exception of Japanese are, however, available.
Graham
www.itspeaking.co.uk
Itamar
Itamar Even-Zohar
itamar@even-zohar.com
Graham,
Thanks for the information. In the UK, the prices are always much higher than anywhere else, though, so I have been trying to avoid it. As I always buy the Dutch package, a very reliable store in the Netherlands provided me with this package and with the Spanish package. I still believe that a more flexible architecture and a more flexible sales policy will enhance sales and revenues for Nuance.
BTW, have a look at my report on multiple languages in the latest Windows Vista build (RC2): it's now closing the gap with DNS and deserves to be tested.
Itamar
Itamar,
I haven't explored current DNS 9 pricing in the EU. Partners were told by Nuance that the reason for the £83 SRP "price hike" of DNS 9 Professional was to bring UK prices in line with the EU where previously our prices were lower! Many UK Nuance Partners do discount the SRP but for language versions other than English there are no economies of scale. I have, for example, only had two requests for EU language versions of DNS 9 since launch. Perhaps a sad reflection on the multi-lingual requirements of UK users.
Graham