Translation Models: A 180° Change
If your organization spends thousands or millions producing manuals, reports, and online help documentation, you should be alarmed. Chances are that your customers never read your high-quality documents.
Users are taking charge. Instead of passively consuming the information sent to them, they are searching the Internet to find the exact data that meet their needs. Users, citizens, patients…they no longer take “no” for an answer and want the facts now. Information traffic is changing dramatically, and traditional publishing models no longer suffice. The publisher-centric model is being replaced by the user-centric model. The one source of information provided by the product manufacturer, the government, the doctor, or the hospital is now being replaced by dozens, if not hundreds, of alternative and competing information sources. Tips and tricks from other users, prescriptions from multiple healthcare organizations, and analyses of government data from private sources may be much more valuable than the “authoritative” information from the original publisher. The data-centric model is being replaced by the network-centric model.
This information revolution, led by Google, has entered our everyday lives at work and at home. We have become used to pulling information from our favorite sites when we need it. The direction is changing from push to pull or, in technical terms, from the dissemination model to the assimilation model. The power and consequences of this information revolution should not be underestimated. Documentation managers are under pressure to shorten cycle times. Increasingly, they find themselves in competition with internal customer service organizations that provide instant access to information through call centers and online knowledge bases. What started off as a suggestive trend is now turning into a serious business model, often referred to as web-based customer self-service. As always, customer self-service costs less for the “shop-owner.” And the small shops will follow the big ones or go out of business.
Localization Manager Left on the Sidelines
In this information turbulence, localization managers feel left on the sidelines. Whereas time was always in short supply when producing a decent translation of a manual in the “old” way of doing things, it is now virtually disappearing. Moreover, the decision as to what to translate and what not to translate is becoming more difficult. The ocean of information keeps growing and there is no easy profile of the “user.” Users have found their own way to automatic translation tools on the Internet. Millions of them press the Yahoo and Google translate buttons every day in an attempt to understand web pages and email messages written in a language they do not know. The translation model is changing 180°.
Microsoft Machine Translation
Web-based customer self-service in multiple languages calls for new technology and a level of automation that is both rare and revolutionary in the conservative translation market. But several user cases prove that it works. Microsoft has reported successes in deploying a home-grown machine translation system for real-time delivery of articles from its knowledge base in the native language of the Microsoft customer. The knowledge base contains more than 140,000 product support articles, adding up to 80 million words of text—in English only, of course. As reported at the AMTA (Association for Machine Translation in the Americas) conferences in 2002 and 2004, the automatic translation of support articles gets high scores from the many thousands of users. According to the report in 2004, overall customer satisfaction for the Spanish translations was as high as 86 percent. Spanish users were so happy to be able to access the information in their own language that they were willing to overlook the fact that the quality was lower than that of human translations. The “usefulness” rating of the Spanish machine-translated articles was reported to be 50 percent, compared to 54 percent for the English language articles. Usefulness is defined here as the percentage of customers feeling that an article helped solve their problem.
Bill Gates: MT “big breakthrough”
These successes, notched up by the Microsoft Natural Language Processing Research group, were significant enough for Bill Gates to list machine translation as one of the “big breakthroughs” in the next decade.
By having that kind of translation capability, it means that even people who don’t speak English will now have the full breadth of all the material out on the web that they don’t have access to. And that translation technology will be applied to more and more language pairs, more and more domains, so over the next decade we’ll kind of think back and say, yes, there was a time where that couldn’t be done. There are a few domains, like translating poetry, that perhaps will never be very good, but for most informational content on the web within this time frame it will be quite excellent, and a very plausible thing there. (Bill Gates, quoted from a speech at Princeton University in New Jersey on October 14, 2005)
User-driven translation model
Microsoft is not the only company moving towards the “user-driven translation model.” Many companies provide an automatic translation feature on their internal networks to allow their international work force to translate company information and training material, communicate better, and share more information. DaimlerChrysler offers automatic translation between English and German for its Detroit and Stuttgart operations. Volkswagen provides an automatic translation feature between German, Spanish, and English. The automatic translation facilities on company intranets (compared to the publicly available “translate buttons”) usually return better quality translations as a result of some limited customization with company-specific terminology. The other advantage is that emails with confidential company information will no longer leave the protected company networks for an automatic translation via one of the public Internet portals.
In a world overflowing with information, the decision as to what to translate and what not to translate becomes harder and harder. The most pragmatic approach to this localization dilemma is to translate the bare minimum upfront and to let further translation efforts depend on what customers really need. The bare minimum translation effort may be dictated by regulatory and government requirements, such as labeling on medical devices. To better respond to the information and support requirements of customers and citizens, companies and governments are finding refuge in “e-programs” such as e-support, e-health, and e-government. Sophisticated software tools are deployed to channel information searches, manage question answering, and prepare the required information as smoothly as possible.
In a variation on the Microsoft model, CNH, a global manufacturer of agricultural machinery, uses machine translation to convert incoming support requests in multiple languages into English so that its engineers can respond more quickly. Fixes to problems are documented by the CNH engineer, then machine-translated, and, after “light” post-editing, sent back to the dealer. SAP has also started to use machine translation to translate support tickets before routing them to the worldwide support centers.
Traditional translation too costly
In a recent paper presented at the Centre for Translation and Textual Studies at Dublin City University, Symantec reported on its tests of controlled language rules in combination with machine translation, aimed at allowing its worldwide subscribers to receive translated virus alert notifications with minimal delay. The traditional translation workflow would be far too costly to handle this information flow. But, more importantly, the inevitable time lag in a conventional translation process between the English and translated alert notifications would make the information obsolete and useless to the subscribers. Initial results show that with light, rapid-turnaround post-editing, the machine translation output scores an accuracy level of around 90 percent. According to a recent survey among MT practitioners, time and cost reductions as a result of the use of MT and post-editing services are on average 30 to 40 percent (TAUS Report on Best Practices in Post-Editing Machine Translation, February 2006).
The localization dilemma takes on yet another dimension in the world of intellectual property and patents. Imagine that you have invented something and you want to check whether something similar already exists; you will have to search databases with millions of patent documents in multiple languages. This task is not humanly possible. Following the example of the Japan Patent Office (JPO), the European Patent Office has decided to make machine translation directly available to the general public. The system will be customized with hundreds of thousands of technical terms extracted from the many different domains covered by the International Patent Classification. In comparison to translation automation initiatives in the commercial sector, this is a very ambitious project. But as evidenced by the success of the JPO, the results will be enormously beneficial. The project will generate big cost savings and help accelerate innovation on a European and worldwide scale.
Shake off old-time principles
The localization dilemma will lead to a rapid increase in tests and real user cases for the user-driven translation model. It is a natural consequence of the mega-trends governing the global information society. What these tests and real cases keep telling us is that the rules of information publishing are undergoing a fundamental change. We cannot implement a user-driven translation model while still applying old publisher principles. Publishers insist on High Quality Translation, but users are happy enough with Useful Translation. The old-time publisher wants to control every word and every sentence. The user would be perfectly satisfied if the publisher merely provided an adequate translation infrastructure.
The translation model is changing 180°. The degree to which we are ready to respond to user requirements depends on whether we are able to shake off traditional ways of thinking. Rather than controlling every step in the process, we need to facilitate a translation infrastructure that works well enough for citizens, users, and patients: what we could call “pretty good translation,” modeled on the “Pretty Good Privacy” approach to encryption, which emerged at the beginning of the web. That said, we will of course continue to translate minimally required user interfaces in a fully controlled process.
FAUT (Fully Automatic Useful Translation)
In contrast with the defeat of FAHQT (Fully Automatic High Quality Translation) in the 1960s, a new movement has emerged that we have baptized FAUT (Fully Automatic Useful Translation). The new FAUT vision is gradually replacing the old preoccupation with the language barrier. The difference is in how you look at it: is the glass half full or half empty? The fact is that translation automation is progressing rapidly with the implementation of translation workflow systems, the centralization of translation memories, and the development of new statistical machine translation systems. With Microsoft, Google, and Yahoo joining the machine translation development community, innovation and improvement will accelerate exponentially. The only way to leverage and benefit from this progress is by sharing best practices and user cases, perhaps even sharing linguistic data like terminology and translation memories. This is the TAUS Vision that our Founding Members subscribe to.
About the Author
Jaap van der Meer
Jaap van der Meer is founder and director of TAUS (Translation Automation User Society). A veteran of the translation industry, he started the translation company INK in 1980 in The Netherlands and was CEO and President of ALPNET in the nineties. He was one of the founders of the Localization Industry Standards Association (LISA). He is a member of the Localization World management team and a regular speaker at conferences.