Automated Translation for Technical Documentation—Can it Deliver on the Promise?

 

CIDM

October 2009



Sophie Hurst, SDL

In the past few years, we have seen growing interest in using automated translation in a business environment. Until recently, most automated translation implementations were in government and defense, but interest has gradually been rising among corporations that see the value it can add to their organizations.

This article looks at the different uses of automated translation, how it is adding value to technical publications, and how your teams can prepare content for automated translation. By preparing well, you can directly support your company’s strategic objective of reaching global markets faster. XML and DITA are well suited to automated translation because the formatting is separated from the words, provided that the software you choose handles the formatting tags correctly.
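To illustrate why separating formatting from words helps, here is a minimal sketch in Python. It assumes a small DITA-like paragraph and a stand-in function, fake_translate, which is hypothetical and simply upper-cases text so the effect is visible; a real system would call a translation engine instead. Only the text is handed to the translator, while element names and attributes pass through unchanged.

```python
import xml.etree.ElementTree as ET


def fake_translate(text: str) -> str:
    """Hypothetical stand-in for a translation engine: upper-cases the text."""
    return text.upper()


def translate_text_nodes(element: ET.Element, translate) -> None:
    """Translate text in place, leaving tags and attributes untouched."""
    for node in element.iter():
        # Text that appears directly inside the element.
        if node.text and node.text.strip():
            node.text = translate(node.text)
        # Text that follows the element's closing tag.
        if node.tail and node.tail.strip():
            node.tail = translate(node.tail)


# Assumed sample content; the element names follow DITA conventions.
source = '<p id="intro">Press the <uicontrol>Start</uicontrol> button.</p>'
root = ET.fromstring(source)
translate_text_nodes(root, fake_translate)
result = ET.tostring(root, encoding="unicode")
print(result)
```

This sketch translates each text fragment separately, which keeps the example short. Production tools more commonly keep inline tags as placeholders inside the whole sentence so that the engine sees complete sentences.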

The History of Automated Translation and Where We are Today

Automated translation had its debut in the 1950s. A computer operator entered Russian text into an IBM mainframe computer. A little later, the printer started producing fully understandable English content. News travelled fast; there was much excitement about how automated translation could be used in the future. Over time, technology development and budgets to support that development slowed, but now with the power of computers and renewed investment in this technology, automated translation is finding its place in the business environment.

Recent research by different organizations clearly points to an increased interest in the business applications for automated translation. SDL conducted a survey at the end of 2008; the results are available on the SDL website. More than 360 people from global organizations completed the survey. When asked how likely they were to use automated translation compared with two years earlier, two thirds said they were either more likely or just as likely. That is a substantial industry shift and shows a significant growth of interest and trust in this technology and its uses.

Different Technologies for Different Purposes

We have different types of automated translation technologies and different ways of building a business solution using the technologies. Most of us are aware of “gist” translations, where the output from the automated translation software gives you a basic understanding of the content in another language. For many people, their only experience of automated translation is the free solutions on the web, which can sometimes produce interesting results.

In a business environment, the software can be “trained” to improve the translation quality for the types of content that are relevant to a specific organization. The resulting increase in quality has enabled many companies to start using automated translation for support content, so information that would otherwise not have been translated is now available to help customers receive support in their own languages.

Following best practices, organizations take a step beyond basic machine translation to achieve higher quality. Humans do what is called “post-editing” to correct the output from automated translation and make it as good as a human translation. Many organizations such as Chrysler, RS Components, and Renault have successfully increased translation productivity by using post-editing for technical documentation. They have reduced the time and cost of delivering multilingual content while still delivering the quality required by adding the human element to an automated translation process.

Statistical and Rules-Based Systems

Several methods are available to support the automated translation of words. The two key technologies use either statistical methods (Statistical Machine Translation) or rules and grammar methods (Rules-based Machine Translation). Combinations of the two are also evolving.

The statistical method uses probability to create a translation. If, for example, two or three words frequently occur in the same order in the source and translated content, it is likely that they are being translated correctly. The rules-based method uses software that understands a language and its grammar to make a translation. There are advantages and disadvantages of each system in different applications; the choice depends on your particular requirements, but that is a topic to be covered another time.
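The intuition behind the statistical method can be sketched with a toy example in Python. This is not how a production statistical engine works; it is a simplified, word-level illustration that assumes a tiny hand-made set of aligned English and French sentences. It counts how often each source word appears alongside each target word and scores the pairing with a Dice coefficient, so that words that nearly always appear together score highest. The function name and the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def likely_translations(aligned_pairs):
    """Guess a target word for each source word from co-occurrence counts."""
    source_count, target_count, pair_count = Counter(), Counter(), Counter()
    for source, target in aligned_pairs:
        source_words = set(source.lower().split())
        target_words = set(target.lower().split())
        source_count.update(source_words)
        target_count.update(target_words)
        # Every source word co-occurs with every target word in the pair.
        pair_count.update(product(source_words, target_words))

    def dice(s, t):
        # 1.0 means the two words always appear together, 0.0 means never.
        return 2 * pair_count[(s, t)] / (source_count[s] + target_count[t])

    return {s: max(target_count, key=lambda t: dice(s, t)) for s in source_count}


# Assumed toy corpus of aligned sentence pairs.
corpus = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]
guesses = likely_translations(corpus)
for word, guess in sorted(guesses.items()):
    print(word, "->", guess)
```

With more aligned content, the same idea extends from single words to the two- and three-word sequences described above, which is why the volume and consistency of existing translations matter so much to statistical systems.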

Whichever technology you choose, the best results come from fully integrating automated translation into your existing business processes for delivering multilingual content. You should combine it with translation memory and translation management and also prepare for it within your authoring processes.
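As a rough illustration of combining translation memory with automated translation, the Python sketch below reuses a stored human translation when the segment has been translated before and falls back to a machine engine otherwise, marking that output for post-editing. All names here (translate_segment, stub_engine, the sample memory) are hypothetical, and the sketch only handles exact matches; commercial translation memory tools also match near-identical segments.

```python
def translate_segment(segment, memory, machine_translate):
    """Return (translation, origin) for one source segment."""
    if segment in memory:
        # An approved human translation already exists, so reuse it.
        return memory[segment], "translation memory"
    # No stored match: use the engine and flag the result for review.
    return machine_translate(segment), "machine translation, needs post-editing"


def stub_engine(segment):
    """Hypothetical stand-in for an automated translation engine."""
    return f"[MT] {segment}"


# Assumed sample memory of previously approved translations.
memory = {"Press the Start button.": "Appuyez sur le bouton Démarrer."}
segments = ["Press the Start button.", "Close the cover."]
results = [translate_segment(s, memory, stub_engine) for s in segments]
for translation, origin in results:
    print(f"{translation}  ({origin})")
```

Checking the memory first means previously approved translations are never overwritten by machine output, and post-editors see only the segments that are genuinely new.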

So How Does All of This Affect Content Creators?

If your company might use automated translation in the future, it is important to start preparing now. As we know, the better the source language in consistency and quality, the better the output in different languages. Automated translation is particularly sensitive to source quality. You can take several actions when writing to improve source quality. A wide range of technology can also help you be successful.

When writing any content for a global audience, you must consider that audience from the start. You must avoid referencing information that is relevant only to the area or country you live in. For example, holidays such as Thanksgiving or Christmas are not celebrated everywhere. Date formats are often specific to different countries. Being aware of the differences or using technology to keep your writing consistent is important.
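Date formats are a good candidate for an automated check. The Python sketch below, offered as one possible approach rather than a feature of any particular product, flags all-numeric dates written with slashes, because a date such as 03/04/2009 is read as March 4 in the United States and as 3 April in the United Kingdom. It then shows the same date in the unambiguous ISO 8601 form. The function name and the sample sentence are assumptions.

```python
import re
from datetime import date

# Matches numeric dates such as 03/04/2009 or 3/4/09.
AMBIGUOUS_DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")


def find_ambiguous_dates(text):
    """Return every slash-separated numeric date found in the text."""
    return AMBIGUOUS_DATE.findall(text)


draft = "The update ships on 03/04/2009. Support ends 2010-06-30."
flagged = find_ambiguous_dates(draft)
print(flagged)

# If the writer means 3 April 2009, the ISO 8601 form removes the doubt.
unambiguous = date(2009, 4, 3).isoformat()
print(unambiguous)
```

Spelling out the month name in running text is another way to remove the ambiguity before the content reaches translation.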

Grammatical correctness also helps. It is best practice in any case, and it improves the results of both human and automated translation. Avoiding ambiguous pronouns or multiple successive nouns can help a machine understand how best to translate. Ensure that articles are used so that words that are homographs are clearly marked as nouns or verbs. For example, “place” can mean either “to place” or “a place” depending on the context. Where a human can usually work out the context, a machine has more difficulty. So, don’t leave out the articles (a, an, the).

Managing terminology from the start is also an excellent process to institute within your organization. Terms are important for your company, and their use affects both customer satisfaction and the customer’s perception of your brand. Defining terms, storing them centrally so that everyone in the organization can access them, and using technology that automatically alerts content creators when they use a term incorrectly will help ensure consistency throughout your content.
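The kind of automatic alert described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a central term list that maps each deprecated term to its preferred replacement; the function name, the sample terms, and the draft sentence are all hypothetical, and commercial terminology tools do considerably more, such as handling inflected forms.

```python
import re


def check_terminology(text, deprecated_terms):
    """Return (term as written, preferred term) for each deprecated term used."""
    findings = []
    for deprecated, preferred in deprecated_terms.items():
        # Whole-word, case-insensitive match on the deprecated term.
        pattern = re.compile(rf"\b{re.escape(deprecated)}\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            findings.append((match.group(0), preferred))
    return findings


# Assumed central term list: deprecated term -> preferred term.
termbase = {"hard disk": "hard drive", "log-in": "sign in"}
draft = "Back up the Hard Disk before you log-in again."
alerts = check_terminology(draft, termbase)
for written, preferred in alerts:
    print(f'Replace "{written}" with "{preferred}".')
```

Running such a check while authors write, rather than after translation, means each inconsistency is fixed once in the source instead of once per target language.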

You can use a combination of process and technology to help you on your way. Many organizations are becoming more experienced and successful with automated translation solutions. There is an increasing number of case studies from which you can learn how to start applying the technology in your organization.

The Impact on the Organization and Its Customers

Automated translation has promised much over the years, but only now is it starting to deliver on its promise. As the volume of content continues to grow and as companies operate on an increasingly global scale, automated translation provides many opportunities to meet new business requirements. Integrating automated translation into your business processes for translating support content and technical documentation will lead to the greatest success. Using the right process for the output you require is crucial, and the right process covers more than translation alone. The right global content creation process can be leveraged from authoring through to translation and publishing to reduce the time and cost of delivering the best quality content for worldwide customers.

About the Author

Sophie Hurst
SDL
shurst@sdl.com

Sophie Hurst is the Director of Product Marketing at SDL, responsible for all of its language technology product lines, which help improve the end-to-end process of delivering global content. She speaks five languages, is a member of the Chartered Institute of Linguists, and through her experience at various IT companies has gained an excellent understanding of the cultural, linguistic, and business challenges faced by organizations doing global business.