Eric Severson, Flatirons Solutions
In the two prior articles of this series, we looked at DITA best practices from the author’s and content manager’s point of view. This included consideration of what it takes to think in terms of topics rather than books, how to manage content variations and reuse, how to optimize the use of DITA content models, and how to implement topic-based review and publication processes.
In this final article, we’ll turn to the globalization manager’s point of view, looking first at how DITA can be used to minimize translation costs and then at how to go beyond translation to fully globalize your DITA-based content.
Minimizing Translation Costs
In a multi-language environment, language translation can easily be the most expensive part of producing new and updated content. It is also often the root cause of updates coming out late.
In general, two things drive the cost and complexity of translation:
- the number and types of languages being supported
- the amount of content (more precisely, the number of unique sentences) to be translated
DITA can’t change the first factor, but it can have a major impact on the second. To understand why, let’s look a little deeper at how translation works. In any modern translation process, special software called a translation memory examines all new content to see which sentences have already been translated. Only sentences that haven’t been encountered before are sent for translation and incur translation cost.
This process means that not only is unchanged content ignored but so is any redundant content in the same document or across other documents in the set. For example, the standardized phrase “Company XYZ makes no warranties and will not be held liable”—even though it may occur hundreds or thousands of times in the content—is translated only once.
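The lookup described above can be sketched in a few lines of Python. This is only a toy, exact-match model: the `TranslationMemory` class, the naive sentence splitter, and the sample sentences are all assumptions made for illustration, and commercial tools are considerably more sophisticated (for example, they typically also report near matches).

```python
import re


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


class TranslationMemory:
    """Toy exact-match translation memory (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}  # source sentence -> stored translation

    def add(self, source, target):
        self._store[source] = target

    def sentences_to_translate(self, text):
        """Return each unique sentence in `text` that has no stored translation."""
        pending = []
        for sentence in split_sentences(text):
            if sentence not in self._store and sentence not in pending:
                pending.append(sentence)
        return pending


tm = TranslationMemory()
tm.add(
    "Company XYZ makes no warranties and will not be held liable.",
    "<stored translation>",  # placeholder, not a real translation
)

document = (
    "Company XYZ makes no warranties and will not be held liable. "
    "Install the update before restarting. "
    "Company XYZ makes no warranties and will not be held liable. "
    "Install the update before restarting."
)

# The warranty sentence is already in memory, and the repeated new
# sentence is listed only once.
print(tm.sentences_to_translate(document))
```

Only the sentences returned by `sentences_to_translate` would incur cost in this model, which is why the number of unique sentences matters more than the raw volume of content.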
The problem, though, is that supposedly redundant content is often not quite redundant, and even one word’s difference will cause sentences to be viewed as unique and separately translated. Consider the following four paragraphs, each explaining the same concept but separately maintained by different authors:
Postal code denotes the specific mail delivery region. Postal code is required.
Postal code is defined as a specific mail delivery region. Postal code is a required field.
The mail delivery region is indicated by the postal code. Postal code is a mandatory field.
Postal code is used for the specific mail delivery region. Please note that postal code is a required field.
This example results in eight unique sentences that must be translated. If, instead, this text were maintained as a single standard DITA topic and reused in each of the four contexts, only the two sentences in the common topic would need to be translated. Reuse would cut the number of sentences sent for translation from eight to two, a 75% savings in translation costs! Imagine that applied across your whole body of content!
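The arithmetic behind that figure can be checked with a short Python sketch. It assumes that cost is proportional to the number of unique sentences (real pricing is usually more nuanced) and that the first paragraph is the one kept as the shared topic; both are simplifications made for illustration.

```python
import re

# The four separately maintained paragraphs from the example above.
paragraphs = [
    "Postal code denotes the specific mail delivery region. "
    "Postal code is required.",
    "Postal code is defined as a specific mail delivery region. "
    "Postal code is a required field.",
    "The mail delivery region is indicated by the postal code. "
    "Postal code is a mandatory field.",
    "Postal code is used for the specific mail delivery region. "
    "Please note that postal code is a required field.",
]


def unique_sentences(texts):
    """Collect the distinct sentences found across all of the given texts."""
    seen = set()
    for text in texts:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if sentence.strip():
                seen.add(sentence.strip())
    return seen


# Case 1: every context has its own wording.
separate = len(unique_sentences(paragraphs))

# Case 2: one topic (here, arbitrarily, the first paragraph) reused four times.
shared = len(unique_sentences([paragraphs[0]] * 4))

savings = 1 - shared / separate
print(separate, shared, f"{savings:.0%}")  # 8 2 75%
```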
Applying the Same Idea to Common Terminology
The same idea can be applied to the standard terminology that appears throughout your content, such as company names, product names and features, legal terms, and so forth. By using DITA keyword elements to represent these terms, rather than “hard-coding” the actual text, changes to these terms can be managed from a single place and propagated automatically through the content. By mapping these DITA keywords to a translation terminology base, only the changed term needs to be re-translated, not all of the individual sentences that contain it.
This technique needs to be used cautiously, however. Especially when translating to inflected languages (such as most Slavonic and Germanic languages), the indiscriminate use of replacement terminology can result in ungrammatical translations. The OASIS DITA Translation Subcommittee recommends (see note 1) that direct replacement be used only for things like product names that are not translated, or for terms that are the subject of the sentence and in the nominative case.
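To make the keyword idea concrete, here is a minimal Python sketch. It assumes a `keyword` element that carries a `keyref` attribute (a mechanism available from DITA 1.2) and a hypothetical in-memory term base called `TERMS`. The product name `ExampleProduct`, the sample sentence, and the `resolve_keywords` function are invented for illustration; a real pipeline would resolve keys through DITA maps and a terminology tool rather than a Python dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical term base: key -> text to insert for each language.
# In line with the caution above, the only entry is a product name
# that is deliberately left untranslated.
TERMS = {
    "product-name": {"en": "ExampleProduct", "de": "ExampleProduct"},
}

# A sentence that refers to the term by key instead of hard-coding it.
TOPIC = '<p>Restart <keyword keyref="product-name"/> after the update.</p>'


def resolve_keywords(xml_text, lang):
    """Fill each <keyword keyref="..."> with the term stored for `lang`."""
    root = ET.fromstring(xml_text)
    for keyword in root.iter("keyword"):
        key = keyword.get("keyref")
        if key in TERMS:
            keyword.text = TERMS[key][lang]
    return ET.tostring(root, encoding="unicode")


resolved = resolve_keywords(TOPIC, "en")
print("".join(ET.fromstring(resolved).itertext()))
```

If the product were renamed, only the `TERMS` entry would change in this sketch; the sentence that references the key would not need to be touched or re-translated.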
A Case Study
The cost savings associated with these approaches can be dramatic. As reported by my colleague Ben Martin (see note 2), J.D. Edwards was able to save over $350,000 after its former one-off technical manuals were transformed into a set of shared, reusable topics.
Note that over $200,000 of the savings was in translation costs alone. This represented a whopping 70% reduction in translation and localization expense!
The remaining cost savings were distributed as follows:
- $96,000 in authoring costs (90% reduction)
- $29,000 in editing costs (70% reduction)
- $24,000 in QA and production costs (90% reduction)
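For readers who want to see what these percentages imply, the following Python sketch works backward from the reported figures. It rests on one assumption that the source does not state explicitly: that each percentage is the savings divided by the previous cost of that same category. The implied "before" and "after" amounts it prints are therefore rough estimates, not reported data.

```python
# Reported in the article: (savings in USD, fractional reduction).
reported = {
    "authoring": (96_000, 0.90),
    "editing": (29_000, 0.70),
    "QA and production": (24_000, 0.90),
}

# Total of the non-translation savings listed above.
non_translation_savings = sum(saved for saved, _ in reported.values())
print(non_translation_savings)  # 149000

# Assumption: reduction = savings / previous cost, so
# previous cost = savings / reduction.
implied = {}
for name, (saved, reduction) in reported.items():
    before = saved / reduction
    implied[name] = (before, before - saved)
    print(f"{name}: roughly ${before:,.0f} before, ${before - saved:,.0f} after")
```

Adding the $149,000 computed here to the translation savings of over $200,000 gives a total in line with the overall figure of roughly $350,000.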
When terminology is standardized, translation costs can be even further reduced. For example, J.D. Edwards found that the following relationships were true:
Globalization as More than Translation
Finally, it’s important to understand that globalization means more than language translation. For example, while nutritional labels for the U.S., the U.K., and Australia may all be in English, each country has its own very different legal and regulatory requirements. In general, all of the following can play into a complete globalization strategy:
- market differences
- product line differences
- cultural differences
- legal differences
- regulatory differences
In DITA, these differences could be reflected in separate topics that vary by geography, each of which is translated into the appropriate language(s) for that geography. Or, if the variations are small and pinpointed to certain areas of the content, they might all be included in the same topic. In this case the topic would probably be translated into all languages but would be filtered for the specific content relevant to each geography. Practical situations would most likely include a mixture of both approaches.
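The second approach, a single topic filtered by geography, might look something like the following Python sketch. It assumes that region-specific paragraphs are marked with DITA's `otherprops` conditional attribute using hypothetical values such as `us`, `uk`, and `au`; the topic content and the `filter_for_region` function are invented for illustration. In practice this kind of filtering is normally handled by the publishing toolchain using a DITAVAL file rather than by custom code.

```python
import xml.etree.ElementTree as ET

# One topic holding content for several geographies (hypothetical text).
TOPIC = """<topic id="label">
  <title>Product label</title>
  <body>
    <p>Store in a cool, dry place.</p>
    <p otherprops="us">Hypothetical US-only statement.</p>
    <p otherprops="uk au">Hypothetical UK and Australia statement.</p>
  </body>
</topic>"""


def filter_for_region(xml_text, region, attr="otherprops"):
    """Drop every element whose `attr` value list does not include `region`.

    Elements without the attribute apply to all regions and are kept.
    """
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        for child in list(parent):
            values = child.get(attr)
            if values is not None and region not in values.split():
                parent.remove(child)
    return root


uk_topic = filter_for_region(TOPIC, "uk")
uk_paragraphs = [p.text for p in uk_topic.iter("p")]
print(uk_paragraphs)
```

The same source topic, translated once into each required language, could then be filtered per geography at publication time.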
Here’s an example showing how all of this might come together for a product label:
- Product Description: Used in all geographies; topic translated to all languages
- Product Usage Information: Depends on cultural differences; each topic translated only to the relevant languages
- Nutritional Information: Depends on each country’s laws and regulations; each topic specific to a country and language
We hope you’ve enjoyed this three-article series on DITA best practices and have found it useful in helping you both understand and use DITA in your organization. Hopefully, the ideas will also help you achieve the kinds of dramatic cost savings that we’ve outlined in these articles. Good luck, and of course, don’t hesitate to ask for expert help if you need it!
Flatirons Solutions specializes in content management and XML-based publishing and is located in Boulder, Colorado. Eric is a co-founder and Chief Technology Officer.
1 See Zydron, Andrzej. 4 March 2008. An OASIS Whitepaper: Best Practice for Using the DITA CONREF Attribute for Translation.
2 Ben is currently Director of Business Architecture at Flatirons Solutions and was formerly VP of Content Management at J.D. Edwards.