Bill Rabkin
Idiom Technologies, Inc.

Today, many global organizations publish information such as technical documentation, online help, and web content in many languages. Often, they experience a delay between the availability of content in its original language and the availability of that same content in other languages. We call this delay the globalization gap.

This gap results from the approach to content creation that many organizations use. They complete documents in a first language and only then do they begin translation into other languages. Many organizations direct their initial translation efforts to a small number of “Tier 1” languages, deferring translation into other “Tier 2” languages until they have completed this first set of translations. Information may not become available in all languages until months after the original content is released.

The globalization gap causes a delay in availability of new or updated product releases in international markets whose language differs from the one in which the original material is written. This delay can have a significant impact on international revenues.

A one-billion-dollar multinational software company in the United States reported that it typically experienced a 3- to 6-month delay in delivering new releases of its products to European and Asian markets. The company estimated that, during this delay, at least 10 percent of its international prospects waited until the new release was available in their native languages before making a purchase, and that at least 1 percent purchased a competing product that was already available in the desired language. As a result, the company estimated that some US$11 million in revenue was delayed or permanently lost.

Closing the globalization gap brings higher-quality products to market faster and at lower cost, which creates an opportunity to accelerate international revenues. To close the globalization gap, we believe that companies must implement a centralized system for authoring, globalizing, and producing documents that is based on modern XML standards.

With an integrated, server-based system that manages the entire documentation lifecycle—including globalization—organizations can achieve significant savings in time spent in each phase: authoring and editing, document and globalization management, translation and review, and composition or production. A centralized system also allows organizations to perform each of these operations concurrently, further accelerating the delivery of information in multiple languages and formats.

Let’s explore how a centralized global publishing system raises quality and accelerates delivery. In the past, translation management involved manual operations, including many file handoffs among different groups within an organization and between the organization and its external translation vendors. In addition to the actual work of translation and review performed by specialists, it included file analysis and cost quotations prepared by vendors, approval by content owners, project creation and monitoring, composition or desktop publishing (DTP), and manual synchronization of Translation Memory databases between language vendors and their customers.

With a server-based system that provides substantial business process automation, centralized Translation Memory, and built-in workflow and support for authors, translators, reviewers, project managers, and publishers, almost all of these operations can be performed automatically. Only the steps that require human intellectual contribution—information development, translation, and review—remain to be performed by contributors to the effort. Often, it is possible to automate as much as 90 percent of the work involved across the information-development lifecycle.

Key Considerations for Globalization
To minimize the globalization gap, organizations must combine XML documentation management and centralized Translation Memory into a single system that integrates authoring with globalization and automates document composition. In this environment, authors concentrate on information development, not on formatting or presentation. Translators focus on translation, not on localization engineering or DTP. And the system includes automated processes to produce output in many forms in all languages.

DITA, the Darwin Information Typing Architecture, provides a solid foundation upon which to develop technical documentation. In DITA, information is organized into topics; the basic DITA topics are concepts, tasks, and reference material.

DITA facilitates and encourages reuse of content in many ways. Information architects combine topics into various deliverables through ditamaps; some topics may appear in multiple deliverables such as a Getting Started pamphlet, an Installation Guide, and a User Manual. Another form of reuse is single sourcing, which supports production of manuals, online help systems, HTML files, and other deliverables from the same set of XML topics. And the DITA content-reference (conref) mechanism provides for reuse of such standardized text as book titles, product names, legal notices, and other boilerplate information.
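The conref mechanism described above can be illustrated with a minimal Python sketch. It is not a DITA processor: it handles only the `file#topicid/elementid` form of a conref, copies only the text of the referenced element, and keeps the "files" in an in-memory dictionary. The file name `boilerplate.dita`, the ids, and the product name `ExampleProduct` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared topic holding boilerplate text (stands in for a file on disk).
LIBRARY = {
    "boilerplate.dita": """<topic id="boilerplate">
  <title>Shared text</title>
  <body>
    <p><ph id="product-name">ExampleProduct</ph></p>
  </body>
</topic>""",
}

# Hypothetical task topic that reuses the product name through a conref.
TOPIC = """<task id="install">
  <title>Installing <ph conref="boilerplate.dita#boilerplate/product-name"/></title>
</task>"""


def resolve_conrefs(topic_xml, library):
    """Replace each conref with the text of the element it points to (simplified)."""
    root = ET.fromstring(topic_xml)
    for element in root.iter():
        target = element.attrib.pop("conref", None)
        if target is None:
            continue
        # A conref of this form looks like "file.dita#topicid/elementid".
        filename, _, fragment = target.partition("#")
        topic_id, _, element_id = fragment.partition("/")
        source_root = ET.fromstring(library[filename])
        if source_root.get("id") != topic_id:
            raise ValueError(f"topic {topic_id!r} not found in {filename}")
        source = source_root.find(f".//*[@id='{element_id}']")
        if source is None:
            raise ValueError(f"element {element_id!r} not found in {filename}")
        element.text = source.text  # assumption: referenced element holds plain text only
    return root


resolved = resolve_conrefs(TOPIC, LIBRARY)
print("".join(resolved.find("title").itertext()))
```

Because the product name lives in one shared topic, a change there reaches every topic that references it, and the shared text needs to be translated only once per language.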

Of course, content that is reused needs to be translated only once. With DITA, organizations perform translation on a topic-by-topic basis; there is no need to wait for an entire document to be complete before scheduling translation. “Translate early and often” becomes the norm. With a central Translation Memory that supplies perfect translations of all text that has previously been translated, there is no penalty when content changes. Simply reprocess the modified document. Only the new or altered content needs action by a translator.
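To make the reprocessing step concrete, here is a minimal sketch of an exact-match Translation Memory lookup. It assumes the memory is a plain dictionary from source segments to translations and that segments are split naively at sentence-ending punctuation; real Translation Memory systems use more careful segmentation and also offer fuzzy matches. The function names and sample sentences are hypothetical.

```python
import re


def split_segments(text):
    """Split text into sentence-like segments (naive: break after '.', '!' or '?')."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def pretranslate(text, memory):
    """Return (translated, pending).

    translated: (source, target) pairs found in the memory.
    pending: source segments that still need a translator.
    """
    translated, pending = [], []
    for segment in split_segments(text):
        if segment in memory:
            translated.append((segment, memory[segment]))
        else:
            pending.append(segment)
    return translated, pending


# Hypothetical memory built from an earlier translation of the same topic.
memory = {
    "Click Save.": "Cliquez sur Enregistrer.",
    "Close the window.": "Fermez la fenêtre.",
}

# The topic was revised: one sentence was inserted in the middle.
revised_topic = "Click Save. Enter a file name. Close the window."

translated, pending = pretranslate(revised_topic, memory)
print(pending)  # ['Enter a file name.']
```

Only the inserted sentence is routed to a translator; the two unchanged sentences are filled in from the memory, which is why reprocessing a modified document carries no penalty.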

With such enormous business benefits, why aren’t more organizations already using DITA?

Establishing the environment for effective XML authoring and publishing requires more than simply obtaining a copy of the DITA schema or Document Type Definition (DTD). A content management system or file-and-directory organization scheme is essential to creating, maintaining, retrieving, and publishing information. Many XML authoring tools now support DITA, but they must be integrated with the content management system. Organizations need XSL transforms and a publishing system to assemble individual XML topic files into production-quality PDF, HTML, and other outputs. Users need query facilities and search templates to locate specific content quickly and easily.

In short, a lot of “plumbing” must be in place before authors write their first topics or convert existing unstructured material to DITA. Experience has shown that an organization can invest 10 person-years or more to develop and test this infrastructure, and many organizations find this too steep a barrier to overcome.

Fortunately, both open source and commercial solutions are emerging to lower the cost of entry into DITA publishing.
