Dynamic Content Delivery Using DITA

CIDM

June 2007


Eric Severson, Flatirons Solutions, Inc.

This article defines a new publishing paradigm, which we will call dynamic content delivery. Dynamic content delivery changes the rules, putting the reader in charge of what content is important and how it should be packaged. It transforms publishing to an audience of many into publishing to an audience of one.

Static vs. Dynamic Publishing

DITA is the hottest thing to have hit the technical publishing world in a long time. With its topic-based approach to authoring, DITA frees us from the need to think in terms of “books”, and lets us focus on the underlying information. With DITA’s modular, reusable information elements, we can not only publish across different formats and media, but also flexibly recombine information in almost any way we like.

Initial DITA implementations have focused primarily on publishing to pre-defined PDF, HTML, and Help formats—that is, on static publishing. But the real promise of DITA lies in supporting dynamic, personalized content delivery.

The classic static publishing model is a set of specific publications created and produced to meet the needs of a wide target audience. Whether we are dealing with technical documentation, policies and procedures, courseware, or marketing materials, this model involves central decisions on the appropriate content and packaging, followed by distribution en masse to the intended audience.

The limitations are obvious. In technical publishing, we often see “one size fits all” manuals that cover so many options and variations that every reader is confused. Alternatively, the same information is recombined across 20 or 30 manuals, each optimized for a particular target audience, yet gaps remain in meeting each reader’s needs. For novels and literature, that may not matter, but for technical material and professional publications, it can have a significant impact on your clients, employees, partners, and vendors.

Our proposed alternative model is dynamic content delivery. This approach involves “pulling” rather than “pushing” content, based on the needs of each individual user. In this paradigm, there is no central editorial decision that determines what content is appropriate for a particular user. Instead, the reader uses a dynamic search to choose the content that he or she considers relevant. The editorial process only determines the raw material—or pool of information—that’s appropriate for each subject area, from which the reader is allowed to choose.

In some applications, it may even be possible to automatically choose the appropriate content, based on a reader’s individual profile. For example, for a product installation manual we don’t need (or want) to ask the users what content they’re most interested in. If we know their user profile and the product they bought, we can automatically tailor the manual to fit their situation.

DITA: Dynamic Assembly of Topics

Dynamic content delivery is made possible by the technique known as topic-based authoring. A topic is a piece of content that covers a specific subject, has an identifiable purpose, and can stand on its own (i.e., does not require a specific context to make sense). Topics don’t start with “as stated above” or end with “as further described below,” and they don’t implicitly refer to other information that isn’t contained within them. In short, topics are fully reusable: they can be used in any context where the information they provide is needed.

DITA (Darwin Information Typing Architecture) is a standard that was designed specifically for topic-based authoring. DITA doesn’t focus on documents, but rather on independent, searchable topics that can be freely combined into documents, assemblies, or collections, described by a DITA map. Figure 1 illustrates this approach.

[Figure 1: Topics combined into documents through a DITA map]

Reusing Topics Even When There Are Variations

Although topic reuse is a very powerful concept, in practice it turns out that most topics contain some necessary variations that make them almost 100% reusable, but not quite.

Take a classic example from technical publishing. Let’s suppose that a certain server-based utility is used in many different situations and operating environments and is therefore described redundantly across many different technical manuals. With DITA this scenario presents a perfect opportunity to create a set of reusable topics that can be automatically incorporated into all the various places the utility must be described. For the introduction to the utility (“About the Server Utility”), this solution is straightforward—the same introductory text can be reused in all contexts. But for instructions to run the utility (“Run the Server Utility”), the situation is not so clear-cut. In this case, although most of the text is redundant, we really do need to insert different detailed instructions for each operating environment.

This kind of variation could be handled by creating parallel topics—one for each operating environment—but a better way is to create one topic with conditional or profiled text. In this case, attributes within the XML text indicate the scenario or profile for which each variation is appropriate. Both a Windows and a UNIX version of a document can be built from the same DITA map.
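The profiling idea can be sketched in a few lines of Python. This is a minimal illustration, not a DITA processor: the topic content and the values "windows" and "unix" are invented for the example, and the only thing taken from DITA itself is the `platform` profiling attribute, which holds a space-separated list of values.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic: the step text and the platform values are invented
# for illustration. Only the "platform" attribute name comes from DITA.
TOPIC = """\
<task id="run-server-utility">
  <title>Run the Server Utility</title>
  <taskbody>
    <steps>
      <step><cmd>Log in as an administrator.</cmd></step>
      <step platform="windows"><cmd>Open a command prompt and run the utility.</cmd></step>
      <step platform="unix"><cmd>Open a shell and run the utility.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def profile(element, platform):
    """Remove, in place, every descendant profiled for a different platform."""
    for child in list(element):
        values = child.get("platform")
        if values is not None and platform not in values.split():
            element.remove(child)
        else:
            profile(child, platform)
    return element


def step_texts(topic_xml, platform):
    """Return the command text of the steps that apply to one platform."""
    root = profile(ET.fromstring(topic_xml), platform)
    return [cmd.text for cmd in root.iter("cmd")]


unix_steps = step_texts(TOPIC, "unix")
windows_steps = step_texts(TOPIC, "windows")
```

Elements without a `platform` attribute are treated as common to every variation, which is why the shared first step appears in both results.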

Using DITA to Simulate Dynamic Content Delivery

In a static publishing scenario, DITA maps are created in advance for each pre-defined publication type (documents, web pages, help systems, and so forth). Each map contains references to the topics that should be included for that particular publication.

To avoid the “one size fits all” problem, DITA offers the flexibility to create a different map for each personalized scenario you wish to support. For example, rather than publish one book that covers multiple products, we could have a different DITA map for each. Each map would reference the topics appropriate for its product, and these topics could be profiled to select the material applicable to a specific platform (e.g., Windows vs. Unix).

In theory, extending this approach could get us closer to dynamic content delivery. To implement this approach, all of the combinations can be pre-generated and stored with metadata that indicates the combination each represents. For example, the manual for Product A, Unix, Novice Edition would be stored with these three parameters as metadata. At content delivery time, the system could dynamically match this metadata with the user’s profile (Product=A; Platform=Unix; Audience=Novice) and select the appropriate “personalized” edition.
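As a rough sketch of this matching step, assume the pre-generated editions are indexed by their three metadata values. The file names and the catalog contents below are hypothetical, chosen only to mirror the Product/Platform/Audience example.

```python
# Hypothetical catalog of pre-generated editions, keyed by their metadata.
# File names and parameter values are invented for illustration.
EDITIONS = {
    ("A", "Unix", "Novice"): "product-a_unix_novice.pdf",
    ("A", "Unix", "Expert"): "product-a_unix_expert.pdf",
    ("A", "Windows", "Novice"): "product-a_windows_novice.pdf",
}


def select_edition(profile):
    """Return the edition whose metadata matches the user's profile, or None."""
    key = (profile["Product"], profile["Platform"], profile["Audience"])
    return EDITIONS.get(key)


edition = select_edition({"Product": "A", "Platform": "Unix", "Audience": "Novice"})
missing = select_edition({"Product": "B", "Platform": "Unix", "Audience": "Novice"})
```

The lookup itself is trivial. The cost lies in having to generate and store an entry for every combination in advance.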

The problem with this scenario, of course, is that the number of combinations increases exponentially as the number of variable parameters increases. Three parameters with two values each yield only eight combinations, but in a real-life example there may be more like 40,000.
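The arithmetic is easy to check: the number of editions is the product of the number of values each parameter can take. The parameter names and counts below are assumptions chosen only to reproduce the two figures mentioned, eight and 40,000.

```python
from math import prod


def combination_count(values_per_parameter):
    """Number of distinct editions: the product of the per-parameter counts."""
    return prod(values_per_parameter.values())


# Three parameters with two values each (hypothetical).
small = {"product": 2, "platform": 2, "audience": 2}

# A larger, equally hypothetical product line.
large = {"product": 20, "platform": 10, "audience": 8, "language": 5, "release": 5}

small_total = combination_count(small)
large_total = combination_count(large)
```

Adding a single new parameter with five values multiplies the total, and the number of outputs to regenerate, by five.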

Storing that many variations of the manuals on disk may not be a problem, but maintaining all the DITA maps might. The real problem, though, is the complexity of the pre-generation process. To actually keep all the generated content up to date, all the affected outputs have to be re-generated every time a piece of source content changes! This approach is not scalable.

Real-World Example:
Personalized Installation Manuals

A large manufacturer of computer storage hardware offers many product variations and a multitude of options. The related technical documentation is both complex to use and cumbersome to produce. With a Dynamic Publishing solution, users can enter their particular profiles and receive dynamically assembled technical manuals optimized just for them. The manufacturer thus finds it feasible to sell products to a wider, less-sophisticated customer base, minimizes service calls, and lowers publication costs.

A Scalable Approach to Dynamic Content Delivery

Because of the scalability issues when outputs are pre-generated, we recommend generating output dynamically, on the fly. DITA encourages the development of reusable, standalone topics (modular information that can be flexibly recombined), which makes it the perfect standard for this purpose.

As shown in Figure 2, with this approach, content is pushed to the delivery platform directly in XML. There, XQuery-based searches find relevant topics, filter or “profile” topics so that only applicable content is included, and dynamically transform DITA XML into the desired output format (e.g., HTML and PDF).

[Figure 2: Dynamic generation of output from DITA XML on the delivery platform]

We chose XQuery as the appropriate search technology, not only because it’s the standard of choice for searching XML content, but also because it has the built-in ability to assemble and transform search results into alternative output formats. While doing so, it can also profile the results so that only applicable content is included.
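The search, profile, and transform sequence can be illustrated with a small Python stand-in. This is not XQuery and says nothing about how a real XQuery engine is implemented. The in-memory topic list, the title-only search, and the function names are all assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical in-memory "content base"; a real system would query an
# XML repository rather than a Python list.
TOPICS = [
    '<topic id="about"><title>About the Server Utility</title>'
    '<body><p>The utility manages server jobs.</p></body></topic>',
    '<topic id="run"><title>Run the Server Utility</title><body>'
    '<p platform="windows">Run it from a command prompt.</p>'
    '<p platform="unix">Run it from a shell.</p></body></topic>',
    '<topic id="license"><title>License Terms</title>'
    '<body><p>See your agreement.</p></body></topic>',
]


def search(topics, term):
    """Return parsed topics whose title contains the term (case-insensitive)."""
    parsed = (ET.fromstring(t) for t in topics)
    return [t for t in parsed if term.lower() in t.findtext("title", "").lower()]


def applies(element, platform):
    """True if the element is unprofiled or profiled for this platform."""
    values = element.get("platform")
    return values is None or platform in values.split()


def to_html(topic, platform):
    """Transform one topic to HTML, keeping only applicable paragraphs."""
    parts = [f"<h2>{escape(topic.findtext('title', ''))}</h2>"]
    for p in topic.iter("p"):
        if applies(p, platform):
            parts.append(f"<p>{escape(p.text or '')}</p>")
    return "\n".join(parts)


def deliver(topics, term, platform):
    """Search, profile, and transform in one pass."""
    return "\n".join(to_html(t, platform) for t in search(topics, term))


page = deliver(TOPICS, "server utility", "unix")
```

Nothing is stored per combination here: the output for a given reader exists only at the moment it is requested.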

DITA maps, which are pre-set in a static or pre-generated delivery environment, are in this approach constructed dynamically as well. Such dynamic maps can either be created by the user or generated automatically by the system. Using XSL stylesheets, which can optionally be selected by the user, results can be previewed or published in HTML and PDF.
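A dynamically constructed map can be as simple as a list of topic references generated from whatever the user or the system has selected. The sketch below assumes the selection arrives as an ordered list of file names. The file names and the `build_map` function are hypothetical, while `map`, `topicref`, and `href` are standard DITA names.

```python
import xml.etree.ElementTree as ET


def build_map(hrefs):
    """Build a minimal DITA map that references the given topics in order."""
    root = ET.Element("map")
    for href in hrefs:
        ET.SubElement(root, "topicref", href=href)
    return ET.tostring(root, encoding="unicode")


# Hypothetical selection made by a user or by the system.
selected = ["about_server_utility.dita", "run_server_utility.dita"]
map_xml = build_map(selected)

# Read the map back to confirm the order of the references.
referenced = [ref.get("href") for ref in ET.fromstring(map_xml).iter("topicref")]
```

The resulting map can then be handed to the same stylesheet-driven publishing step that a hand-authored map would use.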

Two kinds of dynamic content delivery interfaces are possible:

  • Information Shopping Cart Interface—in which the user browses or searches to choose the content (DITA Topics) that she considers relevant and then places this information in a shopping cart. When done “shopping,” she can organize her document’s table of contents, select a stylesheet, and automatically publish the result to HTML or PDF.

This approach is appropriate when users are relatively knowledgeable about the content and where the structure of their output documents can be safely left up to them. Examples include engineering research, e-learning systems, and customer self-service applications.

  • Personalization Wizard Interface—in which the user answers a number of pre-set questions in a wizard-like interface, and the appropriate content is automatically extracted to produce a final document in HTML or PDF.

This approach is appropriate for applications that need to produce a personalized but highly standard manual, such as a product installation guide or regulated policy manual. In this scenario, the document structure and stylesheet are typically preset.
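The wizard style of interface can be sketched as follows. The questions, the preset outline, and the topic titles are invented for illustration. The point is only that the answers become a profile and the preset structure is filtered against it.

```python
# Hypothetical wizard questions and their allowed answers.
QUESTIONS = {
    "product": ["A", "B"],
    "platform": ["Windows", "Unix"],
    "audience": ["Novice", "Expert"],
}

# Preset document structure: (topic title, conditions the topic requires).
# An empty condition set means the topic is always included.
OUTLINE = [
    ("Safety Notices", {}),
    ("Installing Product A", {"product": "A"}),
    ("Installing Product B", {"product": "B"}),
    ("Running the Installer on Windows", {"platform": "Windows"}),
    ("Running the Installer on Unix", {"platform": "Unix"}),
    ("Guided Walkthrough", {"audience": "Novice"}),
]


def validate(answers):
    """Check that every question has one of its allowed answers."""
    for question, allowed in QUESTIONS.items():
        if answers.get(question) not in allowed:
            raise ValueError(f"missing or invalid answer for {question!r}")
    return answers


def assemble(answers):
    """Return the titles, in preset order, that apply to these answers."""
    profile = validate(answers)
    return [
        title
        for title, needs in OUTLINE
        if all(profile[key] == value for key, value in needs.items())
    ]


manual = assemble({"product": "A", "platform": "Unix", "audience": "Novice"})
```

A shopping cart interface differs mainly in who supplies the list: the reader picks and orders the topics directly instead of having them derived from answers.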

Real-World Example:
Dynamic Policies and Procedures

In financial services and many other industries, policies change frequently and have different applications depending on geography and other variables. Static policy manuals don’t cut it—they’re forever out of date and can be hard to interpret. With a Dynamic Content Delivery solution, up-to-date policies and procedures can be served up for exactly the user’s need and environment. This increases productivity, decreases liability, and makes it much easier to move into new markets and geographies.

Our Dynamic Content Delivery Solution

Of course, our recommended approach is only feasible if the XQuery engine is extremely fast. That’s why we’ve built a Dynamic Content Delivery solution around Mark Logic, an XQuery-based content delivery platform optimized for real-time search and transformation. Specifically designed for scalability and performance, Mark Logic can provide millisecond response times against multi-terabyte DITA content bases.

As shown in Figure 3, our Dynamic Content Delivery solution includes a pre-built configuration of Mark Logic that supports user-friendly search and browsing for DITA topics and user-friendly editing of output document structure, using either the “Information Shopping Cart” or “Personalization Wizard” paradigm. This interface is designed to be readily enhanced to meet customer-specific needs.

[Figure 3: The Dynamic Content Delivery solution]

Optionally, the solution also includes our Mark Logic Connector, which supports continuous or batched update of Mark Logic, synchronizing it with new, modified and deleted source content maintained in Documentum.
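The synchronization task itself, deciding what is new, modified, or deleted, can be illustrated generically. The sketch below is not the connector described above and uses no Mark Logic or Documentum API. It assumes, hypothetically, that each side can be summarized as a dictionary mapping a document identifier to a checksum.

```python
def plan_sync(source, delivered):
    """Compare two {doc_id: checksum} snapshots and list the needed changes."""
    new = sorted(source.keys() - delivered.keys())
    deleted = sorted(delivered.keys() - source.keys())
    modified = sorted(
        doc_id
        for doc_id in source.keys() & delivered.keys()
        if source[doc_id] != delivered[doc_id]
    )
    return {"new": new, "modified": modified, "deleted": deleted}


# Hypothetical snapshots of the authoring repository and the delivery platform.
source = {"about.dita": "c1", "run.dita": "c9", "install.dita": "c4"}
delivered = {"about.dita": "c1", "run.dita": "c2", "legacy.dita": "c7"}

plan = plan_sync(source, delivered)
```

Whether such a plan is applied continuously or in batches is a scheduling choice that sits on top of the same comparison.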

A Complete DITA Publishing Solution

Dynamic content delivery does not replace the need for static publishing. In a complete solution, both have their place. Dynamic content delivery becomes another channel for the dissemination of technical information, supplementing traditional PDF, HTML, Help, and print outputs.

Figure 4 shows how static and dynamic publishing complement each other in a complete DITA publishing architecture:

[Figure 4: Static and dynamic publishing in a complete DITA publishing architecture]

In this case, the architecture uses both Flatirons Solutions’ Dynamic Content Delivery Solution built on Mark Logic, and our standard DITA Technical Publishing Solution built on Documentum.

Real-World Example:
e-Learning Systems

A corporate training organization offers many courses and educational resources, but courses still have to be customized for each client and purpose. This makes them more expensive and less accessible. With a Dynamic Content Delivery solution, clients can search for the modules and materials relevant to them, and dynamically assemble personalized courseware. This solution increases customer satisfaction, reduces customization and delivery costs, and drives more revenue through the training organization.

Conclusion: Bottom Line Benefits

With a dynamic content delivery approach, content delivery is transformed from a static, universal approach to a highly personalized and relevant interaction with “an audience of one.” This approach avoids the pitfalls of “one size fits all” delivery, and allows enterprises to meet specialized customer needs while actually reducing costs.

The resulting business benefits are many:

  • Increased product sales
    • Based on enhanced customer productivity/satisfaction
    • Based on more effective field support
  • Ability to move products into less sophisticated target markets
  • Reduced publication costs
  • Reduced content maintenance and review costs
  • Reduced cycle times
  • Reduced technical support calls

We’ve described a dynamic content delivery solution that leverages DITA standards with powerful search and retrieval capabilities. The result is a level of customized and personalized content that is not possible to achieve through traditional delivery systems.

About the Author

Eric Severson
Flatirons Solutions, Inc.
eric.severson@flatironssolutions.com

A recognized XML pioneer and content management industry expert, Eric Severson is a member of the IDEAlliance Board of Directors, a past President of OASIS, a principal designer of the IBM XML Certification Program, and a frequent speaker on DITA-related topics. With over 20 years of industry experience, Eric has held senior management positions in both engineering and marketing roles, has worked in Big 5 and IBM consulting practices, and is the founder of a successful XML software company. As a founder and CTO of Flatirons Solutions, Eric leads an experienced consulting and systems integration practice specializing in XML-based publishing and content management solutions.