How to Publish Dynamic Documents More Easily

CIDM

October 2004



PG Bartlett, Vice President of Marketing, Arbortext

To respond to their customers’ demands for fresher, more relevant information, organizations are building content management and publishing systems that produce customized documents on demand. Customers get all the information they need and only what they need, and it’s fresh and accurate at the moment the need is fulfilled.

This approach to delivering information, which we call “Dynamic Publishing,” turns the traditional approach to publishing on its head:

  • Instead of creating information with the formatting embedded, you create information in a form that is independent of any specific medium such as Web or print.
  • Instead of creating documents, you create information modules, which we refer to as “components.”
  • Instead of creating a single document to serve multiple audiences, you set up a system to deliver highly tailored information to each member of the audience.
  • Instead of retyping data into a document, you embed a pointer to that data so that the publishing system can update the data automatically at the time of publishing.
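The last point, embedding a pointer to data rather than retyping it, can be illustrated with a minimal Python sketch. The `<dataref>` element, its `key` attribute, and the `DATA_SOURCE` dictionary are all assumptions invented for this illustration; they are not part of any Arbortext product or standard vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical component: <dataref> is an invented element that points at a
# value held elsewhere instead of repeating the value in the text.
COMPONENT = (
    '<para>The battery is rated at <dataref key="battery.voltage"/> volts '
    'and weighs <dataref key="battery.weight"/> kg.</para>'
)

# Stand-in for a database or business system queried at publish time.
DATA_SOURCE = {"battery.voltage": "12", "battery.weight": "1.4"}


def resolve_data_pointers(xml_text, data):
    """Return the paragraph text with each <dataref> replaced by its current value.

    Child elements other than <dataref> are ignored in this sketch; only the
    text that follows them is kept.
    """
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "dataref":
            parts.append(data[child.get("key")])
        parts.append(child.tail or "")
    return "".join(parts)


published = resolve_data_pointers(COMPONENT, DATA_SOURCE)
print(published)
```

Because the value is looked up each time the component is published, changing the entry in the data source changes every document that points at it, with no edits to the component itself.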

The most advanced dynamic publishing systems can pull information from content repositories, relational databases, and other business systems, assemble that information into documents, and automatically publish those documents on a variety of media: Web, print, HTML Help, CD-ROM, and even wireless.

This article explores the challenges of building a dynamic publishing system and introduces the Dynamic Content Assembly Manager (DCAM), a new product from Arbortext that addresses a critical need in such a system.

The Easy Part

When you implement a dynamic publishing system, the decision to use XML is the easiest one to make because XML is the only way to meet several critical design criteria of such a system:

  • Format-free content: Because XML represents only content and not formatting, you can publish the same document automatically to different types of media, such as Web and print, without any manual intervention. This eliminates the massive human effort that normally goes into decorating documents, including designing, formatting, and laying out pages. In addition, you can easily incorporate the same XML component in different documents, even if the formatting of that information changes for each of those documents, thanks to automated formatting and publishing. This saves time and effort in two ways: first, by allowing multiple documents to use the same content without manually reformatting that content to fit each document, and second, by allowing writers to update all documents that incorporate a component by changing only the component.
  • Modular information: Because XML information must conform to a structure that you prescribe, often called a “data model” and described by a DTD or schema, you can take a “building block” approach to your information. In such an approach, you define which information you will create in separate components, which components may contain other components, and so on, to ensure that all the pieces fit together seamlessly without manual adjustments. Without XML’s enforcement of absolute consistency in your information, there would be no way to automate the assembly and publishing functions.
  • Dynamic assembly: Thanks to the features of XML listed above, publishing systems can assemble XML components into complete documents automatically with the assurance that the pieces fit and that they can be formatted appropriately. In some cases, this may be the only way that organizations can economically deliver customized information to their customers; in cases where organizations are required to customize their information deliveries, this can vastly reduce the effort to create and update those customized documents.
  • Represents all types of data: With the wide adoption of XML in almost all types of software, publishing systems can automatically extract information directly from its source instead of requiring manual intervention through retyping or copying and pasting. This allows you to take a “single source” approach to information, which reduces the cost of updating information and assures the accuracy of that information.
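The following Python sketch shows, under simplifying assumptions, how format-free components can be checked against a data model, assembled, and then published to two media from the same source. The element names (`topic`, `title`, `para`), the `COMPONENTS` dictionary, and the `ALLOWED_CHILDREN` set are hypothetical stand-ins; a real system would validate against a DTD or schema rather than a hand-written set of tag names.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical component repository: each entry is a format-free XML component.
COMPONENTS = {
    "intro": "<topic id='intro'><title>Overview</title>"
             "<para>Read this first.</para></topic>",
    "battery": "<topic id='battery'><title>Battery Service</title>"
               "<para>Disconnect power.</para></topic>",
}

# Toy stand-in for a DTD or schema: the only children a <topic> may contain.
ALLOWED_CHILDREN = {"title", "para"}


def assemble(component_ids):
    """Build one <document> from the named components, in the order given."""
    doc = ET.Element("document")
    for cid in component_ids:
        topic = ET.fromstring(COMPONENTS[cid])
        bad = [c.tag for c in topic if c.tag not in ALLOWED_CHILDREN]
        if topic.tag != "topic" or bad:
            raise ValueError(f"component {cid!r} does not fit the data model")
        doc.append(topic)
    return doc


def to_html(doc):
    """Render the assembled document for the Web."""
    out = []
    for topic in doc:
        out.append(f"<h2>{escape(topic.findtext('title', ''))}</h2>")
        out.extend(f"<p>{escape(p.text or '')}</p>" for p in topic.findall("para"))
    return "\n".join(out)


def to_text(doc):
    """Render the same document as plain text, for example for print proofing."""
    out = []
    for topic in doc:
        out.append(topic.findtext("title", "").upper())
        out.extend(p.text or "" for p in topic.findall("para"))
    return "\n".join(out)


document = assemble(["intro", "battery"])
print(to_html(document))
print(to_text(document))
```

The formatting decisions live entirely in `to_html` and `to_text`, so the components themselves never change when a new output medium is added.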

The Hard Part

Despite all these virtues of XML, there remains a critical hole in the solution, and that hole becomes obvious only when you start implementation. Challenges arise because components are not just chunks of content; instead, they often contain hyperlinks, cross-references, and similar objects that involve relationships with other components.

Maintaining the validity of such relationships is challenging enough when you’re creating static, standalone documents. But if you’re creating documents dynamically to deliver highly customized information to your customers, where you can produce almost limitless combinations of content components, and where even the links themselves can change dynamically, then it’s difficult to guarantee that all links work properly. The challenges can be complex:

  • What should happen if a link target doesn’t exist in the assembled document? For example, topic A contains a link to topic B but topic B is missing.

  • Should an author be prevented from linking to a component that doesn’t exist? What if the author wants to link to a component that will exist in the future?
  • Should an author be allowed to delete a component that is the target of a link (which would invalidate that link)?
  • What should happen if a component that is the target of a link appears more than once in the resulting document? For example, if topic B appears twice in a document and if topic A links to topic B, which instance of topic B should be the target of topic A?

  • Within a repurposable component, how can an author create a link that varies depending on the parent in which it appears? For example, how can an author set up a link in topic A that points either to topic B or topic C depending on the parent document of topic A? An example of this would be a battery service procedure that contains the same content except for the battery number, the picture of the battery, the table with the battery’s specifications, and the included battery testing procedure. Service manuals for different products could all incorporate the same battery service procedure, with some mechanism to choose the right content within the battery service procedure depending on the product.

  • What should happen when a repurposable component appears in a parent for which it was never designed? For example, what if a battery replacement procedure appears in service guides for a product with two models, one with a battery and one without?
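Two of the questions above, a link whose target is missing and a link whose target appears more than once, can be detected mechanically once a document has been assembled. The Python sketch below is one way to do that under stated assumptions: the `link` element, its `target` attribute, and the use of a plain `id` attribute on `topic` are invented for illustration. The sketch only reports the problems; deciding what should happen next remains a policy choice.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical assembled document: topic B was included twice and topic D
# was left out, although topic A links to both.
ASSEMBLED = """
<document>
  <topic id="A"><para>See <link target="B"/> and <link target="D"/>.</para></topic>
  <topic id="B"><para>First copy of B.</para></topic>
  <topic id="C"><para>No links here.</para></topic>
  <topic id="B"><para>Second copy of B.</para></topic>
</document>
"""


def check_links(xml_text):
    """Return (source topic, target, problem) for every link that cannot be
    resolved to exactly one topic in the assembled document."""
    root = ET.fromstring(xml_text)
    id_counts = Counter(t.get("id") for t in root.iter("topic"))
    problems = []
    for topic in root.iter("topic"):
        for link in topic.iter("link"):
            target = link.get("target")
            count = id_counts.get(target, 0)
            if count == 0:
                problems.append((topic.get("id"), target, "missing"))
            elif count > 1:
                problems.append((topic.get("id"), target, "ambiguous"))
    return problems


for source, target, problem in check_links(ASSEMBLED):
    print(f"link from {source} to {target}: {problem} target")
```

Running the check at assembly time, rather than when the component is authored, matters because the same component can be valid in one assembled document and broken in another.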

To date, addressing these challenges has required considerable process design and custom software development. In early 2004, Arbortext delivered a new product, DCAM, to provide the infrastructure and tools to help organizations address these issues.

Arbortext’s DCAM

As part of the release of Arbortext 5 in March 2004, Arbortext is offering a new product, Dynamic Content Assembly Manager, that’s designed to help users create and manage the relationships among document components as described in the previous section. DCAM is an add-on option to Arbortext’s E3 publishing server.

DCAM offers functionality that is critical to reusing, repurposing, sharing, and improving information that exists in reusable components. This functionality is especially important for large documents, large collections of documents, and documents delivered on multiple types of media.

DCAM is a client-server application that supports authoring, managing, publishing, and distributing content:

  • Authoring: DCAM provides a user interface that allows authors to create, validate, and manage cross-references, hyperlinks, media links, and navigational links. As an author works, DCAM ensures the validity of component relationships, including new and existing links.
  • Management: DCAM keeps track of relationships among components in multiple documents, in multiple locations within a single document, in dynamically assembled documents, and in subsets of master documents.
  • Publishing: During the publishing process, E3 calls upon DCAM to validate and resolve links to their final target location and media type. DCAM not only ensures that the resulting documents contain valid links, but also adds important functionality. For instance, during the publishing process, DCAM resolves “variable links” that change based on how a component is used, in support of repurposable content such as the battery service procedure described earlier.
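The idea of a variable link can be sketched in a few lines of Python. This is not DCAM's interface or data format, which the article does not describe; the `VARIABLE_LINKS` table, the document and topic names, and the `resolve_variable_link` function are all hypothetical and show only the general technique of choosing a link target based on the parent document.

```python
# Hypothetical link table: each variable link lists one target per parent
# document in which the reusable component may appear.
VARIABLE_LINKS = {
    "battery-spec": {
        "model-100-manual": "battery-spec-b12",
        "model-200-manual": "battery-spec-b24",
    }
}


def resolve_variable_link(link_name, parent_doc, available_targets):
    """Return the target for link_name within parent_doc, or None when the
    parent has no entry or the chosen target is not in the assembled document."""
    target = VARIABLE_LINKS.get(link_name, {}).get(parent_doc)
    if target is None or target not in available_targets:
        return None
    return target


# The same battery service procedure, published in two different manuals.
in_manual_100 = resolve_variable_link(
    "battery-spec", "model-100-manual", {"battery-spec-b12", "safety"}
)
in_manual_200 = resolve_variable_link(
    "battery-spec", "model-200-manual", {"battery-spec-b24", "safety"}
)
print(in_manual_100, in_manual_200)
```

Returning `None` instead of guessing leaves the publishing step free to decide how to handle a component that appears in a parent for which it was never designed.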

To learn more about DCAM and its support of dynamic publishing, please go to the Arbortext Web site: www.arbortext.com.
