JoAnn Hackos, PhD
CIDM Director

Many information-development organizations have begun to see the value of designing their information assets as individual topics that can be reused among any number of relevant final deliverables. Topics provide greater flexibility than long chapters. Topics are easier to design for reuse when writers structure content consistently and use standard writing styles. Topics can be used in multiple deliverables with vastly different designs, from printed manuals to their PDF versions, from web sites to help systems, from ordinary computer displays to cell phones and other miniature screens.

So—if topics are so versatile in designing and developing technical content, why are so many organizations convinced that they can easily implement a topic-based information architecture by extracting topics from conventionally designed and developed books? Why are content management vendors and consultants willing to support bursting even though the final results have limited value?

I define bursting as an automated process for creating “chunks” of text from documents that were originally designed to be continuous. In most cases, bursting works by reading heading levels. For every heading level selected by the bursting routine, a “chunk” of text is created as an independent file. One large file, usually the chapters of a book, becomes many smaller files, each with its own heading. Instant topics—or so we are led to believe.
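To make the mechanics concrete, here is a minimal sketch of a bursting routine in Python. It is an illustration only, not any vendor's implementation: it assumes the source is plain text with Markdown-style "#" headings, and the function name burst and the parameter max_level are hypothetical. Every heading at or above the selected level starts a new chunk.

```python
import re

# Assumption: headings are Markdown-style lines such as "# Title" or "## Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def burst(text, max_level=2):
    """Split text into (heading, body) chunks at every heading of level <= max_level."""
    chunks = []
    title, body = None, []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match and len(match.group(1)) <= max_level:
            # A selected heading closes the previous chunk and opens a new one.
            if title is not None or any(l.strip() for l in body):
                chunks.append((title, "\n".join(body).strip()))
            title, body = match.group(2), []
        else:
            # Everything else, including deeper headings, stays in the current chunk.
            body.append(line)
    if title is not None or any(l.strip() for l in body):
        chunks.append((title, "\n".join(body).strip()))
    return chunks


# Hypothetical sample chapter, written the way chapters usually are: as a narrative.
chapter = """# Installing the widget
This chapter explains installation. Later sections cover configuration.

## Before you begin
As noted above, you need administrator rights.

## Configuring
Furthermore, set the port before you continue.
"""

for heading, body in burst(chapter):
    print(heading, "->", body)
```

The routine does exactly what it promises and nothing more: it cuts at headings. The sample chunks still say "as noted above" and "furthermore," so the split changes the file boundaries without changing how the text was written.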

The problem with bursting is that you end up exactly where you began. The “chunks” of text represent parts of the original document but rarely do they represent standalone topics of information. When we write chapters, we generally try to thread the text together through a narrative. We refer in the introduction to the coming sections. In the sections, we refer to information that has occurred earlier and will come later. In fact, we write exactly the way we would if we were writing any other book. The pieces are designed to be part of a whole.

That integration of parts makes separating them into what should be standalone topics frustrating. In most cases, the structures are not internally consistent. It may be impossible to assign an information type, such as task, concept, or reference, to the chunks of text. We are more likely to find mixed types in which conceptual information contains tasks and tasks contain references.

In my experience with probably thousands of different manuals written by vastly different organizations, I find that unless information is specifically designed to stand alone, it doesn’t. It is specifically created as a whole, not the sum of parts.

What, then, is the result of bursting? Usually, it is a set of files that are difficult to recombine into new deliverables because they depend so much on the information around them. I’ve mentioned before my experience discovering that someone creating a product demonstration had tried to reuse, in a new context, a paragraph beginning with the word “furthermore.” Not only would I find it impossible to incorporate such a paragraph written by someone else into my own writing, I probably would not be able to incorporate a “furthermore” paragraph that I had written myself for another context.
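The “furthermore” problem can be detected mechanically, at least in its crudest form. The following Python sketch is a rough heuristic under stated assumptions, not a method described in this article: the list of openers is illustrative and incomplete, and the function name depends_on_context is hypothetical. It flags a chunk whose first words point back at text that is no longer there.

```python
import re

# Assumption: an illustrative, incomplete list of openers that lean on preceding text.
DEPENDENT_OPENERS = (
    "furthermore",
    "moreover",
    "in addition",
    "as noted above",
    "as described earlier",
    "however",
    "therefore",
)


def depends_on_context(chunk):
    """Return True if the chunk opens with a phrase that refers back to earlier text."""
    opening = chunk.strip().lower()
    return any(
        re.match(rf"{re.escape(opener)}\b", opening) for opener in DEPENDENT_OPENERS
    )


print(depends_on_context("Furthermore, the demonstration shows the export feature."))
print(depends_on_context("To export a report, select Export from the File menu."))
```

A check like this can only point out chunks that obviously fail to stand alone. It cannot make them standalone, and it misses the backward and forward references buried in the middle of a paragraph.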

When I discuss topic management with groups that have indulged in bursting, they usually admit that they really are getting minimal reuse. Most of the reuse they achieve is traditional. They reuse boilerplate text for copyrights, warnings, and cautions more frequently than they are able to reuse anything else. Writers generally point out, usually correctly, that the borrowed chunks of text don’t quite fit in the context they are creating.

My conclusion is that bursting doesn’t work, although it gives the appearance of accomplishing a huge task in a short time. The trouble is that you end up exactly where you started, with content that cannot be used in more than one context because it was never structured consistently or in keeping with the structures defined by the requirements of unique information types.

My recommendation, however, is not going to be easy, although it is inevitable if you really want to reap the benefits of a topic-based information architecture. After you have determined what the information architecture should be, you must select some critical, move-forward information that you rewrite according to your new architecture. During the rewriting, you build standalone topics of information with the planned contexts squarely in mind.

If you don’t have enough time in your first project to transform everything, continue to use existing information without ripping it apart. You can create a system that allows you to combine the old with the new. It might not be perfect, but it can certainly work. For an example, please refer to Robert Anderson and Erik Hennum’s article, “Implement a DITA publishing solution without abandoning your current publishing system investments,” in IBM’s developerWorks. They explain how IBM information developers gradually moved to a topic-based architecture using DITA while continuing to use existing information and an existing publishing process.

Sorry to say, but bursting is not a viable solution if your goal is to transform the way you create and deliver information to your customers. Remember the goal of most projects devoted to incorporating XML- and topic-based authoring along with content management: to build efficient processes that deliver more flexible, higher-quality information to customers. If we are not building customer value, we really don’t have a viable return on a very large investment.