Managing the Impact of Change on Your Content

CIDM

February 2003



Mark Baker, Senior Consultant, Content, OmniMark Technologies Corporation

Most content-management systems work very well when they are first installed. Everything seems so much easier to find and to manage. Over time, however, the performance of many content-management solutions degrades and content becomes harder to manage and to keep current. Why does this happen?

The main reason that things go so well in the early days of a new content-management installation actually has less to do with the system than with the process that surrounds its installation. Quite simply, the occasion of moving to content management is usually the cause for a huge cleanup and reorganization of your content. At the moment your content-management system goes live, your content has never been cleaner, better organized, or more up-to-date. No wonder everything seems to go so well at first.

Unfortunately, your content may never be as clean and well organized again. Your content is subject to constant change, driven both by changes in your subject matter, such as new features or new versions of your product, and by changes in requirements, such as a move to new media or a requirement to provide individualized documentation. Unless your content-management system is designed and configured to enable you to manage the impact of these changes on your content, your content will become steadily less clean and less well organized as time goes by. And as a consequence, your content will become harder to manage.

Content-management systems tend to move you away from independent authors working on separate projects and toward multiple authors contributing to a common set of content. In a collaborative environment, authors are expected to link to, and reuse, content created by others. But as the content set grows, knowing what content already exists becomes more difficult. As it becomes harder to find content, authors create more duplicate content and more fragile links, which adds to the disorganization of the content. As your content becomes more disorganized, finding the content that is affected by a particular change and ensuring that all affected content is updated can become more and more difficult.

Once disorganization begins, a vicious cycle starts and disorganization becomes steadily worse. The cost of authoring starts to go up, while the efficiency and reliability of the system start to go down. The slowdowns can be felt from one end of the system to the other. Authors work more slowly because of the complexity of the environment in which they work and the unpredictable effects of the changes they make.

Failure to manage the impact of change on content is often invisible in the first weeks or months that a system is in operation. The impact of this failure is sometimes encountered when the first big product change occurs, or it appears over time as the volume of content accumulates and its complexity grows. By the time it becomes clear that the problem is becoming unmanageable, the cost of repairing the situation can be overwhelming because of the volume of content and the number of systems and processes impacted. If possible, therefore, it pays to consider the management of the impact of change right from the beginning of your content-management project. If you have already invested in content management and are beginning to encounter problems, now is the time to better manage the impact of change on your content.

Managing the Impact of Change

Managing the impact of change on your content means changing the way you organize your content so that the impact of change can be minimized. You can use a number of techniques to make content more resilient to the impact of change, but they all can be summed up in one idea: you must manage your content on the “content axis.”

In a content-management system, you can organize content in two fundamental ways: on the content axis or on the document axis. On the document axis, content is organized as documents or fragments of documents. Most content today is managed on the document axis. Documents are collections of content organized for the purpose of delivering information to a particular audience. Managing content on the document axis is a natural way for authors to develop content. Unfortunately, the impact of change is strongly felt across the document axis. A single change in subject matter or requirements can impact many different documents. For instance, a change in a procedure can affect reference guides, user guides, and reference cards; a new feature can affect six different product lines; a change in the name of a single button can affect hundreds of procedures. Thus, content on the document axis is vulnerable to the impact of change.

The alternative is to manage your content on the content axis. On the content axis, content is organized according to its properties as content, independent of how it will be organized and presented to the reader. Usually, though not always, organizing on the content axis means that content is organized around the real world objects the content describes. The advantage of organizing content on the content axis is that the impact of change is usually felt along the content axis only, which greatly reduces the amount of churn caused by change and makes implementing changes easier.

Managing the Impact of Subject-Matter Changes

Here is an example of how a piece of content might be managed on the document axis and on the content axis. The content in this example is the description of a function, “omtcp-accept-connection,” in a programming language. [1]

Figure 1 shows what a topic on the “omtcp-accept-connection” function from the “OMTCP” library might look like using typical topic-based authoring techniques (with a lot of detail removed). Figure 2 shows how this topic might be represented as an XML document. This topic contains all the information about the “omtcp-accept-connection” function that is important to the reader. However, a lot of the information contained here is, in fact, information about the “OMTCP” library, not information about the “omtcp-accept-connection” function. This information occurs identically in the description of every function that is part of the “OMTCP” library, including the information under the following headings: “Library,” “Include file,” and “Usage note.” Similarly, the information under the heading “Available in” is really information about the product lines [2] that include the “OMTCP” library.

Figure 1: A Topic Using Topic-Based Authoring Techniques

Figure 2: A Topic Represented As an XML Document

Topic-based authoring involves creating individual topics that contain everything readers need to know about a single subject. But to give the readers the information they need about a single subject usually requires that you include information on one or more related subjects as well. This related information can occur in many different topics. Topic-based authoring, therefore, does not eliminate redundant information, and a change in subject matter or requirements can create the need to update many different topics. As long as your content is maintained on the document axis, this redundancy is inevitable.

Maintaining information about the library content in every function topic creates a significant problem for the author. Suppose that product management decides that the OMTCP library should only be in the Enterprise product, or research and development decides to change the name of the library to OMNET and the include file to “omnet.xin.” These changes would hit every function description for the entire function library. Every instance would have to be found and changed. The change in information is small, but the impact of that change on the documentation set is significant.

To avoid this problem, we can move the library and function information to the content axis. On the content axis, the library information and the function information would be kept in separate information objects [3] and would be merged only when it is time to create a document. All the information we have about the library would be kept in a “library” information object. All the information we have about each function would be kept in a “function” information object. All the information we have about product lines would be kept in a “product line” information object. When a change occurs to a library, only the library object will need to be updated. When a change occurs to a product line, only the product line object will need to be updated.
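The separation described above can be sketched in a few lines of Python. This is only an illustration, not the structure shown in the article's figures: the field names, the "lib-tcp" key, the original include file name "omtcp.xin," and the second function name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Library:
    name: str
    include_file: str


@dataclass
class Function:
    name: str
    library_id: str  # stable key (hypothetical), so a rename does not break the link
    summary: str


# One library information object, stored under an id that readers never see.
libraries = {"lib-tcp": Library(name="OMTCP", include_file="omtcp.xin")}

# Function information objects hold no library facts, only the link.
functions = [
    Function("omtcp-accept-connection", "lib-tcp", "(placeholder summary)"),
    Function("omtcp-example-function", "lib-tcp", "(placeholder summary)"),
]

# Research and development renames the library: exactly one object is edited.
libraries["lib-tcp"].name = "OMNET"
libraries["lib-tcp"].include_file = "omnet.xin"

# Every function picks up the new values through its link.
for fn in functions:
    lib = libraries[fn.library_id]
    print(fn.name, "->", lib.name, lib.include_file)
```

Linking by a stable key rather than by the library's name is a design choice of this sketch. It keeps the rename itself from touching the function objects.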

Figure 3 shows what the function information object might look like on the content axis, expressed as an XML document. Figure 4 shows what a library information object might look like (both greatly abbreviated). Notice that the library information object includes information about the library as a whole, which does not occur in the “omtcp-accept-connection” function topic. The library information object contains all the information we maintain about the library. That information may be used to build other topic types, such as a library topic. Also, notice that the library information object does not contain a list and description of all the functions that the library contains. This list would certainly be a useful part of a library topic, but the information already exists in the function information objects, and we can draw that information from those objects when we create a library topic.

Figure 3: The Function Information Object Expressed As an XML Document

Figure 4: The Library Information Object Expressed As an XML Document
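The point that a library topic can draw its function list from the function information objects, rather than storing it in the library object, might be sketched as follows. The dictionary layout, the ids, and every function name other than "omtcp-accept-connection" are hypothetical.

```python
# Function information objects; each one records only which library it belongs to.
functions = [
    {"name": "omtcp-accept-connection", "library_id": "lib-tcp"},
    {"name": "omtcp-example-function", "library_id": "lib-tcp"},
    {"name": "other-example-function", "library_id": "lib-other"},
]

# The library information object stores no function list at all.
library = {"id": "lib-tcp", "name": "OMTCP"}


def function_list(library_id):
    """Derive the function list for a library from the function objects."""
    return sorted(f["name"] for f in functions if f["library_id"] == library_id)


def build_library_topic(lib):
    """Assemble a library topic; the function list is computed, never stored."""
    return {"title": lib["name"], "functions": function_list(lib["id"])}


topic = build_library_topic(library)
print(topic)
```

Because the list is computed when the topic is built, adding or removing a function object updates the library topic without any edit to the library object.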

With the content organized on the content axis, any change in the library information requires only an update to the library information object. You will not need to update every function topic because the library information is no longer maintained in the function topics. Because the OMTCP library contains 20 functions, organizing content on the content axis reduces the number of topics impacted by a change in the library information from 21 to 1, a better than 95 percent reduction in effort. Also, the system will generate 1 workflow instead of 21 and require 1 approval rather than 21.
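The arithmetic behind the 21-to-1 figure can be checked with a short calculation. The count of 20 functions comes from the text above; the variable names are only illustrative.

```python
FUNCTION_COUNT = 20  # the OMTCP library contains 20 functions

# Document axis: the library facts are repeated in every function topic
# and in the library topic itself.
document_axis_edits = FUNCTION_COUNT + 1

# Content axis: the facts live in the library information object only.
content_axis_edits = 1

reduction = 1 - content_axis_edits / document_axis_edits
print(f"{document_axis_edits} -> {content_axis_edits}: {reduction:.1%} fewer items to edit")
```

The same ratio applies to the workflows and approvals mentioned above, since each edited topic generates one of each.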

Building Topics from Information Objects

With information managed on the content axis, managing and updating content becomes much easier. If a change occurs to a library, all the information about that library is in one place and all the necessary changes can be made in the single library information object. But to deliver information to the reader in useful form, we need to move the content back to the document axis and create the function topic in its original form by pulling together information from several different sources.

First, we need the information from the function object. Then, we need selected pieces of information from the associated library object. The link between the library object and the function object is maintained at the database or metadata level, so a database or metadata query is needed to find the right library object.

Next, we need information on which product lines include this library. Another query pulls this information from a database or metadata repository.

Fourth, we need the text of the usage note, which is not part of any of the information objects. The text of the usage note is common to all function topics for all libraries. It is part of how we present information on a function. As such, it forms part of the template that defines how a function topic is organized and presented. Only the name of each library file changes from one library to another, and we can get the name from the library information object.

Finally, we need a definition of which pieces of information from the function, library, and product line information objects are required and in what order they are to be presented. The function topic template will help with this task. The template defines what information is required from the various objects and where it is to be placed. Figure 5 shows what a template might look like. [4]

Figure 5: The Function Topic Template

The process of pulling all these pieces of information together is called synthesis. Synthesis is performed by a synthesis script. The synthesis script for a document or document set pulls together all the necessary templates and uses the information in the template to select the appropriate elements from the information objects. It makes the necessary connections, translates the XML encoding, and writes out an XML version of the topic, like the one shown in Figure 2. [5]
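A minimal sketch of the synthesis step, written in Python rather than in the template language of the working system, might look like this. The data layout, the ids, the "Purpose" heading, the "omtcp.xin" file name, and the wording of the usage note are assumptions. The other headings and the product line availability come from the article and its footnotes. The sketch produces a list of headed sections instead of XML.

```python
# Information objects on the content axis (illustrative layout).
function_obj = {
    "name": "omtcp-accept-connection",
    "library_id": "lib-tcp",
    "purpose": "(function description goes here)",
}
libraries = {"lib-tcp": {"name": "OMTCP", "include_file": "omtcp.xin"}}
product_lines = {
    "Standard": set(),
    "Professional": {"lib-tcp"},
    "Enterprise": {"lib-tcp"},
}


def available_in(library_id):
    """Stand-in for the metadata query that finds product lines with a library."""
    return sorted(name for name, libs in product_lines.items() if library_id in libs)


# The template: an ordered list of (heading, how to obtain the text).
FUNCTION_TOPIC_TEMPLATE = [
    ("Function", lambda fn, lib: fn["name"]),
    ("Library", lambda fn, lib: lib["name"]),
    ("Include file", lambda fn, lib: lib["include_file"]),
    ("Available in", lambda fn, lib: ", ".join(available_in(fn["library_id"]))),
    ("Purpose", lambda fn, lib: fn["purpose"]),
    # The usage note wording lives in the template; only the file name varies.
    ("Usage note", lambda fn, lib: f"Include {lib['include_file']} to use this function."),
]


def synthesize(fn, template):
    """Follow the function's link to its library, then fill in the template."""
    lib = libraries[fn["library_id"]]
    return [(heading, render(fn, lib)) for heading, render in template]


topic = synthesize(function_obj, FUNCTION_TOPIC_TEMPLATE)
for heading, text in topic:
    print(f"{heading}: {text}")
```

The function object, the library object, the product line query, and the template each contribute one kind of information, which mirrors the steps listed earlier in this section.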

Once we have a function topic as the output of the synthesis process, we can send it on to a formatting process that will format it appropriately for print, Web, help, or other media, resulting in something that looks like Figure 1.

Managing the Impact of Requirements Changes

In the previous example, the function and library topics include a list of the product lines that the library is available in. This list implies that only one set of documents is being delivered for all the different product lines, with the availability of different libraries being noted in each library and function topic.

Suppose that product management decided that it wanted to ship a different documentation set with each product line. Each documentation set would include only those libraries that are available in that product line. We would need to build three different documentation sets, and we wouldn’t want any of the function or library topics in any of these documentation sets to include the product line information. If our content is managed on the document axis, the authors have a lot of work to do. They have to go through every function and library topic, which adds up to several hundred topics, and remove the product line information. Then, they have to build three separate documentation sets by selecting the appropriate topics to include.

However, if the content were managed on the content axis, this change in requirements would not require any editing of information objects. The only changes required would be to the synthesis templates and scripts. The list of product lines would be removed from the synthesis template so that information would no longer be included when the topics were built. The scripts would be adapted to build three different documentation sets. No human input would be required to select the appropriate components because the information relating to which libraries are in which products is already part of the content set and the scripts can use this information to build the appropriate documentation for each product line. (And if product management changes its mind next time and wants it back to the old way, all we need to do is switch back to the old templates and scripts.)
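The requirements change described above can be sketched as a change to the template and script only. In this illustration the information objects are never edited. The second library, its function, and the ids are hypothetical; the OMTCP availability follows the article's footnote on product lines.

```python
# Which libraries ship in which product line ("lib-example" is hypothetical).
product_lines = {
    "Standard": {"lib-example"},
    "Professional": {"lib-example", "lib-tcp"},
    "Enterprise": {"lib-example", "lib-tcp"},
}

# Function information objects, unchanged by the new requirement.
functions = [
    {"name": "omtcp-accept-connection", "library_id": "lib-tcp"},
    {"name": "example-function", "library_id": "lib-example"},
]

# Template change: drop the product line list from every topic.
OLD_TEMPLATE = ["Library", "Include file", "Available in", "Usage note"]
NEW_TEMPLATE = [heading for heading in OLD_TEMPLATE if heading != "Available in"]


def build_doc_set(product_line):
    """Script change: keep only topics whose library ships in this product line."""
    shipped = product_lines[product_line]
    return [
        {"topic": fn["name"], "sections": NEW_TEMPLATE}
        for fn in functions
        if fn["library_id"] in shipped
    ]


doc_sets = {line: build_doc_set(line) for line in product_lines}
for line, topics in doc_sets.items():
    print(line, [t["topic"] for t in topics])
```

Switching back to the old behavior would mean using OLD_TEMPLATE and a single build again, with the content itself still untouched.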

The Granularity Issue

In implementing single-sourcing systems, people often worry a great deal about the issue of granularity. It is important to note that moving content to the content axis does not involve creating an extraordinary level of granularity. In fact, on the content axis, bringing together all the information on a particular real world object is important to ensure that when a change occurs, updating information on that object is as easy as possible. However, when we use an information object to synthesize a topic, we may use only selected pieces of that object. Thus, we do not have to manage a high degree of granularity at the information object level, but we do require a well-defined structure within objects to allow selective use of the information they contain. You may be struggling with a granularity problem as a result of trying to manage very granular pieces of documents on the document axis. Moving your content to the content axis could well help solve your granularity problems.

Content-Management Systems and Managing the Impact of Change

How well does your content-management system help you manage the impact of change? In approaching this question, it is important to realize that the change-management problem has two distinct aspects: managing the process of change and managing the impact of change. Managing the process of change means managing the entire process by which a piece of content gets changed: assigning the change to an author; reviewing and editing the changed material; and approving and publishing the changed content. Managing the impact of change means organizing and managing your content in such a way that when changes happen in the world, the impact on your content is limited, predictable, controlled, and easy to implement.

Managing the process of change is well understood today. Most content-management systems come with excellent facilities for managing the process of change. These facilities include access control, workflow management, and version control systems. On the other hand, content-management systems do little or nothing out-of-the-box to help you manage the impact of change. For that, you have some more work to do.

Note that the sophisticated management of the process of change is often required precisely because of the difficulty of managing the impact of change when content is managed on the document axis. Moving your content to the content axis will reduce the number and complexity of the changes you need to make to your information set, making it much easier to manage the process of change. Investing in the management of the impact of change may well reduce the investment you need to make in managing the process of change.

Conclusion

The ability to manage the process of change is an important benefit of using a content-management system. But unless you design your content and your process to also manage the impact of change on your content, you will discover that you have done little more than orchestrate drudgery. To really gain control of your content in an ever-changing world, you must begin to manage the impact of change on your content.

Designing your content-management system to manage the impact of change has several benefits:

  • It can greatly reduce the amount of work required to respond to changes in the real world that impact your content. A reduction of effort of 95 percent or better is achievable for certain types of content.
  • It can help you to improve consistency and quality. Conventional reuse strategies on the document axis do not come close to uniting all references to a particular object in a single place. Tracing the impact of a change in the object is still a significant task, subject to error and omission. Moving your content to the content axis greatly improves the consistency with which changes are implemented and also greatly increases the potential for effective reuse of content.
  • It allows you to improve the quality of your information by efficiently implementing changes that were not even possible before. Authors can more easily and effectively respond to changes in the world and to changes in requirements for your information. Authors can more easily pursue new markets and enhance customer value for your information and your products.
  • It greatly expands the range of deliverables you are able to produce and especially your ability to respond to unexpected new requirements. Because your content is managed on the content axis, not the document axis, it does not have a built-in bias toward one particular type of document. Documents are created through a synthesis process. If a new type of document is required, it can be delivered by implementing a new synthesis process.
  • It allows you to improve the ease of use of your system for your authors. After an initial period of getting used to working on the content axis, authors will find that implementing changes in content is far easier, increasing their productivity and releasing them from low value drudgework like tracking down and changing tens or hundreds of instances of the same fact.

Footnotes

1 A programming language generally consists of a small core of functionality expressed by the keywords of the language and a larger set of functionality packaged as a collection of functions. Functions are organized into groups called “libraries” according to the type of functionality they provide, such as accessing a database or communicating over a network using the TCP/IP protocol, as in the example used here.

2 Three different product lines are available: Standard, Professional, and Enterprise, but the OMTCP library, and the functions it contains, are available only in the Professional and Enterprise versions of the product.

3 I use the term “information object” to describe the units of information managed on the content axis as opposed to the “topics” or “documents” managed on the document axis.

4 Don’t worry about the specifics of how the template language works. The example comes from a working system; however, the implementation details can vary widely. The important point here is to understand how moving your content to the content axis can make managing the impact of change in your content easier.

5 If you are into the detail, note that the template actually calls another template to create the “available-in” list.