Bill Hackos, PhD
Comtech Services

We all have intuitive views about structure in the world. Indeed, without these views we would not be able to recognize things. But we have to be careful that these superficial views are not confused with the details of structure. Without an understanding of the underlying structure, we can make serious mistakes.

Every three-year-old can tell you that a Panda is a bear. When I was growing up, I was told by biologists that a Panda may look like a bear but was in a class by itself, related to a smaller animal living in the same region, the Lesser Panda. Biologists believed the two were related because of the similar dark patterns around their eyes. I was told never to refer to a Panda as a “Panda Bear,” because it wasn’t a bear.

With the development of genetics, the understanding of DNA, and the genetic code, biologists discovered that the Panda is really much more closely related to the other bears than to the Lesser Panda. The Panda is truly a bear. Biologists have found many other similar examples. They’ve discovered that the structure of an organism is defined by its genetics. Each individual Panda is a construct determined by its genetics and its environment.

Another example can be found in building construction. Few of us would build a house by copying the structure we found in an existing house. We would be foolish if we did. Most buildings are full of construction mistakes as well as post-construction changes. Buildings also reflect the technology of the time in which they were built. Instead, we go to an architect who designs a house for us. The architectural plans define the house in the same way that genetics defines a Panda. We can say that a house is a construct determined by its architectural plans and its environment (construction mistakes).

Let’s take the argument one step further and finally reach the issue of documentation. How do we determine the structure of a document or document type? Many people, including experienced consultants who should know better, look at existing documents (constructs). They get out their highlighters and begin marking them up, looking for all information types: introduction, concept, procedure, note, warning, and so on. The consultants note how these types appear on the page. They tell you, “Here’s the structure.” They may even hand you a Document Type Definition (DTD).

Never trust a consultant who arrives with a highlighter!

You have wasted your money. They are really describing a construct of an unknown structure. They may have chosen a book that doesn’t contain all of the information types you need or that is full of construction mistakes. (Didn’t you fire that writer for incompetence?) Maybe this book was constructed before electronic media were considered. Maybe it’s not fit for modern electronic publishing. The consultant may look at other books and do the same, coming up with other structures. Are these books really constructed with different structures, or are the differences environmental (different authors or different content)? The consultants have no way of knowing, because they are working with constructs of structure rather than with structure itself.

Comtech recommends a better approach. Mother Nature has defined the structure of all living things in terms of the genetic code. Architects define structure in terms of architectural building plans. As information developers, we need to do the same. We need to define the structure we want and then construct our documents based on this structure. Otherwise we will be unable to make any real progress and will be doomed by the outdated documents we are trying to replace. Comtech calls the genetic code for our library of information an “information model.” We must define an information model before we create constructs!

Ultimately, the DTD programmatically defines the information model, but there is much to do before we can write it. While you may use a different DTD for a user’s guide than you would for a parts catalog, it’s best to define the DTD elements once for the entire library and then build your book DTDs from these elements. That way, a change to an element is automatically carried into every DTD that uses it. You want to reuse your DTD elements rather than have duplicates spread across your DTDs.
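To make the reuse idea concrete, here is a minimal Python sketch that assembles book DTDs from a single shared set of element declarations. The element names, the two book DTDs, and the build_dtd helper are all hypothetical, chosen only for illustration; a real library would have its own elements and might share them with DTD parameter entities instead.

```python
# Hypothetical shared element library: each element is declared exactly once.
SHARED_ELEMENTS = {
    "title": "<!ELEMENT title (#PCDATA)>",
    "para": "<!ELEMENT para (#PCDATA)>",
    "note": "<!ELEMENT note (para+)>",
    "step": "<!ELEMENT step (para+, note?)>",
    "part": "<!ELEMENT part (title, para?)>",
}

# Hypothetical book DTDs: a root declaration plus the shared elements it uses.
BOOK_DTDS = {
    "users_guide": {
        "root": "<!ELEMENT users_guide (title, (para | note | step)+)>",
        "uses": ["title", "para", "note", "step"],
    },
    "parts_catalog": {
        "root": "<!ELEMENT parts_catalog (title, part+)>",
        "uses": ["title", "para", "part"],
    },
}


def build_dtd(book, elements=SHARED_ELEMENTS, books=BOOK_DTDS):
    """Assemble the DTD text for one book from the shared declarations."""
    spec = books[book]
    lines = [spec["root"]] + [elements[name] for name in spec["uses"]]
    return "\n".join(lines)


# One change to a shared declaration (here, adding an attribute to para)...
SHARED_ELEMENTS["para"] = (
    "<!ELEMENT para (#PCDATA)>\n"
    "<!ATTLIST para audience CDATA #IMPLIED>"
)

# ...reaches every book DTD the next time it is built.
users_guide_dtd = build_dtd("users_guide")
parts_catalog_dtd = build_dtd("parts_catalog")
```

Because neither book keeps its own copy of the para declaration, both rebuilt DTDs pick up the new attribute, and there is no duplicate to fall out of step.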

Comtech suggests a process for developing your library information model:

  • Appoint a person or team with the authority and responsibility to maintain the model. Sometimes we call this person or group the information architect or architects. Their job is to define the structure of information for your entire library. Only they can make changes to the model and the DTD. They should be familiar with the structure of all information already in the library and should weigh the strengths and weaknesses of that existing structure.
  • Divide all of the information that will go into your information products into topics by information type. An information type is a module of information that has a uniform structure throughout and is fully understandable in itself. Examples might be a procedure, overview, concept, or reference. Keep the number of information types as small as possible, ideally no more than a dozen or so.
  • Develop a DTD for each information type containing a complete set of rules (elements and attributes) for constructing that information type. Organize elements and attributes into those that are used in multiple DTDs and those that are unique to a single DTD to facilitate reuse. Create a matrix of elements and attributes versus each information type DTD. Build the DTD for each information type from this matrix.
  • Develop a content plan for all of the information products in your library. Create a matrix of information product name versus module name to define where each module is used.
  • Construct (write) the modules, using the appropriate DTD for the information type, keeping in mind which are being used in multiple places and which are used in only one place. Modules used in multiple places must be made general so they make sense in all those places. Modules used only in one place must contain the specifics for the information product.
  • Finally, build all your information products from modules as defined in your content matrix. This can be done by hand or with an appropriate content management system. Conduct all information product maintenance within the modules. Rebuild the information product from the modules for each version or release.
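The content matrix and the build step in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the module names, their text, the three information products, and the helper functions are all hypothetical, and a content management system would do the same job at scale.

```python
# Hypothetical module store: module name -> (information type, module text).
MODULES = {
    "overview_pump": ("overview", "The pump moves coolant through the system."),
    "safety_power": ("concept", "Disconnect power before servicing."),
    "replace_filter": ("procedure", "1. Open the cover. 2. Swap the filter."),
    "part_list": ("reference", "Filter: part 1001. Cover: part 1002."),
}

# Hypothetical content matrix: information product -> ordered module names.
CONTENT_MATRIX = {
    "users_guide": ["overview_pump", "safety_power", "replace_filter"],
    "service_manual": ["safety_power", "replace_filter", "part_list"],
    "parts_catalog": ["part_list"],
}


def module_usage(matrix):
    """Invert the matrix: module name -> list of products that use it."""
    usage = {}
    for product, names in matrix.items():
        for name in names:
            usage.setdefault(name, []).append(product)
    return usage


def shared_modules(matrix):
    """Modules used in more than one product; these must be written generally."""
    usage = module_usage(matrix)
    return sorted(name for name, products in usage.items() if len(products) > 1)


def build_product(product, matrix=CONTENT_MATRIX, modules=MODULES):
    """Rebuild one information product from its modules, in matrix order."""
    missing = [name for name in matrix[product] if name not in modules]
    if missing:
        raise KeyError(f"{product} references undefined modules: {missing}")
    return "\n\n".join(modules[name][1] for name in matrix[product])
```

Calling shared_modules(CONTENT_MATRIX) lists the modules that appear in more than one product and therefore need general wording, while build_product("service_manual") reassembles that product from the current module text. Maintenance happens only in MODULES; products are always regenerated, never edited directly.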

The steps listed above are just an outline. Each of them represents a lot of thought and a great amount of work. Little of the original material in the existing documents can be used directly in the new modules, because it is not structured and was not written with reuse in mind.

As much as we may wish for a magical consultant who can work wonders with a highlighter, marking up existing documents will never lead to the modular, structured, single-sourced result you want.

You must do most of the work yourself.

Comtech recommends that you go through this process with a non-trivial but simple pilot project. The information gained and the mistakes corrected as a result of this pilot will eventually lead to a successful entry into the world of content management and reuse.

Good luck.