JoAnn Hackos, PhD
CIDM Director

“Let’s go buy something and do what it says.”

That’s exactly what we have heard from desperate publications managers who face a department of argumentative writers who cannot agree on the components of their new Information Model. We know, however, that buying tools is not the solution. Developing an information architecture that defines how you will consistently structure your content is independent of tools.

Consider what you should include in your Information Architecture before you begin to apply the architecture to a specific tool set:

  • Informative definitions of each information type and how your writers should apply it
  • Precise definitions of content units and various XML elements so that your writers apply them consistently
  • Design patterns for your deliverable sets showing how topics are mapped to a structure
  • Metadata labels that your writers will use to categorize and label your content
  • A folder structure for your topics and maps that organizes the many topics you will create for easy retrieval

The DITA model, or any other XML or non-XML model for content, ordinarily defines the tags that may be used to identify content. In an XML or SGML environment, those tags may be semantic, or named in such a way that they identify the content rather than the format of the information. In a non-XML environment, such as we find in desktop publishing tools, the tags generally identify format rather than content.

Information types

As you move to an XML-based Information Model, you need to translate the XML tags into the consistent content you want writers to consider for a topic. A DITA task, for example, suggests that you include a task title, a short description, a statement of context, and a series of steps with added notes, information, choices, and results. This set of XML tags, while somewhat specific about the content it suggests, does not explain to your writers the exact content you want in every task.

In an Information Model, define best practices for using the base task structure and provide examples of each. Ask questions such as the following:

  • What grammar do we want for the titles of task topics?
  • What information should be included in a short description or a context section?
  • How are we to write steps in a task? What constitutes a step? What should be included in task information, results, choices, and so on?

As you can see, thinking about the information types immediately gets you into definitions and examples of the required and optional content. In your Information Model, you need to specify those content units and other XML elements with examples of their proper use.
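As one illustration of what such an example might look like in an Information Model, here is a minimal sketch of a DITA task annotated with house rules. The product, the topic id, and the rules stated in the comments are all hypothetical; your own model would record the answers your team agrees on.

```xml
<!-- Hypothetical example topic. The house rules in these comments are
     illustrative assumptions, not part of the DITA standard. -->
<task id="replace_air_filter">
  <!-- Assumed house rule: task titles begin with a gerund. -->
  <title>Replacing the air filter</title>
  <!-- Assumed house rule: one sentence saying what the user accomplishes and why. -->
  <shortdesc>Replace the air filter every six months to maintain airflow.</shortdesc>
  <taskbody>
    <!-- Assumed house rule: context explains where or when, never how. -->
    <context>
      <p>The filter sits behind the front access panel.</p>
    </context>
    <steps>
      <step>
        <!-- Assumed house rule: one user action per cmd, in the imperative. -->
        <cmd>Turn off the unit and open the front access panel.</cmd>
        <info>
          <note type="caution">Wait until the fan stops before reaching inside.</note>
        </info>
      </step>
      <step>
        <cmd>Select a replacement filter.</cmd>
        <choices>
          <choice>For standard use, select the mesh filter.</choice>
          <choice>For dusty environments, select the pleated filter.</choice>
        </choices>
      </step>
      <step>
        <cmd>Slide the new filter into the slot and close the panel.</cmd>
        <!-- Assumed house rule: stepresult states what the user sees. -->
        <stepresult>The panel clicks shut.</stepresult>
      </step>
    </steps>
    <result>
      <p>The unit is ready to operate.</p>
    </result>
  </taskbody>
</task>
```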

Inline elements

The DITA model includes many inline elements, some specially designed to support user interfaces, software functions, and programming languages. You are likely to find that some but not all of the possible inline elements are needed in your content environment. Read the descriptions of the elements in the DITA Language Specification and look at the examples. Consider how each element might be useful for your content. If you find elements that your writers are unlikely to use and you decide that your writers might be confused by too many choices, we recommend hiding them in the interface of your XML editor rather than creating specialized topics and removing the elements. At the beginning of any Information Modeling project, it is difficult to know what you may need in the future.

In general, we recommend doing nothing initially. Writers quickly learn the tags they use regularly and ignore the ones they don’t need. In any event, after you find a set of inline elements that are definitely needed for your content, we recommend defining these for your environment and providing several examples. You’ll find that the definitions and examples in the Language Specification are generally not sufficient to explain how to use the elements in your content.
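A local definition with an example can be as short as a commented fragment. The sketch below assumes a team that has settled on four inline elements; the element names come from the DITA user interface, software, and programming domains, while the definitions in the comments and the sample content are hypothetical.

```xml
<!-- Hypothetical house definitions for a small set of inline elements. -->
<steps>
  <step>
    <!-- uicontrol: the visible label of a button, field, or menu item. -->
    <cmd>Click <uicontrol>Export</uicontrol>.</cmd>
  </step>
  <step>
    <!-- menucascade: a sequence of menu choices, each tagged as a uicontrol. -->
    <cmd>Select <menucascade><uicontrol>File</uicontrol><uicontrol>Save As</uicontrol></menucascade>.</cmd>
  </step>
  <step>
    <!-- filepath: a file or directory name.
         codeph: a fragment of code or a setting name shown inline. -->
    <cmd>Open <filepath>config/settings.xml</filepath> and find the <codeph>timeout</codeph> setting.</cmd>
  </step>
</steps>
```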

Throughout this Information Modeling activity, you are taking a general specification and making it your own – not through changes in the code, but with definitions, guidance, and examples for your writers.

Design patterns

The same is true for the design patterns you establish for final deliverables. Most organizations have some standards for typical deliverables. You may have a standard high-level structure for product installation manuals or administrative guides. The more uniform your presentation of topics, the easier it will be for your customers to find the information they need. Many organizations are guided by international standards for the parts of operations, installation, and maintenance manuals. We recently reviewed a draft ANSI standard for the pump industry. It outlines the section names for the recommended content.
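A design pattern for a deliverable can be recorded as a skeleton DITA map. The following is a minimal sketch for an installation guide; the title, the section order, and the file names are placeholders invented for illustration and are not taken from any published standard.

```xml
<!-- Hypothetical design pattern for an installation guide.
     All file names and the section order are placeholders. -->
<map>
  <title>Pump Installation Guide</title>
  <!-- Section 1: overview concept with supporting reference topics. -->
  <topicref href="concepts/installation_overview.dita" type="concept">
    <topicref href="reference/site_requirements.dita" type="reference"/>
    <topicref href="reference/required_tools.dita" type="reference"/>
  </topicref>
  <!-- Section 2: installation tasks in the order they are performed. -->
  <topicref href="tasks/unpack_pump.dita" type="task"/>
  <topicref href="tasks/mount_pump.dita" type="task"/>
  <topicref href="tasks/connect_power.dita" type="task"/>
  <!-- Section 3: verification. -->
  <topicref href="tasks/verify_installation.dita" type="task"/>
</map>
```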

Remember, however, that the design pattern you use for packaging topics must reflect your understanding of your customer environment. Does the person who needs installation instructions also need repair and maintenance instructions? Are these tasks done by someone else at the customer site? Are the administrative functions performed by the same individuals, or are the tasks partitioned for different audiences? With HTML delivery, you can provide a flexible topic structure for your customers so that they can customize the deliverables. PDFs provide a less flexible structure that is much more difficult to customize.

As for your writers, continue to remind them that final packaging decisions for a release may be different from the base organization of topics that they used for authoring. Packaging decisions should be made with maximum understanding of the users’ environment. Authoring should be done in smaller contexts in which concepts are linked to tasks and tasks are linked to reference topics. The linking structures for HTML or other electronic delivery become more important to standardize than the hierarchical structure inherent in the table of contents.
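In DITA, one way to standardize these links is a relationship table in the map, which keeps the links out of the topics themselves. The sketch below assumes a convention of one row per small authoring context; the map title and file names are hypothetical.

```xml
<!-- Hypothetical linking pattern: each row relates one concept,
     its tasks, and its reference topics. File names are placeholders. -->
<map>
  <title>Pump topic relationships</title>
  <reltable>
    <relheader>
      <relcolspec type="concept"/>
      <relcolspec type="task"/>
      <relcolspec type="reference"/>
    </relheader>
    <relrow>
      <relcell>
        <topicref href="concepts/pump_mounting.dita"/>
      </relcell>
      <relcell>
        <topicref href="tasks/mount_pump.dita"/>
      </relcell>
      <relcell>
        <topicref href="reference/mounting_dimensions.dita"/>
      </relcell>
    </relrow>
  </reltable>
</map>
```

Because the links live in the map, a different deliverable can reuse the same topics with a different set of related links.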


Metadata

The fourth set of standards for your Information Model concerns the metadata that you will use to label your topics. Metadata helps you do the following:

  • Search for topics in your repository of topics
  • Manage workflow
  • Manage conditional publishing at the topic and the content unit level
  • Support website search for various members of the user community

In determining your metadata needs, you may want to construct scenarios that describe how people internally and externally will most likely look for information. If your scenarios come from interviews with potential users of the information topics, all the better. Trying to decide on metadata in a vacuum usually leads to a poorer model.

It is likely that your content management system will support much of the internal metadata you need to keep track of and move topics through the workflow. It’s worthwhile to discuss administrative and internal metadata requirements with your vendor. Metadata to label the content of your information will not come automatically. Carefully consider how you might want to categorize and label content for different user communities, products or product families, and so on.
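Content metadata of this kind can be carried in the topic prolog and in conditional attributes on individual elements. The following is a minimal sketch; the product name, category, keyword, and the `review-status` label are hypothetical values chosen for illustration.

```xml
<!-- Hypothetical metadata values; only the element and attribute
     names come from DITA. -->
<task id="install_pump">
  <title>Installing the pump</title>
  <shortdesc>Install the pump on a prepared foundation.</shortdesc>
  <prolog>
    <metadata>
      <!-- Who the topic is for. -->
      <audience type="administrator"/>
      <!-- How the topic is categorized for search. -->
      <category>Installation</category>
      <keywords>
        <keyword>foundation</keyword>
      </keywords>
      <!-- Which product and version the topic applies to. -->
      <prodinfo>
        <prodname>Pump A</prodname>
        <vrmlist>
          <vrm version="2.0"/>
        </vrmlist>
      </prodinfo>
      <!-- Assumed custom label for anything the standard elements do not cover. -->
      <othermeta name="review-status" content="approved"/>
    </metadata>
  </prolog>
  <taskbody>
    <steps>
      <step>
        <cmd>Bolt the pump to the foundation.</cmd>
      </step>
      <!-- Conditional publishing at the content unit level:
           this step is included only for the assumed product value pump_a_marine. -->
      <step product="pump_a_marine">
        <cmd>Apply the corrosion-resistant coating to the bolts.</cmd>
      </step>
    </steps>
  </taskbody>
</task>
```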

Folder structure

Finally, consider the structure you want to use to store your topics in a file server or a content management system. In many ways, the folder structure mirrors your topic metadata. You may want folders for tasks, concepts, and reference topics inside a folder for the subject area, product, product family, or even sub-product component. If you are translating content into multiple languages, you may want the top level of your folder structure to be the language. In that way, you collect all content specific to each language in separate but parallel folders.

If you have decided to reuse some standard content, such as warranty statements, copyright notices, warnings and cautions, and similar topics, you may place these in a reusable content folder before you create the folders for individual topics. In a content management system, you may want to create virtual folders that guide authoring or are specific to individual writers and their assignments. Because they point to the same topics in the repository, the topics you store in virtual folders only exist in one place in the database.
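To show how a reusable content folder and a topic folder work together, here is a minimal sketch of a task that pulls a standard warning from a shared file by content reference. The folder layout, file names, and ids are all assumptions made for this example.

```xml
<!-- Assumed folder layout (hypothetical names):
       en/reuse/warnings.dita                 shared warnings, stored once
       en/pump_a/tasks/service_motor.dita     this topic
-->
<task id="service_motor">
  <title>Servicing the motor</title>
  <taskbody>
    <prereq>
      <!-- Pulls in the note with id electrical_shock from the topic
           with id warnings in the reuse folder. -->
      <note conref="../../reuse/warnings.dita#warnings/electrical_shock"/>
    </prereq>
    <steps>
      <step>
        <cmd>Disconnect the power supply.</cmd>
      </step>
    </steps>
  </taskbody>
</task>
```

Keeping the language folder at the top means the relative path in the reference stays the same in every translated copy of the topic.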

How much time should information modeling take?

We are often asked about the amount of time you need to devote to Information Modeling development. As with most development efforts, it all depends:

  • How well structured is your content at the beginning?
  • How consistently applied are standard structures throughout your library?
  • Is the structure merely format or do you already have rules about the content to be included in your information set and are these rules expressed at a topic level?
  • Do you have a sound folder structure that can be adapted from book chapters to topics?
  • Do you have a good naming convention for folders and files? Do you need to expand upon them to take individual topics into account?
  • How much time can you and your staff devote to the Information Modeling effort?
  • Are some people available to devote full-time work for several weeks or months?

Information Modeling takes thought and experiment before you have a solution that will work at least initially with your content and your team. Because few organizations can afford to support a full-time activity, we generally find that they take at least a year and sometimes more to complete an Information Model. If you have a good starting point and staff dedicated to the effort, you may be able to complete your work in less time, but you’d be wise to tell your stakeholders that the work will realistically take longer than they might like.

However, you do not need to complete a comprehensive Information Model to begin a pilot project. One organization decided to focus on installation information first, defining the required information types and content units, completing their installation design pattern for deliverables, and developing a complete pilot project with this one area of content. Does that mean that they might have to change something as they expand the Information Model? Most likely, but they have maintained momentum and demonstrated that they can be successful.

We are happy to support Information Modeling projects for organizations. Generally, we provide a “working” workshop and then add coaching as your information modeling team moves forward. Having someone experienced to answer questions and review drafts can make the work move forward more quickly.