JoAnn Hackos, PhD, Comtech Services, Inc.

In an eight-article series for the e-newsletter, CIDM is discussing buying tips for the products and services that you, as an information developer, might want to purchase.

This is the fifth in our e-newsletter series of purchasing guides for information-development products and services. In this issue, we look at content management systems.

Content Management Systems

Content Management Systems (CMSs) are databases designed to store and manage content that includes documents of all types, as well as images, audio, and other media. In most cases, various types of CMSs are specifically designed to handle particular types of content better than others. For example, you will find CMSs designed for website content, corporate records, proposal content, training material, archives, and entire comprehensive documents. For this purchasing guide, however, the focus is on CMSs that are specifically designed to assist technical information developers in managing content that supports technical products and services. These CMSs are referred to as “component” content management systems because they are especially designed to handle many small pieces of content effectively. You should be particularly interested in CMSs that are designed to handle structured content that has been developed using XML, SGML, or related content markup systems.

What’s Available

Two types of CMSs are available to handle content during its development:

    • Document management systems are usually designed to manage unstructured content in the form of files, where those files contain text, images, or other content types. Document management systems treat each file as a binary object in a database. Everything that the document management system knows about each file is attached, figuratively, to the outside of the file.

    • Component content management systems also manage content of all types, from pure text to pure images. However, component content management systems that are designed to handle XML or related standards-based structured content are able to operate on the content that is inside each file. In fact, a component management system may be able to handle each element inside a file as a separate object in the database.

Document management systems excel when you want to keep track of a document as it moves through the development cycle as a whole. A document management system will enable you to track the many versions of a document under development. A component content management system allows you to manage all the small pieces of an XML-based system like DITA most effectively.

Features that count

Document management systems and component content management systems share much of the same basic functionality. They differ in their ability to manage XML content and small content chunks. Among the component management systems, you will find differences in functionality and design that may affect your selection process.

Standard features

Version control – Both document- and component-centric systems provide rich version control during the development process. Version control means that as you modify a file, the history of all changes to the file is maintained in the database. You know precisely what changed, when it changed, and who changed it. In addition, you can elect to revert to a previous version of a file if subsequent changes are no longer applicable.
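The version history described above can be sketched in a few lines. This is a minimal illustration, not how any particular CMS stores versions; the class names, the file name, and the author names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Version:
    number: int
    author: str
    timestamp: datetime
    content: str


@dataclass
class VersionedFile:
    """Hypothetical stand-in for one file tracked by a CMS."""
    name: str
    history: list = field(default_factory=list)

    def save(self, author: str, content: str) -> Version:
        # Every save appends a new version; nothing is overwritten.
        version = Version(len(self.history) + 1, author,
                          datetime.now(timezone.utc), content)
        self.history.append(version)
        return version

    def current(self) -> Version:
        return self.history[-1]

    def revert_to(self, number: int, author: str) -> Version:
        # Reverting re-saves the old content as a new version,
        # so the revert itself is recorded in the history.
        if not 1 <= number <= len(self.history):
            raise ValueError(f"no version {number} of {self.name}")
        return self.save(author, self.history[number - 1].content)


topic = VersionedFile("install.dita")
topic.save("author_a", "Step 1: unpack the unit.")
topic.save("author_b", "Step 1: unpack the unit carefully.")
topic.revert_to(1, "author_a")

for v in topic.history:
    print(v.number, v.author, v.content)
```

Because a revert is stored as a new version, the record of what changed, when, and by whom stays complete.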

Security – Both systems provide security to ensure that only one person is able to modify a file at one time. However, because component management systems allow you to work on individual XML elements, more than one person may work on the same topic although on different elements within the topic. For example, I might be revising this paragraph on security while my colleague is revising the paragraph on version control. We would each know what element the other has checked out for revision.
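A rough sketch of element-level check-out, assuming each element has its own identifier. The registry class, the element IDs, and the user names are hypothetical and stand in for whatever locking mechanism a given CMS provides.

```python
class CheckoutError(Exception):
    """Raised when a check-out or check-in is not allowed."""


class CheckoutRegistry:
    """Hypothetical record of which user holds which element."""

    def __init__(self):
        self._locks = {}  # element ID -> user name

    def check_out(self, element_id: str, user: str) -> None:
        holder = self._locks.get(element_id)
        if holder is not None and holder != user:
            raise CheckoutError(f"{element_id} is checked out by {holder}")
        self._locks[element_id] = user

    def check_in(self, element_id: str, user: str) -> None:
        if self._locks.get(element_id) != user:
            raise CheckoutError(f"{element_id} is not checked out by {user}")
        del self._locks[element_id]

    def holder(self, element_id: str):
        # Lets each author see who has an element checked out.
        return self._locks.get(element_id)


registry = CheckoutRegistry()
# Two authors work in the same topic, each on a different element.
registry.check_out("features/security-para", "author_a")
registry.check_out("features/version-control-para", "author_b")
```

A second author who tries to check out `features/security-para` gets a `CheckoutError` until the first author checks it in.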

Permission levels – Content management systems allow you to set permissions for each person who needs to interact with the system. Some users may be allowed only to read a topic or comment on it through a review mechanism. Others may have permission to create new topics and modify existing topics, depending upon their project assignments.

Metadata – Content management systems manage stored content through rich metadata associated with that content. For example, you may have a topic that is identified by its owner, author, reviewer, language, file type, dates of each change, and so on. The topic may also be identified by brand, product name, product number, information type, and any other set of categories assigned by authors or assigned by the system itself.

Search – By offering full-text search as well as metadata-based search, content management systems enable users to find the topics they need from the repository without needing to know in which folders they are stored.
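To show how metadata and full-text search combine, here is a small sketch over an in-memory list of topics. The topic records, field names, and the `search` function are hypothetical; a real CMS would run such queries against its database.

```python
topics = [
    {"id": "t1", "title": "Replacing the filter",
     "body": "Turn off the pump before removing the filter.",
     "product": "Model A", "language": "en", "info_type": "task"},
    {"id": "t2", "title": "Filter specifications",
     "body": "The filter is rated for 500 hours.",
     "product": "Model B", "language": "en", "info_type": "reference"},
    {"id": "t3", "title": "Starting the pump",
     "body": "Press the green button.",
     "product": "Model A", "language": "en", "info_type": "task"},
]


def search(topics, full_text=None, **metadata):
    """Return IDs of topics that match every metadata value given and,
    if full_text is supplied, contain it in the title or body."""
    results = []
    for topic in topics:
        if any(topic.get(key) != value for key, value in metadata.items()):
            continue
        searchable = (topic["title"] + " " + topic["body"]).lower()
        if full_text and full_text.lower() not in searchable:
            continue
        results.append(topic["id"])
    return results


print(search(topics, product="Model A", info_type="task"))
print(search(topics, full_text="filter"))
print(search(topics, full_text="filter", product="Model B"))
```

Note that no folder path appears anywhere in the query: topics are found by what they are and what they say, not by where they are stored.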

Workflow – Content management systems support multiple levels of workflow so that you can move a topic through the information-development life cycle. Workflow allows you to revise a topic and send it to editors and reviewers. If staff have access to the content management system, they may receive a notification in a workspace or through an email. If staff are outside the system, they may receive an attachment in an email or have access to content in a web browser. In either case, the notifications and processes are designed into the workflow and occur automatically, saving everyone time and helping to ensure that schedules are followed and deadlines are met.

Component content management systems

Document management systems focus on managing relatively large chunks of content and excel at storage, retrieval, and control of those chunks. Component content management systems focus on managing relatively small chunks of content and ensure that they can easily be used in more than one output.

Component content management systems support

    • reuse of components

    • assembly of components into larger deliverables

    • conditional publishing of topics that contain content for more than one deliverable

    • management of multiple links between topics

    • management of translation at the component level

    • integration with XML authoring tools

    • a choice of underlying database structures

As a consequence of these features, component content management systems efficiently and effectively support XML-based authoring of structured content such as content developed in DITA or DocBook.

Component reuse

One of the primary goals of technical publications professionals in using a CMS is to facilitate the reuse of content in more than one deliverable. Topics are created that can be used in multiple user guides to support variations among individual products and services. Topics can be delivered in multiple media, including print, PDF, HTML, Eclipse, and various help systems.

Individual parts of topics can themselves be reused in the appropriate contexts. For example, a safety warning can be written and stored once in the CMS and used in every topic that requires a warning. A copyright statement or other legal notices can be created and stored once and used in every applicable document or website. Images are included in topics by reference so that they can be updated independently of the text. Component reuse is critical to the effective management of complex technical content.

If a CMS is managed carefully, only one instance of each piece of content will exist in the repository. In addition, a component CMS allows authors to know exactly where each piece of content is used by various outputs or content assemblies, ensuring that revisions remain applicable in multiple contexts.

Content assembly

A component-based CMS supports the assembly of topics into larger deliverables. For example, a CMS that supports DITA maps allows authors to create books, chapters, or sections by referencing individual topics and placing them in a hierarchical structure, much like a table of contents. Topics may be referenced in multiple maps, facilitating reuse at the topic level.
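As an illustration of the hierarchy a DITA map expresses, the sketch below parses a small map with Python's standard `xml.etree.ElementTree` module and prints the nested topic references as an outline. The map content and file names are made up for the example, and the DOCTYPE declaration that a real map normally carries is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# A small, hypothetical DITA map: topics are referenced, not embedded.
DITA_MAP = """\
<map>
  <title>Model A User Guide</title>
  <topicref href="overview.dita">
    <topicref href="features.dita"/>
  </topicref>
  <topicref href="install.dita">
    <topicref href="unpack.dita"/>
    <topicref href="connect.dita"/>
  </topicref>
</map>
"""


def outline(element, depth=0):
    """Return (depth, href) pairs for nested topicrefs, in document order."""
    entries = []
    for ref in element.findall("topicref"):  # direct children only
        entries.append((depth, ref.get("href")))
        entries.extend(outline(ref, depth + 1))
    return entries


toc = outline(ET.fromstring(DITA_MAP))
for depth, href in toc:
    print("  " * depth + href)
```

Because the map holds only references, the same `install.dita` topic could appear in a second map without being copied.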

Individual topics may be assembled into constructs that themselves support conditional processing through the assignment of topic-level metadata. For example, more than one PDF or HTML collection may be output from a single assembly, depending on the assignment of metadata to the topic in the DITA map. Content assembly is an integral part of the DITA information model through the DITA map function. However, CMSs may also support content assembly through integration with desktop publishing systems like FrameMaker that support the assembly of chapters into books or through the DocBook information model that is also designed to output books.

Some document management systems support functionality often referred to as virtual document assembly. Independent files can be assembled into larger documents through these virtual systems, allowing for integrated page numbering and tables of contents. On the other hand, most document management systems do not support content assembly as part of the information-development life cycle because their focus is on storing and managing content as individual entities rather than reusable parts. Component CMSs instead focus on the combination of independent parts into larger contexts for publishing and dissemination.

Component assembly is further supported by automated publishing. Because XML components contain no formatting information, they can be output by attaching appropriate formats at the end of the information-development life cycle. In what is literally a push-button process, XML files are assembled into the final outputs and style sheets are attached. The style sheets may support print, PDF, HTML, Eclipse, and various help systems. Different styles may be used for different desired outputs with no changes to the topics themselves. Automated publishing is usually integrated into the CMS functionality, including publishing in multiple languages. Automated publishing significantly reduces the time and cost of developing multiple outputs from the same sources of content.

Conditional publishing

Component CMSs are especially tuned to support conditional publishing. Conditional publishing means applying a set of parameters to content so that outputs can be tailored to the needs of specific customers. For example, a company may produce a suite of products that have very similar functionality. Each requires its own set of information. However, without conditional publishing, the company must duplicate content from one documentation set to another, often through cut and paste and even through writing and translation of the same or similar content multiple times. With conditional publishing, one set of content is developed. The important differences are tagged with metadata or a similar proprietary tagging system. Once the content is ready to be published, the individual conditions are set. As a result, the output is specific for each product variation with some content common to all and other content specific to the variant.

Although conditional publishing has been an attribute of desktop publishing systems for some time, component CMSs provide direct support through the use of metadata. For example, information developers can build so-called master topics that contain content variations for different products, media, customers, and so on. The content variations are marked by metadata. Then, using processing rules, unique versions of the topics are output for different deliverables. CMSs that support DITA use DITAVAL processing rules to designate the metadata selections to be included in or excluded from a particular deliverable.
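The following sketch shows the idea behind metadata-driven filtering on a "master" topic. It is a deliberately simplified approximation of DITAVAL-style exclusion, not the DITA Open Toolkit's implementation: it handles only `exclude` actions, and it drops an element only when every value in the tagged attribute is excluded. The topic content and product values are hypothetical.

```python
import xml.etree.ElementTree as ET

TOPIC = """\
<task id="replace-filter">
  <title>Replacing the filter</title>
  <taskbody>
    <steps>
      <step><cmd>Turn off the pump.</cmd></step>
      <step product="modelA"><cmd>Open the side panel.</cmd></step>
      <step product="modelB"><cmd>Open the top cover.</cmd></step>
      <step product="modelA modelB"><cmd>Remove the filter.</cmd></step>
    </steps>
  </taskbody>
</task>
"""

# Processing rules for the Model A deliverable: leave out Model B content.
DITAVAL = """\
<val>
  <prop att="product" val="modelB" action="exclude"/>
</val>
"""


def excluded_values(ditaval_xml):
    """Map each attribute name to the set of values marked 'exclude'."""
    excluded = {}
    for prop in ET.fromstring(ditaval_xml).findall("prop"):
        if prop.get("action") == "exclude":
            excluded.setdefault(prop.get("att"), set()).add(prop.get("val"))
    return excluded


def is_excluded(element, excluded):
    # An element is dropped only if all of its values for an attribute
    # are excluded; untagged elements are always kept.
    for att, values in excluded.items():
        tagged = element.get(att, "").split()
        if tagged and all(value in values for value in tagged):
            return True
    return False


def apply_filter(element, excluded):
    for child in list(element):
        if is_excluded(child, excluded):
            element.remove(child)
        else:
            apply_filter(child, excluded)


topic = ET.fromstring(TOPIC)
apply_filter(topic, excluded_values(DITAVAL))
steps = [cmd.text for cmd in topic.iter("cmd")]
print(steps)
```

Swapping in a rule file that excludes `modelA` instead would yield the Model B version of the same topic, with no change to the source.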

Conditional publishing is further supported through text entities or content references. Files of highly changeable content are created separately from the context in which they will occur. For example, an information developer may decide to use a variable in the text in place of a product name because the name is likely to change before final publishing. A list of possible product names is maintained separately. Only upon processing is the product name substituted for the variable placeholder in the text. Controlling which variable is substituted can also be handled with metadata.
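Variable substitution at publishing time can be illustrated with Python's standard `string.Template`. This only mimics the concept; in XML, the same effect is achieved with text entities or content references. The variant names and product names are hypothetical.

```python
from string import Template

# Product names are maintained separately from the topic text.
variables = {
    "prerelease": {"product_name": "Project Falcon"},
    "launch": {"product_name": "Widget Pro"},
}

# The topic text carries only a placeholder.
source = Template("Connect the $product_name to a power outlet.")


def publish(text: Template, variant: str) -> str:
    """Substitute the placeholder only when the output is generated."""
    return text.substitute(variables[variant])


print(publish(source, "prerelease"))
print(publish(source, "launch"))
```

If the product name changes again, only the separate list is edited; every topic that uses the placeholder picks up the new name on the next publishing run.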

Component linking

Technical publications have traditionally maintained links between relevant topics of information. Links are established between entries in a table of contents or an index to the relevant items in the document. Links are established between information in one document and others in the suite. Links are maintained between information in a document and other sources of information outside the organization through references to specific web pages.

Content management systems that are most useful for building a repository of technical content must be able to manage multiple links. Linking works best, however, when the CMS provides a unique identifier for every component it stores rather than a relative path to that component based on folder structures. We have all experienced the perennial broken link when the referenced content is moved from its expected location. With unique IDs controlling the linking process, the links should not be breakable. Because the content exists in a database rather than in a file server system, moving the content from one “virtual folder” to another should not affect its findability.

Component CMSs are expected to provide sound component linking that ensures that links are not broken. This capability even extends to preventing a component from being deleted from a repository if another component in the repository links to it. If a deletion is in fact necessary, the information developers owning each use of the component must negotiate its removal.
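A small sketch of why ID-based links survive a move and how deletion can be blocked while links exist. The `Repository` class, the component IDs, and the folder names are hypothetical.

```python
class Repository:
    """Hypothetical repository that links components by unique ID."""

    def __init__(self):
        self._folder = {}  # component ID -> virtual folder
        self._links = {}   # component ID -> set of IDs it links to

    def add(self, component_id, folder, links_to=()):
        self._folder[component_id] = folder
        self._links[component_id] = set(links_to)

    def move(self, component_id, new_folder):
        # Only the folder label changes; the ID that links use does not.
        self._folder[component_id] = new_folder

    def resolve(self, component_id):
        return f"{self._folder[component_id]}/{component_id}"

    def referrers(self, component_id):
        return sorted(source for source, targets in self._links.items()
                      if component_id in targets)

    def delete(self, component_id):
        users = self.referrers(component_id)
        if users:
            raise ValueError(
                f"{component_id} is still linked from: {', '.join(users)}")
        del self._folder[component_id]
        del self._links[component_id]


repo = Repository()
repo.add("warn-001", "shared")
repo.add("task-042", "guides/model-a", links_to=["warn-001"])

repo.move("warn-001", "shared/safety")
print(repo.resolve("warn-001"))    # the link from task-042 still resolves
print(repo.referrers("warn-001"))  # who must agree before deletion
```

Calling `repo.delete("warn-001")` at this point raises an error that names `task-042`, which is the cue for the owners to negotiate the removal.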

Translation management

One of the most important cost benefits of content management is the reduction of translation costs and the decrease in the time required to complete translations. Component CMSs are generally capable of managing the translation process. Because content is stored and retrieved in small components rather than in entire large documents, translations of components can begin as soon as they are ready, without waiting for an entire document or suite of documents to be finished. Translation service providers who have access to the repository of topics, often through a web-based interface, can be notified of topics that are ready for translation as early as possible.

The translated topics can then be returned to the CMS in parallel to the topics in the source language. Parallel sets of topics in parallel sets of folders ensure that changes to topics in the source language are mirrored in the target languages. In most translation-capable CMSs, topics that are new or have been changed can be packaged and delivered to a localization service provider as soon as they are ready. Only the new or changed topics need to be translated; existing and unchanged topics are never translated again.
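One way to identify new or changed topics is to compare a fingerprint of each source topic with the fingerprint recorded when it was last sent for translation. The sketch below uses SHA-256 from Python's standard `hashlib`; the topic IDs and text are hypothetical, and a real CMS may rely on version numbers or status metadata instead.

```python
import hashlib


def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def topics_to_translate(source_topics, translated_fingerprints):
    """Return IDs of topics that are new or changed since last translation."""
    return sorted(
        topic_id for topic_id, text in source_topics.items()
        if translated_fingerprints.get(topic_id) != fingerprint(text)
    )


source_topics = {
    "intro": "Welcome to the product.",
    "install": "Unpack the unit, then connect the power cable.",
    "safety": "Disconnect power before servicing.",
}

# Fingerprints recorded when each topic was last sent for translation.
last_translated = {
    "intro": fingerprint("Welcome to the product."),
    "install": fingerprint("Unpack the unit."),
}

pending = topics_to_translate(source_topics, last_translated)
print(pending)  # "install" has changed and "safety" is new
```

The unchanged `intro` topic is never packaged again, which is where the translation savings come from.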

Many component CMS vendors have established partnerships with translation management systems so that the flow of content from one system to the other is well supported and seamless. And, if production publishing is handled through DITA maps or similar mechanisms, the translated topics can be published using the same automated processes as the source language, reducing the cost for desktop publishing by the localization service provider.

Integration with XML authoring tools

Component CMSs generally provide bridges to a variety of XML authoring tools. When a bridge is in place, information developers find it easy to open a topic from the repository in their authoring tool. Simply selecting and opening the topic opens the integrated authoring tool. Similarly, information developers working in an authoring tool are able to find and retrieve a topic, check it out, and check it in again through a menu in their authoring tool that leads them directly to the content management repository.

Integrations with some authoring tools provide additional useful functionality. Developers can create DITA maps in some authoring tools and create named “shell” topics. The shell topics have nothing more than a file name and a title in place. However, they are part of one or more DITA maps and placed for convenience into the appropriate folder structure. They are ready to be assigned to information developers to complete the content. The topic itself already has a place in the repository.

If a particular CMS does not provide a bridge to your preferred authoring tool, the vendor may be willing to build that bridge as part of a contract with your organization.

Underlying databases

Component CMSs are built on many types of databases. These databases are the basic storage containers for the topics, images, and other components. However, the CMS provides all the functionality to manage the ins and outs of the content and the database. Many component CMSs are built on a particular relational database product, such as an Oracle or Microsoft database. These CMSs have optimized their functionality to best use the model provided by the selected database. A few CMSs operate on more than one relational database, providing you with a choice.

Another set of CMSs operates on XML or object-oriented databases. These databases may be entirely bundled into the CMS, avoiding many database administration tasks. XML database developers claim that they provide faster search and retrieval than relational databases. However, the differences may not be visible to the information developers. The choice of a CMS may be best based on the range of supported functionality rather than on database type.

Hosted versus Purchased CMSs

In the past year, we have seen the advent of hosted component CMSs. Hosted systems are managed and maintained by the vendor rather than by internal IT staff. Your repository of content is generally placed on a secure and independent server and database. All maintenance and upgrades are performed by the vendor. Some hosted systems allow you to use part of the monthly hosting fees toward purchase of the system.

Hosted systems reduce the internal administrative costs of deploying a CMS, especially if internal IT staff is reluctant to take on yet another database and system to administer. Most hosted systems provide a browser-based connection to the repository, making it easier for individuals to access the system from remote locations.

Software as a Service

Software as a Service (SaaS) makes a CMS available as a rental, payable monthly or quarterly. SaaS provides a common applications architecture for all users, even those with different focuses of interest (i.e., group, department, or enterprise).

Hosting usually implies that the customer has purchased software licenses but has chosen to host the applications on a server maintained by the vendor. Unlike SaaS installations, hosted solutions typically are not all configured in exactly the same way. As custom solutions, they become more expensive to maintain over time.

How to choose

If you decide that a content management system is an important part of meeting your organization’s information management challenges, consider how to choose among the various systems available.

Create a business case

The first step in acquiring a CMS is management support. CMSs are a significant investment for an information-development organization, particularly since acquiring enterprise-style software is not the norm. Most previous purchases have been for simple desktop publishing applications at less than $1K per person. A CMS is a much larger investment.

A business case should demonstrate that a CMS deployment will save costs and increase efficiency as well as provide a more valuable asset to deliver to customers. It should show exactly how you hope to achieve cost savings and how long it will take before the savings will pay back the cost of a system and its implementation. One chicken-and-egg problem, of course, is to predict savings when you don’t yet know the cost of a CMS that meets your requirements. However, you might be able to create a rough estimate to be updated later. Think in terms of the cost of personnel. The fully loaded cost of one staff member is somewhere between $100K and $150K per year in North America. Many CMSs will cost about two person-years to purchase and fully implement, including the costs of your own team members in redesigning your content to enable reuse, conditional publishing, and automated assembly and production.
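The rough estimate can be worked through as simple arithmetic. The loaded cost range and the two-person-year figure come from the paragraph above; the annual savings figure is purely hypothetical and should be replaced with your own estimate.

```python
def payback_years(system_cost: float, annual_savings: float) -> float:
    """Years until cumulative savings equal the cost of the system."""
    if annual_savings <= 0:
        raise ValueError("annual savings must be positive")
    return system_cost / annual_savings


# From the text: one staff member costs $100K to $150K per year, fully
# loaded, and a CMS costs about two person-years to buy and implement.
loaded_cost_low, loaded_cost_high = 100_000, 150_000
person_years = 2

cost_low = person_years * loaded_cost_low
cost_high = person_years * loaded_cost_high

# ASSUMPTION: a made-up savings figure (reduced duplicate writing,
# translation, and desktop publishing) used only for illustration.
assumed_annual_savings = 120_000

print(f"Estimated cost: ${cost_low:,} to ${cost_high:,}")
print(f"Payback: {payback_years(cost_low, assumed_annual_savings):.1f} to "
      f"{payback_years(cost_high, assumed_annual_savings):.1f} years")
```

Updating the estimate later means changing only the inputs once vendor proposals and your own savings data are available.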

Work closely with your financial officer to understand how best to make the cost/benefit case for content management in your organization. Assemble stories about the costs of duplicate content everywhere and duplicate translations. Explain how the new processes will have benefits far beyond your own organization.

Attend presentations, webinars, and other sources of information

Find out as much as you can about CMSs by attending conferences, workshops, and presentations. Sign up for one of the free vendor webinars, especially if they are co-presented by an industry leader or organizational manager who has implemented the system. Listen to recordings of previous webinars. Find reviews of CMSs online. Talk to colleagues in your area or online through one of the specialized listservs that address content management.

Be certain that you are absolutely clear about the differences between the component CMSs that will benefit your information-development life cycle and the enterprise document management systems, web content management systems, and all the various flavors of content management that your IT department may know about. If your company has some existing CMS, find out what it is used to support. A system that delivers web content or manages document blobs will not provide you with the functionality you need, not without a lot of expensive customization.

Research the component content management systems available

Learn about the component CMSs out there. Many of the vendors are members of the CIDM. They are an excellent starting point. You’ll find links to their websites at They also often provide us with non-commercial articles about challenges and successes in their areas of expertise.

If you contact each of the CIDM members directly, you’ll find a knowledgeable (usually soft sell) representative who will be happy to provide you with lots of information, including the names of existing customers to contact.

Make a checklist for yourself to keep track of the features and functions. As soon as you look at more than two products, the details begin to run together. Use the list provided in this article to note the standard functions and the extra ones.

Definitely ask about support for XML and DITA. Try to discover exactly how DITA is supported and how well the product conforms to the standard. Note what extras the vendor has added to basic DITA functionality and be certain that the additions are portable. If you someday want to move to another CMS, can you take the extras with you?

Develop your detailed requirements

As you research the systems, you should, of course, begin to develop your own list of requirements. Find out if colleagues in other organizations have developed similar requirements. They might provide a good starting point. Read about requirements development for CMSs in Content Management for Dynamic Web Delivery (Wiley 2006). In the appendix, I provide a detailed list of questions to ask yourselves about requirements.

Develop scenarios or use cases that detail what you expect the system to do for you. Try to avoid defining the technical capabilities; focus on what you need from the system without predetermining a particular solution. One organization had a complex versioning requirement. Only by writing out a step-by-step description of the versioning process did the requirement become clear. Send the scenarios along with your list of requirements. The scenarios provide rich detail that is unavailable in a list.

After you have defined your requirements, ask two or three CMS developers to respond. Ask them to tell you if the function is available out-of-the-box, must be configured, or must be customized. Ask if there are additional costs for configuration or customization. Both activities can add substantially to your overall costs. Remember that customizations done initially may have to be done again when the system is upgraded.
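A simple way to compare the responses is to tally, for each vendor, how many requirements are met out-of-the-box, by configuration, or by customization. The vendor names, requirements, and answers below are hypothetical placeholders for your own checklist.

```python
from collections import Counter

# Each vendor's answer for each requirement in the request.
responses = {
    "Vendor X": {
        "element-level check-out": "out-of-the-box",
        "DITA map editing": "out-of-the-box",
        "complex versioning scenario": "customized",
    },
    "Vendor Y": {
        "element-level check-out": "configured",
        "DITA map editing": "out-of-the-box",
        "complex versioning scenario": "configured",
    },
}


def summarize(responses):
    """Count answers per category for each vendor."""
    return {vendor: Counter(answers.values())
            for vendor, answers in responses.items()}


summary = summarize(responses)
for vendor, counts in summary.items():
    print(vendor, dict(counts))
```

The counts do not decide anything by themselves, but they make it easy to see where configuration or customization costs are likely to arise.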

Provide samples of your content to use in demonstrations of product capabilities. Ask that the vendors demonstrate how they will handle the scenarios that you have included in the request for proposal.

Ask about compliance with standards

Ensure that the CMS product conforms to the OASIS DITA standard. Ask if the vendor supports the DITA Open Toolkit as a basic requirement. Find out what XML authoring tools are already integrated with the CMS. Ask if standard DITA metadata is supported and exactly how it is handled by the CMS. If metadata is stored in the repository, what happens to it when you export a topic from the repository? The best process attaches the metadata to the topic as you have defined it.

Ask for a proposal

After you are familiar with the vendors and their credentials, ask for a complete proposal. Be certain that you understand thoroughly the time and costs required to implement the system. If you are looking for a low-cost, out-of-the-box solution, remember that it will not take your special needs into account.

As part of the proposal, ask for a presentation that shows how the CMS will work in your environment with your content. Be certain to meet with the people who will work with you on implementation. Sometimes it’s the personal relationships and comfort-level that are more important than the technical functionality.

Ask for references

Ask for the names of managers and other individuals in companies that have implemented the CMS. If at all possible, conduct one or two visits to the companies. Actually seeing a system in place is invaluable, providing a more in-depth understanding of the benefits and challenges than phone calls ever will.

Test drive the system

Inquire if it is possible to use a basic version of the product for a Proof of Concept (POC) project. You may not get all the bells and whistles in the POC but your team members will get a better idea of how the system actually works. Remember that everything seems strange and difficult at first. Give your staff time to adjust to the new concepts. Once you’ve made a purchase decision, use the system right away in a pilot project. Don’t let it sit on a shelf, even if you’re not 100 percent sure of your new Information Model. Produce some immediate results that you can brag about and share with the senior managers who gave you the financial support you needed.

Communicate and communicate again about the advantages of the new system. Be honest about the challenges. Everything new takes getting used to and anything worthwhile requires an investment in time and effort as well as money. Content management provides you with a tremendous asset and will pay for itself with additional benefits, but only if you make the right decision.