Best Practices for XML and Structured Documentation



April 2007


Troy Klukewich, Independent Consultant

XML, structured documentation, and single-source strategies are on the rise in documentation departments, which is no surprise given the ever-increasing complexity of documentation projects. For the last six years, I have been involved continuously with XML-based documentation projects, from a small prototype for a few hundred business objects to a company-wide initiative that involved tens of thousands of topics and impacted numerous products.

What are some of the principles and practices I’ve picked up along the way? In this article, I list the more prevalent problems I’ve faced, their solutions, and a number of best practices for XML implementations. Though my area of expertise is enterprise software, many of the principles will apply to other fields. For larger projects, I used a distinct three-tier topic model of concept, procedure, and reference, but the principles in this article apply to any XML solution, whether DITA, DocBook, or home-grown XML.

To identify best practices for XML and structured documentation implementations, let’s first set a prototypical business context. Let’s take a mid-sized software company, a company that started small with a great idea, grew quickly, and acquired numerous other companies along the way while expanding into international markets. Each acquisition comes with its own documentation group, content formats, and production methodologies. The parent company combines product offerings with various degrees of integration and cohesion, with the ultimate, if unrealized, goal of providing customers with a consistent look and feel.

What are the typical documentation problems we’re trying to solve in our growing, increasingly complex business scenario? The problems I have seen come down to a number of impediments to performance that involve scale, core competency, flexibility, and localization overhead. Let’s take each point in turn.

Impediments to Performance

  • Inability to Scale
    What works well for a single product in a startup may not work so well for a much larger, mature product, or for multiple products in a suite. Small problems that require manual fixes every release become large problems with increased scale. What should take only minutes to produce, like a PDF, can end up taking additional weeks due to a vast legacy of manual fixes across thousands of topics.
  • Reduced Focus on Core Competency
    What is the core competency of the technical writer? I am with the camp that says that the writer’s core competency is writing, not layout, tweaking presentation, or troubleshooting help build problems. Writers should spend the vast majority of their time in content development. I have found that as desktop publishing projects get ever larger, the percentage of time spent fixing formatting increases. For example, I worked on a large, complex project where writers spent up to thirty percent of their time resolving formatting issues. Some system overhead is natural, but I would rather see this number under five percent.
  • Lack of Flexibility
    Working with source formatting geared to a particular output can generate unforeseen problems if your department suddenly needs to support a new output or help engine. I’ve seen scripts designed specifically to fix output problems in order to support a new presentation, only to require additional scripts to fix the new outputs as yet another presentation is added later. In time, the amount of spaghetti code supporting existing formats can itself become an impediment to accepting a new output. On the other hand, changing sources and training writers every time a new output is required is expensive as well.
  • Increased Localization Costs
    Every manual fix to formatting must also be included for every localized language. Localization can be delayed for weeks or even months with the additional overhead of fine-tuning formatting to match the English originals. Considering that upwards of eighty percent of the product localization budget is due to content, the extra expense adds up quickly and can easily exceed the cost of an entire documentation department given enough languages.

General Principles for Change

The following general principles solved my legacy documentation problems.

  • Separate Content from Presentation
    Separating content from presentation is a fundamental principle of using XML for structured documentation. Once content is separated from presentation, the presentation can be automated and writers can focus on content. The expense of supporting a new output is minimal with the quick addition of a new output transform. Changing or adding a help engine does not require that writers change their formats or fix additional formatting problems.
  • Mechanize the Mechanical
    XML with transforms for outputs is inherently batch-oriented and suited to automation. If some action is mechanical in nature, it can likely be automated with XSLT and scripting languages instead of having writers perform the same work. There are many kinds of automation that can be added, from catching common writing errors to synchronizing link text.
  • Focus on Core Competency
    Once presentation is automated, writers can devote the recovered time to content. I have found that twenty to thirty percent of writers’ time in a complex legacy content system was taken up with grunt work that could be automated with XML. Recovering that time amounts to a substantial increase in content production.
  • Centralize Documentation Builds
    Many companies have multiple product groups, often from previous acquisitions, each with its own help build system. In some cases, multiple help builds produce essentially the same kind of output. For the sake of company consistency and ease of use for customers, it makes sense not only to centralize presentation for a common look and feel, but also to centralize the help build system itself so that all required outputs are produced from the same XML source.
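
The first two principles can be illustrated with a minimal sketch, assuming a home-grown topic format. The element names (topic, title, p, xref), the href attribute, and the two output functions are hypothetical and invented for this example; a production system would more likely use XSLT against a real schema. The same source is rendered to two outputs, and link text is never typed by the writer: it is filled in from the target topic's current title at build time, so it cannot drift out of sync.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic sources. Element and attribute names are invented
# for this sketch and do not come from DITA, DocBook, or any real schema.
TOPICS = {
    "install": "<topic id='install'><title>Installing the Server</title>"
               "<p>Run the installer, then see <xref href='config'/>.</p></topic>",
    "config": "<topic id='config'><title>Configuring the Server</title>"
              "<p>Edit the settings file.</p></topic>",
}


def load_titles(sources):
    """Map each topic id to its current title."""
    titles = {}
    for source in sources.values():
        root = ET.fromstring(source)
        titles[root.get("id")] = root.findtext("title")
    return titles


def paragraph_text(p, titles, link):
    """Flatten a <p>, resolving each <xref> to the target's current title."""
    parts = [p.text or ""]
    for child in p:
        if child.tag == "xref":
            target = child.get("href")
            parts.append(link(target, titles[target]))
        parts.append(child.tail or "")
    return "".join(parts)


def to_html(source, titles):
    """Render one topic as HTML (character escaping omitted for brevity)."""
    root = ET.fromstring(source)
    link = lambda target, title: f'<a href="{target}.html">{title}</a>'
    body = "".join(
        f"<p>{paragraph_text(p, titles, link)}</p>" for p in root.findall("p")
    )
    return f"<h1>{root.findtext('title')}</h1>{body}"


def to_text(source, titles):
    """Render the same topic as plain text for a second output."""
    root = ET.fromstring(source)
    link = lambda target, title: f'"{title}"'
    lines = [root.findtext("title").upper()]
    lines += [paragraph_text(p, titles, link) for p in root.findall("p")]
    return "\n".join(lines)


titles = load_titles(TOPICS)
html_out = to_html(TOPICS["install"], titles)
text_out = to_text(TOPICS["install"], titles)
print(html_out)
print(text_out)
```

Renaming the "config" topic changes the link text in every output on the next build, with no edits to the topics that link to it. Supporting a third output means adding one more rendering function rather than touching the sources.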

New Roles and Practices

In addition to these general principles, I have found a number of roles and practices that support a successful XML implementation.

  • Documentation Architect
    XML is inherently designed to evolve. Someone needs to play gatekeeper for evolving standards as implemented in XML schemas or DTDs. Identify a single point of contact representing documentation architecture as a whole. This responsibility could be a management function or a formal documentation architect role with management oversight.
  • Documentation Architecture Team
    Form a Documentation Architecture committee of senior technical writers from each product group or company division to field issues of consistency and promote suggestions. The designated company Documentation Architect should head up this team and a management representative (if separate from architecture) should attend. If there is a formal Documentation Architect role on a career ladder, make participation in the committee mandatory. This team should be involved with product engineer architects to ensure that the documentation system evolves along with product direction.
  • Editing Team
    Ensure that the editing team is completely up-to-date with all standards implemented in the schemas. For instance, if the design of the procedural topics includes a summary element with a single paragraph restriction, editors need to understand how the summary element may appear in different contexts outside of the topic: typically in an automated topic listing where multiple paragraphs would not make sense. Should writers try to write around the restriction, perhaps using additional paragraphs outside the summary element, editors will be able to effectively communicate policy and goals. To better align writers and editors, it’s best to involve senior writers with the editing team to garner peer buy-in for the many changes that structured documentation and XML bring to the documentation group.
  • Style Guide Committee
    Form a separate committee to review the low-level stylistic issues with the participation of the editing team. In general, it is a good idea to involve writers in the development of stylistic conventions and to help melt down the “us versus them” mentality that some writers have towards editors. If writers and editors work together to develop standards, writers are more likely to understand and accept direction. Many stylistic conventions, though not all, can be encoded in schemas and checked with automation scripts, so it is important to make expectations explicit. This team should not be confused with the architecture team, which oversees the overall design of product deliverables and documentation types.
  • Style Guide
    Update the Style Guide to represent stylistic conventions and ensure that writers have up-to-date training on all major conventions. One major difference with an XML-based structured system is that topics are explicitly validated. In other words, the design of the documents is encoded into the system. Writers need somewhere to review what the standards are. In a previous company, there were so many changes to stylistic conventions based on structured standards that we literally had to throw out the old style guide, which was based on legacy FrameMaker conventions, and start over from scratch.
  • Suggestion Backlog
    Track, prioritize, and plan XML infrastructure suggestions and requests from a central backlog. Once writers start understanding the flexibility and power of XML with automation, they often come up with many good ideas, not all of which can be implemented at the same time. On the other hand, some suggestions are really a throwback to the previous way of doing things and are not appropriate or optimal for XML. Ideas need to be evaluated, prioritized, and appropriately scheduled.
  • Management Review
    All documentation managers using the centralized XML system should be involved in its direction. This group, along with the documentation architect and a representative from the documentation services team, should review and approve changes on a regular basis depending on the number of writers and suggestions.
  • Documentation Engineers
    Identify who does the automation and schema work. In some groups, technically oriented writers maintain the help build system. In others, a product integration engineer does so. In an ideal world, and for a company of sufficient size, I have found it strategic to have dedicated Documentation Services engineers entrusted with the documentation builds. Because XML transformations and help builds are inherently centralized, you do not want to end up with various writers either making ad-hoc changes to the central build or copying out builds for their own ends. Also, documentation engineers quickly develop knowledge and expertise in documentation issues with programmatic solutions. This group, along with the documentation designer, consumes the approved suggestion backlog.
  • Presentation Designer
    Identify a formal presentation designer. Again, due to the centralized nature of XML transformations, a single point of contact is needed to define a common look and feel for topics and various presentations (typically PDF and help). Just as in traditional desktop publishing systems, an esthetic eye and an understanding of information architecture are essential. This person should own the look and feel that feeds off the schemas, though not necessarily make the required technical changes to schemas, templates, and XSL transforms. In a previous company, I had a document designer function as a “product owner” of presentation in an Agile development group. The engineers took in requirements from the designer, and the designer in turn had to sign off on the work for it to be considered complete.
  • Centralized Documentation Services
    In larger companies, help build functions can be spread throughout the divisions of the company, each product group doing its own help builds. I have found that up to fifty percent of an engineer’s time can be spent just dealing with help build issues. Does it really make sense to have multiple, unique build systems producing essentially the same kind of output? Centralizing the help builds from a common XML source can free numerous engineers in a larger company to do other, more development-oriented work. If you are lucky enough to have a dedicated Documentation Services group, I highly recommend an Agile approach to tracking requirements by quarter if you have numerous, overlapping schedules or by product if you have only one product.
  • Computer Science Interns
    Instead of having highly paid engineers or writers perform conversion grunt work on legacy documentation, hire summer interns or take advantage of ongoing intern programs throughout the year. Many computer science students have the requisite skills to do cleanup tasks and find automation solutions to common problems in legacy documentation as it is converted to XML. One or more full-time engineers along with a few interns can form a highly productive Documentation Services Engineering team that will free writers to concentrate on content.
  • Translation
    Work with the translation department to ensure that they can work effectively with XML deliveries. Instead of receiving chapters or books, they may now be receiving individual topics. I have found that to reduce unnecessary translation rework due to product changes, it is best to have writers indicate the readiness of topics for translation. If translation deliveries become too microscopic, triggering every time a change is made, costs will only go up due to content rework. Topics should be eighty percent or more complete prior to delivery.
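
Two of the practices above lend themselves to a build-time check: the single-paragraph restriction on the summary element and the writer-set readiness flag for translation. The following is a minimal sketch only. The element names (procedure, title, summary, p) and the status attribute with its "ready" value are hypothetical, and in a real system the paragraph restriction would more likely be enforced in the schema or DTD itself rather than in a script.

```python
import xml.etree.ElementTree as ET

# Hypothetical topics. The element names and the "status" attribute
# are assumptions made for this sketch, not part of any real schema.
TOPICS = [
    "<procedure id='backup' status='ready'><title>Backing Up</title>"
    "<summary><p>Copy the data directory to safe storage.</p></summary>"
    "</procedure>",
    "<procedure id='restore' status='draft'><title>Restoring</title>"
    "<summary><p>Stop the server.</p><p>Then copy the files back.</p></summary>"
    "</procedure>",
]


def check_summary(root):
    """Return a list of problems with the topic's summary element."""
    summary = root.find("summary")
    if summary is None:
        return [f"{root.get('id')}: missing summary"]
    count = len(summary.findall("p"))
    if count != 1:
        return [f"{root.get('id')}: summary has {count} paragraphs, expected 1"]
    return []


def build_report(sources):
    """Validate each topic, build the topic listing, and collect ready ids."""
    problems, listing, ready = [], [], []
    for source in sources:
        root = ET.fromstring(source)
        errors = check_summary(root)
        problems += errors
        if not errors:
            # The summary is reused outside the topic, in the automated listing.
            listing.append(f"{root.findtext('title')}: {root.findtext('summary/p')}")
            # Only topics the writer has marked ready go to translation.
            if root.get("status") == "ready":
                ready.append(root.get("id"))
    return problems, listing, ready


problems, listing, ready = build_report(TOPICS)
print(problems)
print(listing)
print(ready)
```

The listing entry shows why the restriction exists: a multi-paragraph summary would not fit a one-line topic listing. Keeping the readiness decision in an attribute the writer sets means translation deliveries are triggered by an explicit signal rather than by every change to a topic.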

Implementing XML and structured documentation may seem like a complex task, and it is. A successful implementation touches all aspects of the documentation team. It is less about a tool and more about a whole way of approaching documentation. The benefits of automation, speed, flexibility, and cost savings make the transition worth the effort. The larger the company is, the greater the benefit.

I’ve found the best strategy is to start small with a prototype or a single project that can measurably benefit from the approach. Establish a baseline of success and continue from there to cover other projects according to reasonable schedules. Once executives realize how much money is saved on the first project, it will not be difficult to garner buy-in for other projects. Whether you perform the work in house or hire consultants, it is important to take control of your documentation architecture and know exactly what problems you want to solve. With writers and engineers working together, sharing their respective expertise, a truly responsive documentation system is possible, one that will better serve customers in this age of increasing complexity.

About the Author


Troy Klukewich
Independent Consultant

Troy Klukewich is a senior manager, documentation architect, and technical writer with extensive experience in enterprise software, XML, and structured documentation. He has worked for Borland, PeopleSoft, and on a number of SAP implementation projects for PricewaterhouseCoopers.