Editing Strategies for Content Authored in DITA

CIDM

December 2005


Ronnie Seagren, IBM Canada Ltd.; David Steinmetz, IBM Corporation; Donna Sutarno, IBM Canada Ltd.

New technologies don’t always come with the right tools and clearly defined processes for ensuring quality. For an organization to reap the XML advantages of using Darwin Information Typing Architecture (DITA) technology, writers need guidance on task analysis, navigation, information architecture, retrievability, content reuse, and indexing. IBM editors are taking an active role in standards development and creating practical strategies for quality assurance to support these initiatives. This article examines the challenges and on-the-ground solutions for editors working with XML-based information development systems, such as DITA.

DITA and the Editing Role

Editors play a key role in ensuring the quality of your documentation. When your organization introduces large-scale process changes, as with new content management systems or documentation technology based on Darwin Information Typing Architecture (DITA), editors can keep your documentation quality on track.

During our rollout of DITA, we found that the editors increased consistency and improved the retrievability of information. The editor champions the end user of the information, while offering a broad familiarity with the product documentation and the documentation process, tools, and style standards.

For example, editors work with information architects to create your information structures. They guide writers in using new tools, such as DITA, to get the required results, and they keep user needs before the writers. They establish quality standards to ensure material is written as well as possible the first time, and they verify the results.

As our writing teams began to use DITA, a number of issues arose. The strategies presented in this article are based on our own experiences, so you may need to adapt them to your own DITA implementation.

Types of edits
At IBM we do several types of technical edits. Editing priorities depend on customer needs, writer experience, and available resources. Many edits involve more than one of these types.

Structural or developmental edits help writers define an information unit or determine an improvement strategy based on a task analysis and outline of the content.

In copy edits, editors review the information from the user’s perspective. They query issues and mark up the text for grammar, spelling, minimalism, consistency with our corporate style, ease of translation, formatting of output, and appropriateness of graphics and labels.

In legal or policy edits, they check trademarks and attributions, notices, logos, copyright statements, product names, licenses, and other issues.

Editors do user-interface edits to ensure that software interface elements conform to product standards and guidelines. They might flag the amount of content on wizard pages, the relevance of message content, or the consistency of product terminology.

In verification edits, editors check the ease and speed of finding specific information using online navigation aids, search, the index, links, and cross-references. They can also verify metadata, check for accessibility, and test the procedures.

This list of edits implies a lot of editorial resources. We are moving toward the industry standard of one editor for every seven to ten writers, but editors can’t get to everything. Peer reviews can extend the influence of standards.

Quality characteristics
A key element of an editing strategy is a clear image of what quality means for your information. IBM uses the nine quality characteristics defined in Developing Quality Technical Information: A Handbook for Writers and Editors. The first three characteristics focus on the ease with which users can apply the information: task orientation, accuracy, and completeness. The next three focus on ease of understanding the information the first time: clarity, concreteness, and style. The final three concern ease of finding relevant information quickly: organization, retrievability, and visual effectiveness. These nine aspects interlock like puzzle pieces to form useful information.

Editors collaborate on quality edits for key deliverables to rate quality, establish baselines, and help writers improve. They score the quality according to a defined procedure and conduct follow-up edits to assess improvement. There is interest in these quality edits across the company.

DITA Features and Editing Strategies

DITA is an XML-based, extensible, end-to-end architecture for authoring, producing, and delivering readable information. Information is authored in discrete, typed topics, which can be validated, organized, reused, programmatically linked, transformed, and delivered in multiple formats.

To better understand DITA, look at its foundational emphasis on information architecture, features for navigation and retrievability, broad content reuse capabilities, and two-stage approach to content tagging. The sections below briefly describe these DITA features and noteworthy issues or solutions.

Information architecture
DITA content is organized into topics, and topics are categorized by information type. Fundamental types are concept, task, reference, mixed, and generic. Other, more specialized topic types can be created as well.

A topic is defined as an independent unit of information that is meaningful when it is displayed alone. These topics provide the building blocks for sets of information provided in multiple output formats.
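As an illustration of a typed, stand-alone topic, a minimal task topic might look like the following sketch. The topic id, title, and step text are hypothetical, and the DOCTYPE assumes the OASIS DITA task DTD; your implementation may use a different public identifier or a specialized type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<!-- Hypothetical topic: the id, title, and step text are illustrative only. -->
<task id="installing_features">
  <title>Installing multiple features</title>
  <!-- The short description orients the reader and is reused in link previews. -->
  <shortdesc>You can install several features in one pass so that you run the
  installation wizard only once.</shortdesc>
  <taskbody>
    <prereq>Close any running instances of the product.</prereq>
    <steps>
      <step><cmd>Start the installation wizard.</cmd></step>
      <step><cmd>Select each feature that you want to install.</cmd></step>
      <step><cmd>Click <uicontrol>Finish</uicontrol>.</cmd></step>
    </steps>
    <result>The selected features are installed.</result>
  </taskbody>
</task>
```

Because the task type allows only task-oriented sections such as prerequisites, steps, and results, validation against the DTD helps keep conceptual or reference material out of the topic.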

To take full advantage of the DITA information architecture, each writer should create a task analysis and an initial organization of topics to match the needs of the user doing the tasks. The writer creates content to fill out the task-based structure.

Writers should perform these activities:

  • Cover all of the customer scenarios in the task analysis.
  • Select the appropriate topic type for the content.
  • Write content using topics of appropriate length.
  • Follow the style and content guidelines.
  • Focus on creating the content using DITA support for authoring the different information types.
  • Write titles and short descriptions that can be reused in links.

Early in your documentation effort, consider the information architecture. Depending on the team, an editor or an experienced writer filling the role of information architect can work with writers to review their task model and set up their initial set of topics, including the TOC structure.

Depending on your team’s experience with writing topic-based information and using information types, you might want to allow extra time for writers to learn and adapt. Also consider a developmental edit that looks at the amount of information in topics, whether the content is appropriate to the type, and whether the organization follows the task analysis results. You might need to make changes to your house style guide if you are putting more of your information online or on the Internet or an intranet.

Navigation and retrievability
In DITA, maps are used to organize groups of topics and set up types of collections. Maps can also be used to set up linking relationships among topics, including parent-child relationships and prerequisites. To set up your links to related information that is not part of the navigation hierarchy, use relationship tables. For example, you can point to concept or reference information that supports a given task topic, guiding the user through a complete explanation of why and how to do something, to accomplish his or her goal.
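A small map can make the difference between the navigation hierarchy and a relationship table concrete. The following is a sketch only: all file names are placeholders, and the DOCTYPE assumes the OASIS DITA map DTD.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Hypothetical map: all file names are placeholders. -->
<map title="Installation guide">
  <!-- Navigation hierarchy: nesting defines parent-child relationships. -->
  <topicref href="installing.dita" type="task">
    <topicref href="installing_features.dita" type="task"/>
    <topicref href="verifying_install.dita" type="task"/>
  </topicref>

  <!-- Relationship table: topics in the same row are linked to each other
       without becoming part of the navigation hierarchy. -->
  <reltable>
    <relheader>
      <relcolspec type="concept"/>
      <relcolspec type="task"/>
      <relcolspec type="reference"/>
    </relheader>
    <relrow>
      <relcell><topicref href="features_overview.dita"/></relcell>
      <relcell><topicref href="installing_features.dita"/></relcell>
      <relcell><topicref href="install_options.dita"/></relcell>
    </relrow>
  </reltable>
</map>
```

In this sketch, the task topic in the middle cell would receive generated links to the supporting concept and reference topics in the same row.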

The powerful linking options of DITA can ease the generation and maintenance of links, but we also found that this technology was often misused. In some topics, a large number of links were generated as though the writers were linking almost indiscriminately. This can result from using the “family” collection type, which links all sibling topics. In other words, if a writer sees that some topics are closely related and makes them a family in the map, the result can be many links that are of little value to users. Alternatively, the user can access these topics, if they are of interest, by using the navigation pane and closely related links to the current topic. Because writers cannot see the links that will be generated until a transform is performed, they must view the output early and pay careful attention to their links.
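To see how quickly sibling links multiply, consider a map fragment like the following sketch. The file names are hypothetical, and the exact set of generated links depends on your transform.

```xml
<!-- Hypothetical map fragment. With collection-type="family", each of the
     four child topics is linked to its three siblings, so the group carries
     twelve sibling links in addition to the parent and child links. -->
<topicref href="configuring.dita" collection-type="family">
  <topicref href="config_ports.dita"/>
  <topicref href="config_users.dita"/>
  <topicref href="config_logging.dita"/>
  <topicref href="config_backup.dita"/>
</topicref>

<!-- The same group without the attribute: no sibling links are requested,
     and users reach the other topics through the navigation pane. -->
<topicref href="configuring.dita">
  <topicref href="config_ports.dita"/>
  <topicref href="config_users.dita"/>
  <topicref href="config_logging.dita"/>
  <topicref href="config_backup.dita"/>
</topicref>
```

An editor who reviews the map source can spot the attribute directly, which is often faster than inferring it from an overloaded list of related links in the output.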

For proper use of DITA linking, writers and editors must start with a firm grasp of corporate linking guidelines and best practices and then improve their knowledge of DITA’s linking capabilities. They need to understand maps and relationship tables, roles (such as parent, child, sibling), information types (such as concept, task, reference), collection types, and hierarchies. Editors can ease this education process by clarifying how existing guidelines shape recommended DITA linking practices.

Related links play a key role in retrievability, but they are often considered late in the writing process or given less focus. Relationship tables are a logical continuation of task modeling and mapping. Once established, they ease link maintenance because the related links, including text, are automatically updated to reflect changes. In contrast, direct use of the related links tag creates a hard-coded link that is more difficult to maintain. Unfortunately, an editor who works with the output cannot tell whether the related link was generated or hard-coded. Consider clear guidelines for writers or include source review as part of your edit pass. We hope to develop tools to allow us to display link patterns visually.
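For a source review, it helps to know what a hard-coded link looks like. The following sketch shows one inside a topic; the ids, file names, and link text are hypothetical.

```xml
<task id="installing_features">
  <title>Installing multiple features</title>
  <taskbody>
    <steps>
      <step><cmd>Start the installation wizard.</cmd></step>
    </steps>
  </taskbody>
  <!-- Hard-coded link: the target and its text are maintained by hand in
       this topic. If the target topic is renamed or retitled, this markup
       must be updated manually. -->
  <related-links>
    <link href="features_overview.dita" type="concept">
      <linktext>Features overview</linktext>
    </link>
  </related-links>
</task>
```

In the output, a link of this kind can look the same as one generated from a relationship table, so searching the topic source for the related-links tag is one way to find links that need hand maintenance.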

Sometimes new tags cause so much confusion and frustration that they require their own guidelines; this was the case with short descriptions. Text in the short description tag should explain the purpose of a topic and orient the reader. Short descriptions are the first paragraph in any topic. This text also displays under the link in a parent topic and in the related links as a link preview.

When writers first used this tag, they repeated the title, used sentence fragments, and created text of wildly different lengths and styles. These issues became much more noticeable when the text was displayed in parent topics or link previews. Poor short descriptions result in inconsistency and poor retrievability.
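The contrast can be sketched as follows. Both versions are invented for illustration and are not taken from the IBM guidelines.

```xml
<!-- Weaker: repeats the title and is a sentence fragment. -->
<title>Installing multiple features</title>
<shortdesc>Installing multiple features.</shortdesc>

<!-- Stronger: a complete sentence that states the purpose of the topic
     and still reads well when shown under a link in a parent topic. -->
<title>Installing multiple features</title>
<shortdesc>You can install several features in one pass so that you run the
installation wizard only once.</shortdesc>
```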

A workgroup was formed early to create guidelines for the content, length, and structure of short descriptions, including samples. However, writers continued to have trouble implementing this tag consistently, perhaps because of its special use. Because editorial comments in individual topics did not work well, refresher classes on this subject might have done more to improve consistency.

Title customization is another feature that can be an aid for retrievability. In addition to the standard topic title, DITA provides navigation and search title tags. With these alternatives, topic titles displayed in the navigation pane can be customized to be more usable and titles returned from a search can be revised to provide additional context. For example, a title appearing in a search result can include more detail to help distinguish it from other topic titles.

These title tags can be implemented in the map or the topic. We recommend that all tags be contained in the topic to ease comparison and because DITA uses the titles set in the topic as the default. Although custom titles are a valuable aid to retrievability, writers rarely added them and most editors did not consider their use. Some teams did perform title “scrubs” shortly before content freeze to find and implement the most obvious changes.
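In the topic, the alternative titles sit together directly after the main title, which is what makes them easy to compare. A sketch, with hypothetical titles:

```xml
<task id="installing_features">
  <title>Installing multiple features</title>
  <titlealts>
    <!-- Shorter title for the navigation pane, where the parent topic
         already supplies the "Installing" context. -->
    <navtitle>Multiple features</navtitle>
    <!-- Longer title for search results, with added detail to distinguish
         this topic from similarly named topics. -->
    <searchtitle>Installing multiple features with the installation wizard</searchtitle>
  </titlealts>
  <taskbody>
    <steps>
      <step><cmd>Start the installation wizard.</cmd></step>
    </steps>
  </taskbody>
</task>
```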

A final aspect of retrievability is indexing. Initial decisions include the number of subentries you will allow, the output formats you need, where you will store the entries, and how you will combine existing index entries with DITA.

Our first step was to establish new corporate guidelines aimed at consistent index entries. Because online indexes can be quite large, it is important that different writers index the same elements with similar wording.

Index entries are stored as either metadata or index entries, depending on your software. We decided to keep our entries in the topics to make it easier for writers to see all their entries for a topic. Our entries are at the end of paragraphs because we need to output selected smaller indexes with page numbers in Portable Document Format (PDF) and display the larger online indexes as links to topics. DITA repeats the same <indexterm> tag for the main entry and subentries, such as:

<indexterm>features<indexterm>installing multiple</indexterm></indexterm>

<indexterm>installation<indexterm>multiple features</indexterm></indexterm>
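The two storage locations, topic metadata and inline entries at the end of a paragraph, might look like this in a single topic. This is a sketch with hypothetical content; how each location is rendered (for example, whether it produces a page number in PDF) depends on your processing tools.

```xml
<concept id="features_overview">
  <title>Features overview</title>
  <prolog>
    <metadata>
      <keywords>
        <!-- Entry stored as topic metadata: it points to the topic as a whole. -->
        <indexterm>features<indexterm>overview</indexterm></indexterm>
      </keywords>
    </metadata>
  </prolog>
  <conbody>
    <!-- Entry stored at the end of a paragraph: it marks a specific location,
         which paginated output can use for a page reference. -->
    <p>You can install one feature or several at a time.<indexterm>features<indexterm>installing multiple</indexterm></indexterm></p>
  </conbody>
</concept>
```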

Because of our need for additional indexing tools, we are still in the process of establishing indexes.

Content reuse
A topic from one map can be reused by referring to the same topic in another map. It will then appear in the navigation pane in more than one location. You can also reuse a topic in the same map.

In addition, you can use the conref attribute, together with the id attribute, to reuse content from almost any content-related tag; for example, <p> (paragraph), <keyword>, <fig> (figure), and of course, <topic>, to reuse the entire topic.
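A conref value combines the file name, the id of the topic, and the id of the element to reuse. The following sketch shows a paragraph reused from another topic; the file name, ids, and text are hypothetical.

```xml
<!-- Source topic, stored in a hypothetical file named shared_notes.dita. -->
<concept id="shared_notes">
  <title>Shared notes</title>
  <conbody>
    <p id="backup_warning">Back up your data before you begin.</p>
  </conbody>
</concept>

<!-- In the reusing topic: the empty paragraph is replaced at build time by
     the content of the referenced one. The value takes the form
     file#topicid/elementid. -->
<p conref="shared_notes.dita#shared_notes/backup_warning"/>
```

The referencing element and the referenced element must be of the same type, here a paragraph pulling in a paragraph.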

In general, teams reused content less than they could have, so few issues emerged. Some writers reused their own topics, and some writers reused topics from within their team. For editorial review, reused topics were marked to show that they would appear elsewhere or had already been edited. In the future, when reuse is prevalent and teams share content, editors will need to learn to track content that is reused to ensure it has been edited and to check its appropriateness in context. New guidelines will be needed to sharpen our modular approach to information so that content can be reused without confusing the user.

Several teams used the conref attribute to reuse phrases. A common file included product names and other common strings. In this way, writers could reference the product name and the latest version of the name would appear in the generated documentation. However, an editor working only in the output cannot determine whether the name was entered as a variable or as hard-coded text and so cannot ensure that it is easily maintainable.
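A common strings file of this kind might look like the following sketch. The file name, ids, and product name are placeholders, not the ones our teams used.

```xml
<!-- Hypothetical common file, common_strings.dita. -->
<topic id="common_strings">
  <title>Common strings</title>
  <body>
    <p>
      <keyword id="prodname">Example Product</keyword>
      <keyword id="prodver">Version 2.0</keyword>
    </p>
  </body>
</topic>

<!-- In a content topic: the name is pulled in by reference, so a rename
     requires a change only in the common file. -->
<p>Before you install <keyword conref="common_strings.dita#common_strings/prodname"/>,
close all other applications.</p>

<!-- Hard-coded alternative: identical in the output, but every occurrence
     must be found and updated by hand. -->
<p>Before you install Example Product, close all other applications.</p>
```

Because both paragraphs produce the same output, only a review of the source, or a search of the source for the literal product name, reveals which form a writer used.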

Tagging
DITA, based on XML, separates form from content. Tags are applied to content according to the purpose and meaning of that content, and in line with the structure determined by the DITA document type definition (DTD). The final presentation of the information depends on the format targeted, including XHTML, PDF, or HTML Help.

Traditionally, writers focus on the source and editors pay attention to the output; however, both merit review. Once the text is finally generated and formatted, it may not appear as expected. For example, we had trouble with generating the bold word “Prerequisite” for the prerequisites section. If you produce PDF files, you should check to make sure your links are displaying as you intended.

For some help reviewing troublesome tag issues in the output, use a customized Cascading Style Sheet (CSS). For example, the <wintitle> element is used for window titles, but it does not format the text. By creating a rule for the class selector wintitle, you can highlight correctly tagged text and identify missing tags.
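The following XHTML sketch shows the idea with the rule embedded in a review copy of the output. It assumes that your transform writes the element name into the class attribute, as in class="wintitle"; check your own output, because the generated markup can differ. The window names and the highlight style are hypothetical.

```xml
<!-- Hypothetical review copy of XHTML output. -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Review copy</title>
    <style type="text/css">
      /* Review-only rule: highlight text that was tagged as a window title. */
      .wintitle { background-color: yellow; border: 1px dotted gray; }
    </style>
  </head>
  <body>
    <!-- "Preferences" was tagged and is highlighted. "Options" was typed as
         plain text, so it stays unhighlighted and stands out as a missing tag. -->
    <p>Open the <span class="wintitle">Preferences</span> window, and then
    open the Options window.</p>
  </body>
</html>
```

Remove the rule, or keep it in a separate review-only style sheet, so that the highlighting does not reach customers.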

Editors need to understand key tags so they can educate and guide their writers, develop guidelines where appropriate, and make sure that the documents are consistent, clear, and effective.

Editing the source and output
You can make your edits in DITA, on paper, or in some other electronic format. Editors need to work with their writers to determine the best approach for everyone. When you edit in the source, you can see the topic title alternatives writers have chosen. Writers may also be able to incorporate the edits faster and more efficiently. When you edit the output, you can identify things that can’t be seen in the source, such as generated links.

The best solution, and the way to be most effective for your writers, your company, and your users, is to look selectively at both the source and the output. From the output, you will see the text as the user does. From the source, you can find the root of output problems and make effective recommendations for improvement.

Summary

As you adopt DITA technology and revise your processes, you will find that an editor’s assistance with good standards and guidelines is essential.

Involve editors early in the adoption process and leverage their perspective, which considers the end-user viewpoint while maintaining a broad knowledge of the tools, standards, and information set.

Finally, support editors as they push for tool changes and uphold corporate style and content standards.
