Eric Severson, Flatirons Solutions, Inc.
Only three years after becoming an official standard, DITA has already taken over the world of technical publishing. Now, it’s being explored for e-learning, enterprise business documents, and many other applications. With this much momentum behind it, you’d think DITA—and the best practices for using it—would be well understood. In reality, that’s not always true.
In this series of articles, we’ll attempt to demystify this standard by exploring best practices for DITA from three specific points of view: that of the author, that of the content manager, and that of the globalization manager.
Thinking in Topics
For the author, the most important thing to understand about DITA is that it thinks in terms of topics, not books. Topics are smaller, more focused pieces of information, typically akin to a section or sub-section in a book, that might fit into a variety of books or publications.
While books are monolithic and relatively inflexible, DITA derives its power from creating a set of well-organized, ready-to-use information objects—and then allowing these to be mixed and matched in any way needed. Through a DITA map—which is akin to an outline or table of contents—actual publications are assembled from the appropriate set of underlying topics.
To make this idea work, there are a number of rules that must be followed by the author. Otherwise, the information won’t be well-organized, and assembled output may be choppy and confusing.
First, each topic should have a specific subject and a specific purpose. In DITA, purposes are organized into three standard categories: concept, task, and reference. For example, the subject might be “Memory Card” and the purpose might be to describe how it works. This would be structured as a concept topic. Other topics could each describe a task related to the Memory Card, such as inserting, removing, or troubleshooting. Finally, there might be a reference topic that contains a table of the Memory Card’s detailed specifications.
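As a rough sketch, the three topic types for the Memory Card example might look like the skeletons below. The id values, titles, wording, and the placeholder specification value are all invented for illustration, each topic would normally live in its own file, and DOCTYPE declarations are omitted for brevity.

```xml
<!-- Concept topic: describes how the Memory Card works -->
<concept id="memory_card_overview">
  <title>Memory Card</title>
  <conbody>
    <p>The memory card stores data so that it can be moved between devices.</p>
  </conbody>
</concept>

<!-- Task topic: one procedure related to the Memory Card -->
<task id="memory_card_insert">
  <title>Inserting the Memory Card</title>
  <taskbody>
    <steps>
      <step><cmd>Open the card slot cover.</cmd></step>
      <step><cmd>Slide the memory card into the slot until it clicks.</cmd></step>
    </steps>
  </taskbody>
</task>

<!-- Reference topic: look-up information such as specifications -->
<reference id="memory_card_specs">
  <title>Memory Card Specifications</title>
  <refbody>
    <properties>
      <property>
        <proptype>Capacity</proptype>
        <propvalue>placeholder value</propvalue>
      </property>
    </properties>
  </refbody>
</reference>
```

Note how each root element (concept, task, reference) announces the purpose of the topic, while the title carries its subject.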
Second, DITA topics should be written so that they are reusable and self-contained. This means that topics can’t have an assumed order or context, since you don’t know where they might be included in any particular publication. They shouldn’t say things like “as described in the previous section,” or even assume that a previous section has been read. Instead, any key prerequisite material should be included in a DITA related link, which provides a specific cross-reference.
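One way this might look in practice: instead of writing “as described in the previous section,” the task below points to its prerequisite concept through a related link. The file name, id values, and step wording are hypothetical, and the DOCTYPE declaration is omitted.

```xml
<task id="memory_card_remove">
  <title>Removing the Memory Card</title>
  <taskbody>
    <steps>
      <step><cmd>Open the card slot cover.</cmd></step>
      <step><cmd>Press the memory card once to release it, then pull it out.</cmd></step>
    </steps>
  </taskbody>
  <!-- Prerequisite reading is linked explicitly rather than assumed from reading order -->
  <related-links>
    <link href="memory_card_overview.dita" type="concept">
      <linktext>Memory Card</linktext>
    </link>
  </related-links>
</task>
```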
For maximum reusability, context-specific terms should be generalized. For example, an upgrade procedure and a replacement procedure for the Memory Card may be virtually identical, except for the use of the phrases “upgrade memory card” vs. “replacement memory card.” If these were generalized to simply “memory card,” the two topics could be collapsed into one and reused in both contexts.
Handling Variations in Otherwise Reusable Topics
Of course, things get even more complicated in the real world. As it turns out, the majority of opportunities for content reuse don’t involve exact, 100% redundancy. Instead, they occur where content is, or could be, almost the same—but not exactly.
For example, the following two sentences might have been written by separate authors to express a similar idea:
To start the server, click on the Launch Server selection in the File menu.
Go to the File menu and select Launch Server to start the server.
We could easily make a single reusable version of this, perhaps:
To start the server, go to the File menu and select Launch Server.
The following two sentences, however, could not be collapsed in the same way:
To start the server, click on the Launch Server selection in the File menu.
To start the server, enter “run serverutil -s” on the command line.
That’s because these sentences assume completely different platforms. But if most of the topic content is the same, and only the specifics of these commands differ, it’s still possible to collapse these into a single, reusable topic. In DITA, this is done by including both alternatives and marking each with a filter attribute (e.g. platform="windows" vs. platform="unix"). At publication time, a profile states which attribute values to keep (here, a value of the platform attribute), and the content is filtered accordingly.
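A minimal sketch of this approach, assuming the two server commands from the sentences above: both alternatives sit in one task, each marked with a platform value. The id, title, and file name are hypothetical, and DOCTYPE declarations are omitted.

```xml
<task id="start_server">
  <title>Starting the Server</title>
  <taskbody>
    <steps>
      <!-- Kept only when the Windows profile is published -->
      <step platform="windows">
        <cmd>Go to the <uicontrol>File</uicontrol> menu and select
          <uicontrol>Launch Server</uicontrol>.</cmd>
      </step>
      <!-- Kept only when the UNIX profile is published -->
      <step platform="unix">
        <cmd>Enter <userinput>run serverutil -s</userinput> on the command line.</cmd>
      </step>
    </steps>
  </taskbody>
</task>

<!-- windows.ditaval (hypothetical file name): a filter profile that drops UNIX-only content -->
<val>
  <prop att="platform" val="unix" action="exclude"/>
</val>
```

Publishing with this profile would leave only the Windows step in the output. A second profile that excludes platform="windows" would produce the UNIX version from the same source topic.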
Reusing Content in the Middle of a Topic
In general, content reuse is at the topic level. Sometimes, though, it’s useful to reuse a piece of content that occurs inside a topic, and DITA provides a mechanism for this called conref (content reference).
Think of a reference library consisting of hundreds of error messages. Rather than having a separate topic for each and every message, the library organizes messages by category into DITA reference topics. For example, an Import reference topic may contain 20 error messages, all related to importing data.
In your document, you want to reuse the standard text for a particular import message—not all 20 error messages. Conref was designed exactly for this situation—you just use a conref that refers to both the topic and the ID of the specific error message you want.
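To illustrate, the sketch below shows a small piece of such a library and a paragraph that pulls in one message from it. The file name import_errors.dita, the id values, and the message text are invented for this example. The conref value names the file, then the topic id, then the id of the element inside that topic.

```xml
<!-- import_errors.dita (hypothetical): the reference library topic -->
<reference id="import_errors">
  <title>Import Error Messages</title>
  <refbody>
    <section>
      <p id="err_file_missing">The import file could not be found.</p>
      <p id="err_bad_format">The import file is not in a supported format.</p>
    </section>
  </refbody>
</reference>

<!-- In your own topic: reuse just one of the messages -->
<p conref="import_errors.dita#import_errors/err_file_missing"/>
```

The referencing element is left empty and is of the same type as its target (a p pointing to a p). Its content is supplied from the library when the document is processed.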
This technique works well for anything that can be thought of as a reference library or list, including such things as glossary definitions, part specifications, API definitions, and so forth. DITA itself, however, doesn’t restrict what a conref may point to, and unrestricted use can be dangerous.
As a best practice, we recommend that conref be used only for content that has been specifically targeted as reusable. That is, where there is an agreement that the object to be included by conref is meant to be—and will continue to be maintained as—reusable.
By default, a well-written topic must be reusable and self-contained. There is no such restriction, however, on content inside the topic. You might want to conref the second paragraph out of a five-paragraph essay, but the essay could easily be restructured later into three completely different paragraphs. Then, when your document picked up the latest version of the second paragraph, it would contain text you never intended to include.
DITA allows topics to contain nested sub-topics, either embedded directly or included via conref. Here too, though, we need to be careful. By calling something a sub-topic, you are implying that it is also a topic, and therefore reusable and self-contained. For this reason, it’s generally not good practice to actually embed topics within each other. Instead, use conref to bring them in from a separate source.
Using conref, however, has another more subtle issue. By doing things this way, you are implying that every time the parent topic is used, all the sub-topics must also be included. This may not be true in all contexts.
DITA provides an even more flexible way to include sub-topics, which we recommend as best practice in most cases: simply nest the sub-topics within the DITA map. The effect is the same as if you had used conref to nest them directly in the parent topic, but in this case the nesting relationship applies only to that map.
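As a hedged sketch of map-level nesting, the first map below places two task topics under the Memory Card concept, while the second map uses the same concept topic with no sub-topics at all. The map file names, titles, and topic file names are hypothetical, and DOCTYPE declarations are omitted.

```xml
<!-- user_guide.ditamap (hypothetical): tasks are nested under the concept -->
<map>
  <title>User Guide</title>
  <topicref href="memory_card_overview.dita">
    <topicref href="memory_card_insert.dita"/>
    <topicref href="memory_card_remove.dita"/>
  </topicref>
</map>

<!-- product_overview.ditamap (hypothetical): same concept, no sub-topics -->
<map>
  <title>Product Overview</title>
  <topicref href="memory_card_overview.dita"/>
</map>
```

Because the hierarchy lives in the maps, memory_card_overview.dita itself stays untouched and can appear with or without its sub-topics, depending on the publication.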
Finally, one of the classic problems with topic-based authoring is ensuring that cross-referencing links don’t get broken. That’s because when topics are mixed and matched across different output publications, there’s no guarantee that the referenced topic is included.
DITA solves this problem by providing the related links element, which we recommend using in preference to cross-references embedded directly within text. This lets you isolate linking to a well-defined area that can easily be checked. DITA also allows related links to be re-defined in the DITA map (through something called a relationship table), so that the link target can be appropriately set for the publication/context described by that map.
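A relationship table in a map might be sketched as follows. Topics placed in different cells of the same row are related to one another, so the links exist only for publications built from this map. The title and file names are hypothetical, and the DOCTYPE declaration is omitted.

```xml
<map>
  <title>User Guide</title>
  <topicref href="memory_card_overview.dita"/>
  <topicref href="memory_card_insert.dita"/>
  <topicref href="memory_card_remove.dita"/>

  <!-- Related links for this publication are defined here, not in the topics -->
  <reltable>
    <relheader>
      <relcolspec type="concept"/>
      <relcolspec type="task"/>
    </relheader>
    <relrow>
      <relcell>
        <topicref href="memory_card_overview.dita"/>
      </relcell>
      <relcell>
        <topicref href="memory_card_insert.dita"/>
        <topicref href="memory_card_remove.dita"/>
      </relcell>
    </relrow>
  </reltable>
</map>
```

Since every link target is listed in one place alongside the topics the map includes, it is straightforward to check that no link points outside the publication.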
In the next article of this series, we’ll look at things from the content manager’s point of view—including a look at topic-based review and approval, DITA specializations, and legacy content migration.
Flatirons Solutions specializes in content management and XML-based publishing and is located in Boulder, Colorado. Eric is a co-founder and Chief Technology Officer.