Book Review: Topic Maps 101
Written for the intrepid information architect or anyone with an interest in XML, XML Topic Maps: Creating and Using Topic Maps for the Web attempts to demystify yet another XML specification: topic maps. The book is edited by Jack Park, and each chapter is written by a different leader in the topic map industry. Together, the chapters attempt to ease you through first the concept and then the practical applications of topic maps.
I suspect that the confusion around topic maps occurred for the same reason that SGML and XML seem so confusing. The markup (SGML/XML) experts learn so much that they begin using their own shorthand and their own language until, eventually, they are incomprehensible to the uninitiated.
The problem is illustrated brilliantly in a story that Steve Newcomb tells in chapter 3. Yuri Rubinsky decides to create a movie that will once and for all explain what SGML is and why it’s so important. The central plot of the movie revolves around a document that Earth sends into outer space. Aliens intercept it, assume that it’s toast, and try eating it. After the document is structured and marked up in SGML, Earth resends it, and this time the aliens can understand it because it includes information about the information (that is, this is a document, not toast).
Steve decided to use the movie to explain to a colleague what SGML is. After watching the movie, the colleague said, in a very serious manner, “Oh, I understand now! You’re trying to communicate with aliens!”
Obviously, markup isn’t powerful because we can use it to talk to aliens, but unfortunately it seems that the very people tasked with explaining the concepts of markup have become aliens and are incomprehensible to those who don’t understand it. Hopefully, in plain English, this article can shed light on some of the key concepts of topic maps and show how they might apply to technical publications.
What Is a Topic Map?
First of all, a topic map is a specification. It is a standard way of representing information about your content and works a lot like a template (or an XML DTD). The specification states what information is needed and how it should be represented in XML to be considered a topic map.
Additionally, it is a way of representing relationships between topics.
For example, we could pick a topic and call it New York. What is New York? You know what New York is through its relationship to another topic, City. Now you have two topics, New York and City, that are connected through the relationship “is an instance of.” See Figure 1.
Figure 1. Example of a Topic Map Association
The topics and relationships are then represented through XML topic maps (XTM) and might look like the ones shown in Figure 2 (Park, page 28):
Figure 2. Example of XTM Topic and Relationships
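For readers who do want to see some markup, here is a minimal sketch of how the New York and City topics could be written out as XTM. It is not a reproduction of Figure 2 from the book. It assumes the XTM 1.0 element names (topicMap, topic, instanceOf, topicRef, baseName, baseNameString), and the topic ids "city" and "new-york" and the helper names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XTM 1.0 documents.
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Serialize XTM as the default namespace and XLink with the "xlink" prefix.
ET.register_namespace("", XTM_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Return the namespace-qualified name of an XTM element."""
    return f"{{{XTM_NS}}}{tag}"


def add_topic(topic_map, topic_id, name, instance_of=None):
    """Add a topic with one base name; optionally mark it as an instance of another topic."""
    topic = ET.SubElement(topic_map, q("topic"), {"id": topic_id})
    if instance_of is not None:
        type_el = ET.SubElement(topic, q("instanceOf"))
        ET.SubElement(type_el, q("topicRef"), {f"{{{XLINK_NS}}}href": f"#{instance_of}"})
    base_name = ET.SubElement(topic, q("baseName"))
    ET.SubElement(base_name, q("baseNameString")).text = name
    return topic


def build_topic_map():
    """Build a two-topic map: New York "is an instance of" City."""
    topic_map = ET.Element(q("topicMap"))
    add_topic(topic_map, "city", "City")  # hypothetical id
    add_topic(topic_map, "new-york", "New York", instance_of="city")  # hypothetical id
    return topic_map


xtm_text = ET.tostring(build_topic_map(), encoding="unicode")
print(xtm_text)
```

The point to notice is that the “is an instance of” relationship lives in the topic map itself, as a reference from one topic to another, rather than inside any document about New York.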
Without getting into too much XML or XTM markup (and risking being labeled an “alien”), I want you to understand that you can create topics and relationships between the topics, and that information about the content is maintained outside of the document in an XML topic map.
The example given above is a fairly simplistic one, but topic maps can be very complex and may overlap with topic maps from other industries, which brings us to…
That’s All Fine and Good, but What Do You Do with It?!
Many in the topic map community see topic maps as part of the fulfillment of Tim Berners-Lee’s vision of the Semantic Web. The Semantic Web allows software applications to exchange structured data through the Web, in addition to allowing humans to read content.
Using Berners-Lee’s own example of what the Semantic Web may look like: After an appointment with her doctor, Lucy may find that she needs to make a series of appointments with a physical therapist. Using a handheld device, Lucy’s Semantic Web Agent contacts the doctor’s Semantic Web Agent and downloads the prescribed treatment. She then states that she wants a physical therapist within a certain distance of her home and the Semantic Web Agent finds a list of physical therapists within that radius, accesses a therapist’s calendar, compares it to Lucy’s calendar, and automatically schedules the appropriate number of sessions (Scientific American 2001).
Without consistent structure to the data, relationships among data, and the ability to translate between the doctor’s data and Lucy’s data, it would be impossible to fulfill the dream of the Semantic Web. Where XML can structure data, topic maps allow you to create relationships between the data and translate between two different data models.
The ability to create relationships is powerful, but how does it help a technical publications department?
An area where technical publications could take advantage of topic maps is in creating and linking to “Related Topics.” I attended an interesting session at the Extreme Markup conference last year by Nikita Ogievetsky and Roger Sperberg that illustrated how topic maps can be used to create related topics dynamically.
As explained in the previous article in this issue, Aspen BookBuilder is a New York book publisher that creates AnswerBooks for the legal industry. The books consist of questions and answers marked up in XML and stored in a database. Using the existing tables of contents, references to IRS publications, references to other books, and the indexes (which included “See Also” references) from a variety of AnswerBooks, they were able to build the associations for existing topics in a topic map.
The demo showed that, based on those associations (associations that already existed in the documentation!), they were able to dynamically create a list of related topics, if they existed, for any question and answer in the book.
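To make the idea concrete, here is a rough sketch of how “See Also” references could be turned into associations and then queried for related topics. It is not the presenters’ implementation, and it uses plain Python data structures rather than a topic map engine. The topic names and the function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical "See Also" pairs, as they might be harvested from a book index.
SEE_ALSO = [
    ("Roth IRA", "Traditional IRA"),
    ("Roth IRA", "Rollovers"),
    ("Traditional IRA", "Required distributions"),
]


def build_associations(pairs):
    """Treat each "See Also" pair as a two-way association between topics."""
    related = defaultdict(set)
    for first, second in pairs:
        related[first].add(second)
        related[second].add(first)
    return related


def related_topics(associations, topic):
    """Return the related topics for one topic, or an empty list if there are none."""
    return sorted(associations.get(topic, set()))


associations = build_associations(SEE_ALSO)
print(related_topics(associations, "Roth IRA"))
```

Because the list is computed from the associations each time it is requested, adding a new “See Also” reference to one book would change the related topics shown everywhere that topic appears.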
Creating a Table of Contents
One of the difficulties of creating modular XML content is how you bring it together into a deliverable. You may have a series of procedures, concepts, and reference items, but how do you bring them together into a user’s guide?
Traditionally, you use a manual build process. You create a “book” by inserting the modules in the appropriate sequence using a reference or a link to the appropriate module.
But if a writer has a content plan, which means he or she knows what modules need to be written, and if a topic map has been created to relate certain topics, the manual build process could be bypassed. The “build” is handled by the topic map and a software application that understands topic maps and the required build process.
Let’s look at the example shown in Figure 3. It shows a list of topics for a handheld device to be included in a user’s guide.
“Getting Started” and “Using Software” are topics, and both are instances of a “chapter.”
What’s in the box
Setting up the device
Creating a document
Creating a spreadsheet
Figure 3. List of Topics
“What’s in the box” and “Setting up the device” are topics, and both are instances of a “section,” and, in this example, are referenceable XML modules in your repository.
“Chapter” is a superclass to (or parent of) “section.” And “Getting Started,” which is an instance of “chapter,” is a superclass to (or parent of) “What’s in the box” and “Setting up the device,” which are instances of “section.”
And so on… (There’s a great tutorial in chapter 6 if you’re interested in learning more about how associations are created.)
The topics and their associations can be very complex, even in this simplistic example. What comes out of these topics and associations is the context (or table of contents) for a book (or any other deliverable).
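As a rough sketch of what such a “build” might look like, the following code derives a table of contents from topic types and parent-child associations. It is a simplification in plain Python, not a real topic map application. The data structures and function name are invented for illustration, and placing “Creating a document” and “Creating a spreadsheet” under “Using Software” is my assumption about how the example continues.

```python
# Each topic and the type it is an instance of.
TOPIC_TYPES = {
    "Getting Started": "chapter",
    "Using Software": "chapter",
    "What's in the box": "section",
    "Setting up the device": "section",
    "Creating a document": "section",
    "Creating a spreadsheet": "section",
}

# (parent, child) associations in reading order.
# The "Using Software" pairs are an assumption, not stated in the example.
CONTAINS = [
    ("Getting Started", "What's in the box"),
    ("Getting Started", "Setting up the device"),
    ("Using Software", "Creating a document"),
    ("Using Software", "Creating a spreadsheet"),
]


def build_toc(topic_types, contains):
    """Return the table of contents as a list of lines, with sections indented under chapters."""
    children = {}
    for parent, child in contains:
        children.setdefault(parent, []).append(child)

    lines = []
    for topic, topic_type in topic_types.items():
        if topic_type != "chapter":
            continue  # sections are emitted under their parent chapter
        lines.append(topic)
        for child in children.get(topic, []):
            lines.append("    " + child)
    return lines


toc = build_toc(TOPIC_TYPES, CONTAINS)
print("\n".join(toc))
```

Nothing in the sketch lists the book’s contents in order by hand. The sequence falls out of the topics and their associations, which is the part of the manual build process that a topic map could take over.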
Although associations and book building are among the promises of topic maps, the tools (software applications) needed for this processing, especially for print books, are rare, if they exist at all. The tools that support topic maps aren’t as advanced as we would like, but they show great promise as the industry evolves.
So What Should You Do?
Although the possibilities for applying topic maps to technical publications are infinite (I haven’t touched on many of them, for simplicity’s sake), it may be too soon to do anything. The industry is still immature, and the tools needed to support many of the innovative ideas don’t exist yet.
About the Author
Comtech Services, Inc.
Tina Hedlund is a Senior Consultant with Comtech Services, Inc. Tina has worked with companies in North America and Europe, helping them solve their information-management and content-management problems. She has also been heavily involved in many benchmarking projects related to information management and process improvement. Tina has focused on the technical aspects of content management and single-sourcing processes, in addition to identifying trends and best practices in the industry through a series of benchmark studies.