Playing Hide and Seek with Your Topics



August 2008

France Baril, Architextus, Inc.

This is the story of the vanishing topics and of those that no one knew about.

Moving to modular content with DITA often means saving time and money with better content reuse and reorganization. It may mean better usability through consistency in form and content. It may mean increased collaboration within a team, enhanced work processes, and faster time to market. The word “may” in front of all these promises is key. Whether these promises are realized depends on each business context and also on the way DITA is used and implemented.

Moving to DITA often means moving from an environment where each deliverable is a book made of one or a few files owned by a single author to an environment with thousands of topic files that can be reused by many authors for many deliverables. Dealing with thousands of topics has its own unique challenges and impacts the choice of tools because it affects business processes at their core.

The story of the vanishing topics illustrates one of the challenges of moving from book to topic management. It is also an occasion to restate the importance of identifying new needs and defining new processes before selecting tools and moving on to real-life implementation.

As XML, DITA, and the systems that support them evolve, practices that used to work may no longer be worth carrying around.

Hiding! Naming Conventions and Folder Structures

It’s easier to hide a small ant in a huge colony than an elephant in a small herd.

There was a time when efficient storage meant a good folder structure either organized by subject area or by department with a standard file naming convention. In those days, each person was responsible for only a few books and cared little about what others were writing. In a world where reuse is key to efficiency and where each writer owns pieces of content instead of books, being able to find existing topics becomes very important.

However, with thousands of small topics, as a writer, you face new challenges:

  • How do you know if the topic that you are about to create already exists?
  • Even if you know that the topic exists, how do you find it?
  • Is it possible to create a hierarchy in which each topic is stored in a single location and in which you always know enough about the topic you are looking for to work out where it is stored, if it exists?

Moreover, in the world of agile-like methodologies, just-in-time delivery, and product development, nothing is less certain than what has been agreed upon today.

  • Product and feature names keep evolving.
  • The structure of organizations changes with mergers, downsizing, and rapid growth.
  • In a global working environment where original content may be created in other languages and then translated to English, not everyone is able to follow the English-based naming conventions.

Is it possible to store topics and have an efficient way of retrieving them as topics multiply and as structures, whether organizational structures or product structures, evolve?

Seeking! Search Engines for XML

Some tracking systems are better than others at finding a hidden ant in a colony.

Few people find very specific information on the web by navigating a tree structure; they rely on Google and other search engines instead. Why, then, are documentation departments still working with hierarchical file systems when it comes to finding and retrieving their own content?

Using tags, XML is able to identify every piece of information stored in the topic, such as its author or the product for which it was first created. Consequently, writers can start to organize content into multiple categories instead of following a single organizational tree. For example, a topic that describes a procedure for an airplane’s electrical component could be stored under troubleshooting if it is used when specific problems occur as well as under monthly maintenance if it should also be performed regularly, and even under North America if it is required by law in specific regions. Most of this information is referred to as metadata, simply defined as information about information.
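To make the idea of multiple categories concrete, here is a minimal Python sketch. It assumes a simplified, DITA-like task topic whose prolog carries an author and several category entries; the topic id, title, author value, and the helper names read_metadata and build_category_index are invented for illustration and do not come from any particular CMS.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified, DITA-like task topic. The id, title, and author values are
# invented for illustration; a real topic would carry a DOCTYPE and a body.
TOPIC_XML = """
<task id="inspect_relay">
  <title>Inspecting the power relay</title>
  <prolog>
    <author>F. Baril</author>
    <metadata>
      <category>troubleshooting</category>
      <category>monthly maintenance</category>
      <category>North America</category>
    </metadata>
  </prolog>
</task>
"""


def read_metadata(xml_text):
    """Return the topic id, title, author, and list of categories."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "author": root.findtext("prolog/author"),
        "categories": [c.text for c in root.iterfind("prolog/metadata/category")],
    }


def build_category_index(topics):
    """Map each category to the ids of all topics tagged with it."""
    index = defaultdict(list)
    for topic in topics:
        for category in topic["categories"]:
            index[category].append(topic["id"])
    return dict(index)


topic = read_metadata(TOPIC_XML)
index = build_category_index([topic])
for category, ids in index.items():
    print(f"{category}: {ids}")
```

The topic is stored once, yet the index lists it under all three categories, which is what a single folder tree cannot do.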

Once properly tagged and enhanced with metadata, topics become easier to find even as the context of use differs. Content creators can navigate virtual folder hierarchies based on the needs of the moment. More importantly, the metadata can be passed along to end-users.

What once needed to be classified in mutually exclusive categories can now appear wherever you search for it.

You must still be careful when creating a classification system and take into account the overhead of adding all this metadata. However, metadata does make the system much more flexible, and some of it, such as the date of the last modification and the name of the author, can be managed automatically by most content management systems.

Increasingly, finding content, even during the creation process, means looking for information based on multiple criteria and ordering results based on specific needs.

Virtual Folder Example—Managing Priorities

This example illustrates how virtual folders based on data and metadata outperform a unique hierarchical organization when it comes to managing a content creator’s workload.

Let’s imagine that I am a writer and that I want to work on some topics. I work with a CMS that provides a flexible search engine based on both data and metadata. Some topics have been prioritized based on their importance for the upcoming deliverables.

First, I am going to search for all content assigned to me and order it by priority:

[Figures: search results listed under Priority 1, Priority 2, and Priority 3]

After trying to edit two of the Priority 1 topics, I realize that I cannot work on them because they are related to a product to which I don’t have access at this point.

I decide to order content based on priority and on products, so that it becomes easier to identify what I can work on today:

[Figures: Priority 1 results listed under Product A and Product B]

Key elements:

  • Content retrieval is faster and adapts to the context of use.
  • The content creator is not required to learn the logic of a complex tree structure to retrieve and reuse content.
  • Search can be based on system metadata and/or workflow data even as they evolve.
  • Content may appear in many locations simultaneously, if relevant.

In this example, because the topic “Sorting YZ” is used in both product A and B, it appears twice as a Priority 1 in the second sort example.
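The two searches in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Topic record, the functions assigned_to and virtual_folders, and every topic other than “Sorting YZ” are hypothetical, and a real CMS would run such queries against its own index.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    title: str
    assignee: str
    priority: int
    products: tuple  # a topic may be reused in several products


# Hypothetical records; only "Sorting YZ" and the product names
# come from the article.
TOPICS = [
    Topic("Sorting YZ", "Baril", 1, ("Product A", "Product B")),
    Topic("Installing X", "Baril", 2, ("Product A",)),
    Topic("Configuring W", "Baril", 1, ("Product B",)),
    Topic("Exporting V", "Smith", 1, ("Product A",)),
    Topic("Glossary", "Baril", 3, ("Product B",)),
]


def assigned_to(topics, assignee):
    """First search: one writer's topics, ordered by priority, then title."""
    mine = [t for t in topics if t.assignee == assignee]
    return sorted(mine, key=lambda t: (t.priority, t.title))


def virtual_folders(topics):
    """Second search: group by priority, then by product.

    A topic used in several products is listed under each of them.
    """
    folders = defaultdict(lambda: defaultdict(list))
    for t in topics:
        for product in t.products:
            folders[t.priority][product].append(t.title)
    return {
        priority: dict(sorted(by_product.items()))
        for priority, by_product in sorted(folders.items())
    }


mine = assigned_to(TOPICS, "Baril")
folders = virtual_folders(mine)
for priority, by_product in folders.items():
    print(f"Priority {priority}")
    for product, titles in by_product.items():
        print(f"  {product}: {', '.join(titles)}")
```

Neither view moves or copies a file. Each is computed from the metadata on demand, so a different grouping only requires a different query.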

DITA and Today’s World

As teams move to DITA to reap the benefits of modular writing and reuse, new challenges arise and new tools become available. It becomes important to question the old ways of doing things. Some practices that used to work very well may become obsolete in this new world. Documentation teams need to stop managing as if they were still working with huge elephant books and learn to work efficiently in the new topic-based ant colony.

France Baril

Architextus, Inc.

France Baril, owner of Architextus Inc., is a DITA/XML consultant as well as a documentation architect who helps organizations analyze their content and processes, select tools, learn about DITA and/or XML, manage the change process, and develop supporting material (from DTDs or schemas to XSL transformations).

She has a unique background with a B.A. in Communication from the University of Ottawa and a B.Sc. in Computer Science from Université de Sherbrooke. She worked as a Documentation Architect for 5 years at IXIASOFT, where she served as Product Manager for their DITA CMS Framework. Before that, France worked as a multimedia developer, a trainer, and a technical documentation specialist.