PG Bartlett, PTC

Do you remember how you first learned about DITA? Do you recall when the realization dawned that this was something really important?

That happened for me about four years ago when Dave Schell, an IBM executive who championed the development of DITA, mentioned to me, in a way that now seems almost off-hand, that I should go out to their developer site and take a look.

As I read through the site and learned about the “specialization” feature of DITA, it occurred to me that DITA answers a problem that had long bothered me about SGML and then XML: the “brittleness” of the data model (i.e., the DTD or schema).

A “brittle” data model means that if you change it, you have to change all of your downstream applications to accommodate that change. With the world constantly changing and business needs continuously evolving, how could XML-based publishing ever become mainstream if it couldn’t adapt easily to match the pace of change? DITA’s specialization feature delivers the solution in a deceptively simple manner.
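To make the idea concrete, here is a minimal sketch of the fallback behavior that specialization allows. It assumes the DITA convention that every element carries a class attribute listing its ancestry from most general to most specific. The "legal/amendments" type, the function name and the set of known types are hypothetical and serve only as illustration; this is not how any particular product implements it.

```python
def resolve_known_type(class_attr, known_types):
    """Return the most specific type in a DITA class attribute that the
    application knows how to process, or None if it knows none of them."""
    # Keep only "module/element" tokens; the leading "-" or "+" is a marker.
    ancestry = [token for token in class_attr.split() if "/" in token]
    # Ancestry runs from most general to most specific, so search backwards.
    for token in reversed(ancestry):
        if token in known_types:
            return token
    return None


# An application written before the hypothetical "legal" specialization existed.
KNOWN_TYPES = {"topic/p", "topic/ul", "topic/li"}

# The specialized element is still processed, as a plain paragraph.
print(resolve_known_type("- topic/p legal/amendments ", KNOWN_TYPES))  # prints: topic/p
```

Because the downstream application falls back to the nearest ancestor it understands, adding a new specialized tag does not force a change to that application.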

Another problem that DITA addressed was information “granularity,” which refers to the extent to which you must break up your information into discrete components. There are many reasons for handling information separately, including variations in formatting, organization of the document, linking, authoring, reuse and assembly.

DITA solves the problem of granularity almost completely. For authoring, assembly and much reuse, DITA established “topic” as the primary unit. For formatting, linking and other purposes, DITA established a set of tags derived from HTML. DITA also permits reusing information at a finer level of granularity than topics, which offers flexibility where no prescriptive approach could solve every situation.

The last important aspect of DITA to fall into place for me was its approach to information architecture. By establishing “task,” “concept” and “reference” as the primary types of information objects (i.e., topics), DITA settled that question.

Despite the attractions of DITA, the lack of customer interest presented a huge barrier to convincing myself that Arbortext should invest significantly in supporting it. Even though IBM was and remains an important customer, how could we be sure that our investment would pay off?

So in September 2004, we started by adding the bare minimum to Arbortext Editor (known as Epic Editor at the time) to meet authoring requirements. We made specializations work properly, and we added just enough support for what we call “custom tables,” which simply means that Editor can display any selected tags as a table instead of being limited to specific sets of tags such as CALS and HTML.

I had started writing about DITA for our monthly newsletter in early 2004, figuring that by helping bring attention to DITA, we could generate customer interest, which would in turn justify additional investment. But more than a year passed before enthusiasm for DITA finally increased to the point where it was obvious. Although that year felt like an eternity, in retrospect it was impressively fast.

Now that we finally had a strong business case for investing in additional product support for DITA, we started to define the details of what that support would involve. Because of the breadth of our products, we needed to support the entire process of authoring, managing and publishing DITA information – a significantly bigger project than any other vendor had to take on.

But we wanted to go further. We could see that DITA would eventually become the predominant data model, at least for technical documentation and possibly for a much broader range of document types, so we wanted to add features well beyond the minimum. For the first time since we added extensive DocBook support in the late 1990s, we felt that we could justify adding features to the core product that specifically support a single data model, instead of always trying to meet the broadest set of needs with the lowest common denominator.

Briefly, the project goals included:

  • Simple UI for authoring topics – we wanted to make the task of authoring topics, which includes writing, inserting reusable components (including graphics) and linking, easy and intuitive.
  • Simple, powerful UI for authoring maps – the “map” is simply a list of information components that make up a finished publication. (You can think of a map as a bill of materials for a document.) We wanted to make the process of building a map, which includes adding existing topics, creating new topics, and building link tables (which DITA calls “relationship tables”), easy, intuitive and fast.
  • Master document support – from a master document, you can generate virtually limitless subsets of information so that you can tailor your information delivery precisely to the needs of each person who needs it. Arbortext has long provided a feature called “profiling” to meet this need, but we needed to hook it up to the DITA approach.
  • Publishing support – our software needed to assemble the finished publication according to the map, let the user define its styling for each media type, both print and electronic, and then produce the output in print, in electronic form, or both.
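The map and master-document goals above can be illustrated with a short sketch. It assumes a tiny DITA map whose topicref elements carry href and audience attributes, and it applies a deliberately simplified profiling rule: a topic is kept unless it names audiences and the requested audience is not among them. The file names, the audience values and the list_topics function are hypothetical, and the sketch does not represent how Arbortext implements profiling.

```python
import xml.etree.ElementTree as ET

# A hypothetical map: the "bill of materials" for one publication.
MAP_XML = """
<map>
  <topicref href="overview.dita" type="concept">
    <topicref href="install.dita" type="task" audience="admin"/>
    <topicref href="use.dita" type="task"/>
  </topicref>
  <topicref href="commands.dita" type="reference" audience="admin expert"/>
</map>
"""


def list_topics(node, audience=None, depth=0):
    """Yield (depth, href) for each topicref in map order.

    When an audience is given, a topicref that names audiences but not the
    requested one is skipped together with its children (simplified rule).
    """
    for ref in node.findall("topicref"):
        values = ref.get("audience", "").split()
        if audience is not None and values and audience not in values:
            continue
        yield depth, ref.get("href")
        yield from list_topics(ref, audience, depth + 1)


root = ET.fromstring(MAP_XML)

# Full publication, then a subset tailored to a "novice" reader.
for depth, href in list_topics(root):
    print("  " * depth + href)
print([href for _, href in list_topics(root, audience="novice")])
```

The same master map yields a different publication for each audience, which is the essence of generating subsets from a single source.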

With these goals, we set up a cross-functional team to flesh out the requirements. We also set up a customer advisory board to help us prioritize requirements and review our designs, and their feedback proved invaluable as the project proceeded. In particular, they helped us avoid over-complicating the solution while retaining all the flexibility that DITA offers.

As we came to grips with the implementation of our vision, we realized that deep support for DITA would be far more complex for our developers, and therefore much more work, than I had originally anticipated. We were in the gratifying position of being able to apply more resources, but the entire project still took more than a year.

Because simplicity was one of the project goals, we invested a lot of time in the design of the user interface. In the first internal releases, we built in lots of variations of the same controls so that we could try them out and compare them.

One of the innovations that came out of this work was the horizontal cursor (enlarged and pictured below). This cursor shows the current insertion point in the map, which makes it easy to see and control where an inserted topic goes in the hierarchy: as a peer, subordinate, or superior. In the example pictured, if you inserted a topic at the current insertion point, it would be subordinate to the topic above it. To insert a topic as a peer, you would first use the left arrow to move the cursor to position it on the vertical line.

You may think the size, shape, color and operation of the cursor are obvious, but we experimented with many different styles before we hit upon the right one.

Another innovation was what we call “Column View.” At its simplest, the Column View shows the map in an “Outline” mode, as pictured below:


To add topics to the Outline, you browse the file system or content management system through the new “Resource Manager” on the left, and either drag and drop them to the Outline or click on the Insert button.

You can change the Outline view to Column View by right-clicking the mouse in the header. The result gives you quick and simple access to the attributes of each topic, as the following screen shot illustrates. In this case, the filename, topic type, scope and linking attributes are all displayed. These columns work like a spreadsheet, so you just position the cursor and either start typing or select a value from a drop-down list.

To edit a topic, you double-click on it in the map and you get a new window as pictured below. When you want to insert a reusable component of information into a topic that you’re editing, the Resource Manager lets you browse not only to individual topics but also to elements within an individual topic (as pictured on the left below).

To meet the publishing requirements for DITA, we had to provide a means of defining the layout and formatting of the finished publication. We have a product called Styler which provides a graphical user interface to design stylesheets, and we enhanced it to handle the specialization mechanism of DITA.

The following screen shot shows a couple of examples (each with an orange oval around it) of how Styler gracefully handles specializations. In the top example, the Style column indicates that the “amendments” tag is a specialization of the “Paragraph” tag and that because it is “unstyled” (in other words, because it has no styling of its own), it will inherit its styling from the tag on which it is based. If you wanted “amendments” to have its own style, then you could easily apply specific styling to it.

The second example shows how Styler displays a tooltip that shows the relationship between the specialization of a tag and the tag itself.

The final component of our development work was our integration with content management systems, which we accomplish through “adapters” between Arbortext software and each CMS. Because DITA introduces a new linking mechanism, we had to extend our adapters to handle it. In the current release, the only adapter we extended was for PTC’s own content management system (Arbortext Content Manager). We will upgrade our adapters for IBM’s and Documentum’s content management systems to handle DITA in a maintenance release that we expect to be available in the first half of 2007 (but because I work for a public company, I have to add that this date is subject to change).

All of the functionality that this article describes comes in the Arbortext 5.3 release, which started shipping on December 29, 2006. In future releases, we plan to continue to enhance our support for DITA to extend its functionality, improve ease of use, and add more features in support of technical documentation and other document types.