Lightweight DITA: What Is It and Can I Use It in the IXIASOFT DITA CMS?

CIDM eNews 06.17

Leigh White, IXIASOFT

One of the biggest obstacles to DITA adoption is its perceived complexity. As of DITA 1.3, the complete element list has grown to over 600; just the base elements plus the technical content elements total more than 400! That’s a lot of elements for users to learn, although of course, few groups actually use all of them. Any good XML editor can do considerable “handholding” for a writer to guide him or her through the construction of a topic or map by offering a list of only the elements that are valid in the immediate context. Well-constructed topic and map templates can help a lot too by setting up the initial structure that writers only need to “drop” content into. Still, the nature of DITA means there can always be some ambiguity as to exactly which valid element to use in a given context. And too many choices can be paralyzing, especially for content creators who are not technical writers and who don’t have the time or the inclination to learn DITA.

To reduce the complexity, many Information Architects have developed a constrained set of elements for their writers to use, using the standard constraint mechanisms made available with DITA 1.2. However, constraints are not trivial to develop or maintain, and are probably not feasible for groups without an Information Architect or the funds to contract out the constraint development.

Several CMS vendors have recognized this challenge and offer applications that present a limited element set to users (with the option to expand the list as writers’ comfort with DITA increases). However, there has been no standardization among these vendors as to exactly which elements and attributes to offer, nor among the methods used to constrain the element list.

In comes Lightweight DITA, or LwDITA. At its most basic, LwDITA is a separate (for now) DITA specification that defines a much smaller element set. At present count, LwDITA includes just 47 elements. (A list appears at the end of this article.) This limited element set makes it much easier for writers to start using DITA, as well as simplifying DITA for content contributors who are not primarily writers, such as engineers or support staff.

LwDITA is largely compatible with the DITA standard, so interchange of content is smooth. Most LwDITA content is valid in the context of standard DITA, so LwDITA topics and maps can be used alongside standard DITA content with no need to create special transformations or validation. However, there are some LwDITA elements that do not exist in standard DITA, specifically the Media elements listed at the end of this article. There is discussion about making these elements part of DITA 2.0, but they are not valid in DITA 1.3 out of the box. (Of course, they can be added via specialization.) To ensure full compatibility between LwDITA and standard DITA, avoid using these elements if you intend to combine LwDITA and standard DITA content.

LwDITA did not make it into the recently-published DITA 1.3 specification. There are ongoing proposals to reorganize the DITA DTDs—for example, to separate learning elements and technicalContent elements from the base. We don’t know what the future organization of DITA is going to be and whether LwDITA will be rolled into the base, remain a separate standard that lives alongside learning and technicalContent, or just what.

Is Lightweight DITA really DITA?

DITA, as most people know by now, stands for “Darwin Information Typing Architecture.” The “Information Typing” part has been important thus far. DITA provides multiple topic types—concept, task, reference, troubleshooting, and so on—to facilitate classifying, or typing, content based on its purpose. The elements available in the various topic types are specific to the type. For example, <step> is available only in task topics, as only a task should include procedure information. <properties> is available only in a reference topic, as only a reference should include a list of properties and their values. This relationship between specific elements and topic types and the information each is meant to capture is the backbone of DITA.

With only one topic type, LwDITA appears to violate this basic principle of DITA. There’s no easy answer here. It’s probably best to consider the intended purposes of LwDITA. One is to provide a smaller, simpler element set. Another is to facilitate the interaction between HTML, Markdown, and DITA. Information typing does not exist in HTML or Markdown and for groups whose primary need is to provide or consume content in those formats, one can argue that an insistence on strict information typing is an unnecessary complication with little return on effort.

Here we see one of the primary issues with LwDITA. It’s trying to be a simplified DITA model, adhering to the principles and structure of DITA, but it’s also trying to be a medium of exchange between a highly structured content model and other content models that are lightly structured or not structured at all. In trying to serve these two purposes, LwDITA finds itself in a gray area of being very DITA-like but not entirely DITA in many respects.

Uses of Lightweight DITA

Primarily, LwDITA was conceived as a source for round-tripping to Markdown or HTML5. The tagset available in LwDITA is limited to those elements and attributes that can be reliably and unambiguously transformed to equivalent elements and attributes in HTML5 or to equivalent Markdown. For this reason, LwDITA does not include, or does not fully include, many of the more advanced DITA features.

However, there is no reason why LwDITA should be limited to being a source for Markdown or HTML5.

You might also take advantage of the simplified tagset to use it as “DITA training wheels” for writers new to DITA. After they have become comfortable with the concept of structured authoring, you can switch them to standard DITA. Additionally, SMEs, engineers, or other non-writer casual contributors can use LwDITA as a way to contribute content in native DITA without the overhead of learning the full DITA structure.

In both cases (new writers and casual contributors), the assumption is likely that after they author the content, an experienced writer or Information Architect will then incorporate that content into the standard DITA content set. At that point, all the DITA bells and whistles can be added, such as relationship tables, metadata, additional filtering attributes, and so forth.

That said, there may be organizations whose documentation needs are simple and straightforward enough that LwDITA is a sufficient permanent model for some or all of their content. Or, while DITA content intended for fully-featured online help or user guides might use the standard DITA element set, DITA content intended for wikis, blogs, or knowledge bases might continue to use the LwDITA element set.

Features of Lightweight DITA

DISCLAIMER: The LwDITA specification is not yet finalized. Any of this information is subject to change!

Some notable differences between standard DITA and LwDITA are:

  • Mixed content is not allowed. All text must be in a ‹p› element. For example, ‹li›This is a list item‹/li› is not allowed. It must be ‹li›‹p›This is a list item‹/p›‹/li›. This restriction ensures a uniform, predictable structure across content, simplifies reuse, and makes it much easier to develop stylesheets and tools to process the content.
  • There are no CALS table elements (‹table›, ‹row›, ‹entry›, and so on). This means, for example, that you can’t create complex tables with merged cells, or a title, or specific column widths. LwDITA tables are very basic and depend largely on the output medium for formatting. If you’re familiar with ‹simpletable› and its formatting limitations, then you understand what you can do with tables in LwDITA, as it uses the ‹simpletable› model as well.
  • There is no prolog metadata (everything is in ‹data›).
  • There are no related links.
  • Only the highlighting domain is available, and only a subset of it (‹b›, ‹i›, ‹u›, ‹sup›, ‹sub›).
  • Only topic is available; there is no concept, reference, task, glossentry, and so on.
  • Only map is available; there is no bookmap.
  • Maps do not have a ‹title› element. The title, if one is necessary, can go in ‹navtitle› within ‹topicmeta›.
  • Only ‹topicmeta› and ‹topicref› are available; there is no ‹topichead› or ‹topicgroup›.
  • Out of the box, the full set of filtering attributes is not available. Only @props is available. Individual filtering attributes can be added as necessary.
  • Overall, attributes are managed as functional groups which can more easily be enabled or removed.
  • Specialization is much simpler.
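
To make several of these differences concrete, here is a minimal sketch of what a LwDITA topic might look like. It is illustrative only: the DOCTYPE declaration is omitted, the id, text, and data name/value pair are invented, and the details may change along with the specification.

```xml
<!-- Hypothetical LwDITA topic; ids, names, and values are invented for illustration. -->
<topic id="widget-settings">
  <title>Widget settings</title>
  <shortdesc>The settings you can change on the widget.</shortdesc>
  <prolog>
    <!-- No dedicated prolog metadata elements: metadata is carried in data. -->
    <data name="author" value="Jane Doe"/>
  </prolog>
  <body>
    <ul>
      <!-- Text sits in p, never directly in li. -->
      <li><p>This is a list item</p></li>
    </ul>
    <!-- Basic table: no merged cells, no title, no column widths. -->
    <simpletable>
      <sthead>
        <stentry><p>Setting</p></stentry>
        <stentry><p>Default</p></stentry>
      </sthead>
      <strow>
        <stentry><p>Timeout</p></stentry>
        <stentry><p>30 seconds</p></stentry>
      </strow>
    </simpletable>
  </body>
</topic>
```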

Compatibility between LwDITA content and standard DITA content

As mentioned earlier, LwDITA is (mostly) a subset of standard DITA, so by definition most LwDITA content is compatible with a standard DITA environment—as long as it is not using the media elements mentioned earlier. You could create a simulated LwDITA topic in a standard DITA environment simply by using only the elements and attributes available in LwDITA and ignoring the others.

Similarly, it’s possible to use LwDITA topics and standard DITA topics together in a standard map, and to use LwDITA maps and standard DITA maps together in standard maps or bookmaps.

The reverse is not true. You cannot use standard DITA topics and maps in a LwDITA environment because that content might include elements and attributes that are not available in LwDITA.

Can you round-trip between standard DITA and LwDITA? In a word, no. Once a LwDITA topic or map has been made standard, it’s potentially going to contain a lot of elements and attributes that have no clear equivalent in LwDITA. There’s currently no mechanism to map a reduction of the very large set of standard DITA elements and attributes down to the much smaller LwDITA one. Likewise, there’s no predictable mechanism for mapping a single LwDITA element (for example, ‹ph›) to its many possible equivalents in different contexts. You might be able to develop a mapping that works specifically for you, but it’s highly unlikely to work as well for anyone else, so there is no attempt to standardize this round-tripping.
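
A small illustration of the problem, using two standard DITA elements that are both specializations of ‹ph›. The two paragraphs are separate fragments, the file name is invented, and the reduction shown is hypothetical rather than anything defined by the specification.

```xml
<!-- Standard DITA: the element names record what each phrase is. -->
<p>Click <uicontrol>Save</uicontrol>, then run <codeph>publish.sh</codeph>.</p>

<!-- After a hypothetical reduction to LwDITA: both phrases become ph. -->
<p>Click <ph>Save</ph>, then run <ph>publish.sh</ph>.</p>
```

Going back the other way, nothing in the second paragraph says which ‹ph› was a ‹uicontrol› and which was a ‹codeph›, unless you adopt a local convention of your own, such as @outputclass values.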

“Specializing” in LwDITA

The LwDITA module files (topic.mod and map.mod) define multiple groups of attributes. One such group (defined as an entity, of course) is localization, which includes @dir, @xml:lang, and @translate:
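
A sketch of how such a group might be declared. The attribute names come from the description above; the CDATA types and #IMPLIED defaults are assumptions, so check the OASIS module files for the exact declarations.

```xml
<!-- Functional attribute group declared as a parameter entity.
     Types and defaults are assumed for illustration. -->
<!ENTITY % localization
  "dir        CDATA  #IMPLIED
   xml:lang   CDATA  #IMPLIED
   translate  CDATA  #IMPLIED">
```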

This functional group/entity is then referenced by multiple elements within LwDITA:
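
A sketch of such a reference, using ‹p›. The contents of each group follow the Attribute groups list at the end of this article; the declared types, the simplified content model, and the layout are assumptions rather than a copy of the OASIS files.

```xml
<!-- Attribute groups (types and defaults assumed for illustration). -->
<!ENTITY % localization "dir        CDATA  #IMPLIED
                         xml:lang   CDATA  #IMPLIED
                         translate  CDATA  #IMPLIED">
<!ENTITY % filters      "props      CDATA  #IMPLIED">
<!ENTITY % reuse        "id         CDATA  #IMPLIED
                         conref     CDATA  #IMPLIED">
<!ENTITY % spec-atts    "specmodel  CDATA  #IMPLIED
                         specrole   CDATA  #IMPLIED
                         importance CDATA  #IMPLIED">

<!-- Content model simplified; the real one also allows inline elements. -->
<!ELEMENT p (#PCDATA)>
<!-- Each %name; reference pulls in every attribute of that group. -->
<!ATTLIST p
  %localization;
  %filters;
  %reuse;
  %spec-atts;
  outputclass  CDATA  #IMPLIED
  class        CDATA  "- topic/p ">
```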

Here, you can see that <p> is defined to use the three attributes defined in the localization group as well as additional attributes defined in the filters, reuse, and spec-atts groups, plus the individual outputclass and class attributes.

Defining a new attribute using the standard DITA specialization mechanism requires creating an entity file that defines the attribute, expanding the base or props attribute entity in your shells, and adding the new entity to your list of included domains in your shells. In LwDITA, adding a new attribute is as simple as adding it to the element definition:
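
For instance, making an @audience attribute available on ‹p› could be a one-line change. This is only a sketch: @audience is chosen as an example of a filtering attribute you might add, the declared types are assumptions, and the declaration is simplified, with the localization group standing in for the full set of groups.

```xml
<!ENTITY % localization "dir        CDATA  #IMPLIED
                         xml:lang   CDATA  #IMPLIED
                         translate  CDATA  #IMPLIED">

<!-- The audience line is the only addition to the element definition. -->
<!ELEMENT p (#PCDATA)>
<!ATTLIST p
  %localization;
  audience     CDATA  #IMPLIED
  outputclass  CDATA  #IMPLIED
  class        CDATA  "- topic/p ">
```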

Likewise, adding a new attribute group is simple as well:
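
As a sketch, a new group could be declared once and then referenced from each element that should accept it. The group name conditions, the choice of attributes, the declared types, and the simplified content models are all invented for illustration.

```xml
<!-- Hypothetical attribute group; name and members are invented. -->
<!ENTITY % conditions
  "audience  CDATA  #IMPLIED
   platform  CDATA  #IMPLIED">

<!-- Reference the group from every element that should accept it. -->
<!ELEMENT li (p)*>
<!ATTLIST li
  %conditions;
  outputclass  CDATA  #IMPLIED
  class        CDATA  "- topic/li ">

<!ELEMENT p (#PCDATA)>
<!ATTLIST p
  %conditions;
  outputclass  CDATA  #IMPLIED
  class        CDATA  "- topic/p ">
```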

Adding a new element is equally easy. One could argue that it’s too easy, that it lends itself to too many variations on LwDITA, and that’s true. But DITA’s effectiveness as a standard depends on a mutual agreement among all users to adhere to the defined rules. (Which is true of any standard.) It’s possible to create new elements and attributes and to expand existing elements and attributes in standard DITA without following the rules of specialization.

However, you then run the risk that your content is no longer interchangeable with other DITA content outside of your organization. If you never plan to interchange, perhaps it’s acceptable to you to sidestep the rules for the sake of flexibility and simplicity. The same is true of LwDITA; it’s simply even easier to make ad hoc changes to the standard. One could also argue that as there are so few definition files for LwDITA (five, at current count), it’s easy to make those ad hoc changes and simply deploy the revised definition files around your organization to ensure that everyone has access to the changes.

Note that I’m not advocating ad hoc changes or sidestepping the rules of specialization! But if one of the purposes of LwDITA is simplification and ease of adoption, then we have to acknowledge that the potential for these types of changes exists and that some groups will doubtless take advantage of it.

Where can I get Lightweight DITA?

IXIASOFT has created a preliminary LwDITA integration package, available at http://cms.ixiasoft.com/downloads/lightweight_dita. This package includes the LwDITA DTDs provided by OASIS as well as a LwDITA plugin for use in the DITA CMS, topic and map templates, and instructions. The OASIS LwDITA DTDs are available on GitHub at https://github.com/oasis-open/dita-lightweight.

If you have more questions about the LwDITA standard or integration, please contact Leigh White at leigh.white@ixiasoft.com.

Using Lightweight DITA in the DITA CMS

At present, there are two important considerations.

Localization. Out of the box, the IXIASOFT DITA CMS is set up to add the ixia_locid attribute to elements when you release a topic or map. If it cannot add the attribute, it cannot release the topic or map. This attribute is an IXIASOFT specialization and not part of the LwDITA specification; therefore, it’s not valid for LwDITA content. If you want to use LwDITA content in the IXIASOFT DITA CMS, you must disable @ixia_locid for all content, not just LwDITA content, so that the IXIASOFT DITA CMS does not attempt to add this attribute.

It is possible to integrate @ixia_locid into LwDITA but at that point, the content is no longer strictly LwDITA content. The IXIASOFT LwDITA integration package enables @ixia_locid by default. You can disable it if you want using the instructions in the package.

Map creation. As mentioned earlier, LwDITA maps do not have a <title> element; the title goes in ‹navtitle› within ‹topicmeta›. The DITA CMS now offers the option (as of 4.2.33) to use a parameter ({ixia.title}) to specify where the title of a map should be placed, which allows you to create LwDITA map templates that use this parameter in the element you designate as the “title” element.
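
A map template along these lines might look like the following sketch. Only the {ixia.title} parameter and the placement of ‹navtitle› within ‹topicmeta› come from the description above; the id value and the rest of the template layout are assumptions, so follow the instructions in the integration package for the actual format.

```xml
<!-- Hypothetical LwDITA map template; DOCTYPE omitted, id invented. -->
<map id="map-template">
  <topicmeta>
    <!-- The parameter marks where the map title is expected to be placed. -->
    <navtitle>{ixia.title}</navtitle>
  </topicmeta>
</map>
```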

If you integrate LwDITA into a DITA CMS version earlier than 4.2.33, you cannot create LwDITA maps within the DITA CMS because those earlier versions allow the map title only in <title>, <mainbooktitle>, or @title. There are two workaround options:

  • Create the basic LwDITA map structure outside of the DITA CMS and import it.
  • Create the map in the DITA CMS as a standard ditamap, then edit it to change its structure and doctype.

A note on Markdown in the DITA CMS

Several customers have asked about using Markdown in the IXIASOFT DITA CMS. The short answer is that we’re looking into it. It’s not a simple matter. The IXIASOFT DITA CMS is designed to accommodate DITA XML content and XHTML content. Markdown is neither. Markdown content does not have the necessary structure that allows the DITA CMS to index it properly.

Our preferred approach is to create an automatic round-trip transformation, whereby users create a topic in Markdown and upon release, the DITA CMS transforms the topic to LwDITA so that the DITA CMS can correctly index it. When a user locks the topic to edit it, the DITA CMS transforms it to Markdown again.
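
To picture what such a transformation would produce, here is a small sketch with the Markdown source shown in a comment above a possible LwDITA equivalent. The mapping is an assumption made for illustration (for example, how the topic id is derived and whether the first paragraph becomes a ‹p› or a ‹shortdesc›); an actual transform may make different choices.

```xml
<!-- Markdown source, as the writer would see it:

# Installing the widget

Install the widget before first use.

* Unpack the box.
* Plug in the power cable.
-->

<!-- A possible LwDITA equivalent, as the CMS would store and index it. -->
<topic id="installing-the-widget">
  <title>Installing the widget</title>
  <body>
    <p>Install the widget before first use.</p>
    <ul>
      <li><p>Unpack the box.</p></li>
      <li><p>Plug in the power cable.</p></li>
    </ul>
  </body>
</topic>
```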

There is a Markdown transform available, cleverly named DITA OT Markdown and found at https://github.com/jelovirt/dita-ot-markdown/. The oXygen XML Editor also provides this round-tripping by integrating the Markdown transform behind the scenes.

If you’re interested: elements and attributes defined in Lightweight DITA

Bracketed numbers after each element list the attribute groups available on it, based on the Attribute groups list at the end of this section. This list is correct as of 17 May 2017. The specification is subject to change.

  • topic [1, 11] (+ id, outputclass)
  • title [1, 11]
  • shortdesc [1, 11] (+ outputclass)
  • prolog [11] (+ outputclass)
  • body [1, 11] (+ outputclass)
  • section [1, 2, 3, 11] (+ outputclass)

List elements

  • ul [1, 2, 3, 11] (+ outputclass)
  • ol [1, 2, 3, 11] (+ outputclass)
  • li [1, 2, 3, 11] (+ outputclass)
  • dl [1, 2, 3, 11] (+ outputclass)
  • dlentry [1, 2, 3, 11] (+ outputclass)
  • dt [1, 2, 3, 11] (+ outputclass)
  • dd [1, 2, 3, 11] (+ outputclass)

Text elements

  • p [1, 2, 3, 11] (+ outputclass)
  • note [1, 2, 3, 11] (+ outputclass, type)
  • pre [1, 2, 3, 11] (+ xml:space, outputclass)

Table elements

  • sthead [1, 2, 3, 11] (+ outputclass)
  • strow [1, 2, 3, 11] (+ outputclass)
  • stentry [1, 2, 3, 11] (+ outputclass)

Image elements

  • fig [4, 11]
  • image [1, 5, 11] (+ href, height, width, outputclass)
  • alt [1, 5, 11] (+ outputclass)

Media elements

  • audio [2, 3, 11] (+ outputclass)
  • video [2, 3, 11] (+ outputclass)
  • param [11] (+ name, value, outputclass, height, width, iframe)
  • desc [1, 11] (+ outputclass)
  • poster [11] (+ name, value, outputclass)
  • source [11] (+ name, value, outputclass)
  • track [11] (+ name, value, outputclass)
  • fallback [1, 11] (+ outputclass)
  • controls [11] (+ name, outputclass)

Highlighting elements

  • b, i, u, sup, sub

Other elements

  • data (topic) [5, 11] (+ name, value, href, outputclass)
  • data (map) [5] (+ name, value)
  • fn [1, 2, 9, 11] (+ id, callout, outputclass)
  • xref [1, 7, 11] (+ href, format, scope, outputclass)
  • ph (topic) [1, 5, 11] (+ outputclass)
  • ph (map) [1, 5]
  • specmeta
  • specatt [11] (+ outputclass)

Map elements

  • map [1] (+ id)
  • topicmeta
  • topicref [2, 3, 6, 7, 8] (+ locktitle)
  • navtitle [1]

Attribute groups

The groups listed here are defined by the LwDITA specification.

1. localization (@dir, @xml:lang, @translate)
2. filters (@props)
3. reuse (@id, @conref)
4. fig.attributes (display-atts + localization + @outputclass)
5. variable-content (@keyref)
6. reference-content (@href, @format, @scope)
7. variable-links (@keyref)
8. control-variables (@keys)
9. fn-reuse (@conref)
10. display-atts (@scale, @frame, @expanse)
11. spec-atts (@specmodel, @specrole, @importance)