Designing Adaptive Documentation with XML: From Formal to Rhetorical Markup

June 2003

Lars Johnsen, Associate Professor, University of Southern Denmark


One of the goals of single sourcing is to design and develop what we may call adaptive documentation, technical product information that may be rendered in different modalities (text, interactive graphics, and sound) and in different media (print, computer screens, and small displays on handheld devices) and may be assembled like Lego blocks in virtual networks according to users’ needs and wishes.

The question, of course, is how do we design technical content that is malleable enough, as it were, to serve all these purposes?

Part of the answer, no doubt, lies in adopting more modular, or object-oriented, approaches to information design as well as introducing media-neutral formats, such as XML, into the information-development process. But object-oriented approaches and new standards like XML are not enough.

We also need new ways of enriching content with information about its underlying communicative, or rhetorical, strategies and priorities. If, for instance, a document is to be automatically reduced so that it may be shown on a mobile phone, it is crucial that condensing the content should take place in such a way that the central message of the text is rendered while less significant details are suppressed. But condensing the content can be done only if the key contents of the document have been identified and encoded in advance.

Rhetorical signalling also seems to be needed in cases where users are expected to navigate effectively in vast hypermedia spaces. To create coherence in such environments, users need access to information about how individual content units are related to each other conceptually and rhetorically.

What I would like to do in this article is to discuss rhetorical markup as one possible strategy for enriching technical content [1]. More specifically, I would like to define rhetorical markup in text-theoretical terms and to briefly examine how communicative intentions and structures might be modelled and represented using existing XML standards.


Document markup may be defined as inserting code into a document to give relevant information about that document. In XML, markup comes in two flavours: elements and attributes. Elements constitute the most fundamental way of identifying and labelling text segments, while attributes are additional categories that may be attached to elements. Attributes are filled with specific values.

For instance:

<para author="Smith" color="red">This is a red paragraph written by Smith.</para>

Normally, markup in technical documentation has one of the following functions:

  • structural markup (identifying sections in the document and possibly their relations)
  • presentational markup (indicating how the document is going to be rendered)
  • semantic markup (encoding important domain objects and concepts)
  • metatextual markup (giving information about the document as an information resource, for instance, author, date of publication, key words, and so on)

These functions are broad categories and the boundaries between them are fluid. Accordingly, it is not always possible to unambiguously categorize markup in a given document. For instance, is the element <contact.person> in a <press.release> a form of structural, semantic, or even metatextual markup? And is a <list> a structural element or in fact a presentational one?

As a first attempt to define rhetorical markup, we may say that it is a kind of structural markup whose function is to represent a document’s communicative structure, that is, to identify explicitly the author’s intentions as they are realized in the various parts of the document. But what, then, is communicative structure?

Formal and Schematic Structure

The idea of communicative structure is an important one in text theory (genre theory, discourse analysis, and so on). In systemic functional linguistics, for example, a distinction is made between two types of text structure: formal and schematic (see Eggins 1994). The formal structure of a document is the way in which it is divided into units, such as chapters, sections, and paragraphs, while its schematic structure reflects its functional organization, that is, the configuration of semantic units through which the author seeks to achieve his or her communicative goals (for example, introduction, background, argumentation, and conclusion). The manifestation of the schematic structure is done through language and is, in principle, independent of the formal structure, although there will be a one-to-one mapping of the two in well-designed documents.

Schematic structure is closely bound up with the concept of genre or text type. Genres or subgenres may be described as sets of documents sharing similar schematic text structures. A genre is said to have a generic structure potential (GSP) that defines the range of valid schematic elements and the patterns in which these elements are permitted to occur.
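
One way to make the idea of a generic structure potential concrete is to treat it as a pattern over sequences of schematic elements. The sketch below is only an illustration under stated assumptions: the element names and the pattern for a "procedure" genre are hypothetical and are not taken from any published GSP.

```python
import re

# Hypothetical GSP for a "procedure" genre (an assumption for illustration):
# an introduction, then one or more steps, each consisting of an action
# optionally followed by a result and/or a tip.
PROCEDURE_GSP = re.compile(r"introduction( action( result)?( tip)?)+")


def conforms_to_gsp(schematic_elements):
    """Return True if the sequence of schematic roles matches the GSP."""
    return PROCEDURE_GSP.fullmatch(" ".join(schematic_elements)) is not None


print(conforms_to_gsp(["introduction", "action", "action", "result", "tip"]))  # True
print(conforms_to_gsp(["introduction", "result"]))  # False
```

A regular expression is used here only because the assumed genre is strictly sequential; a richer GSP would need a grammar or a schema language.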

On the basis of these two concepts, we may initially define two types of structural markup:

  • schematic markup (the representation of a document’s semantic components as defined by its genre)
  • formal markup (the representation of a document’s concrete building materials: text, graphics, video, and so on)

Thus, a screen dump in a procedure in an installation guide, say, may either be marked up as a <figure> (formal markup) or as a <result-of-user-action> (schematic markup) or both. A text segment in a business proposal may be identified as a <paragraph> or a <conclusion> or both.

Formal and Schematic Markup

Typical XML document representation languages like XHTML and DocBook are, not surprisingly, well suited to describe formal document structures.

DocBook is a book-oriented markup language and contains a multitude of elements to describe the mortar and bricks of (technical) publications: <book>, <chapter>, <section>, <para>, <title>, <footnote>, <figure>, <mediaobject>, and so on.

In terms of schematic structure, DocBook offers some elements for marking up genre-typical meaning components. An author may insert a <dedication> in a book, an <abstract> in an article and a <warning> and a <step> in a procedure. DocBook’s syntax is relatively loose, though. The language also allows markup patterns that seem to violate genre conventions. For example, an author is free to insert an <abstract> in a <procedure>.

XHTML, the successor to HTML, is primarily a Web language and, therefore, does not have book-oriented formal elements but a host of others: <head>, <body>, <div>, <p>, and headings on six levels <h1> to <h6>.

In XHTML, elements for schematic markup are almost non-existent, which does not mean, however, that text segments cannot be assigned schematic roles. In XHTML, a [class] attribute can be attached to most elements to categorize them, and nothing prevents us from employing this attribute to systematically mark up schematic structure. Consider the following example of a typical text type in technical communication, a procedure:

Copying text

  1. Select the text you would like to copy.
  2. Choose Copy from the Edit menu.
     The text is now copied onto the clipboard.
     You can also press Ctrl-C to copy text onto the clipboard.

This example might be coded in the following way in XHTML:

<div class="procedure">
  <h3 class="introduction">Copying text</h3>
  <ol>
    <li class="step">
      <p class="action">Select the text you would like to copy.</p>
    </li>
    <li class="step">
      <p class="action">Choose Copy from the Edit menu.</p>
      <p class="result">The text is now copied onto the clipboard.</p>
      <p class="tip">You can also press Ctrl-C to copy text onto the clipboard.</p>
    </li>
  </ol>
</div>

Here the procedure is wrapped in a division element <div> containing a heading <h3> and an ordered list <ol>. The ordered list contains list items <li> consisting of paragraphs <p>. Apart from the list itself, every element has a [class] attribute whose value reflects its schematic role.

In general, working with elements as well as attributes creates good opportunities for enriching technical content because it allows us to assign communicative roles not only to text but also to most kinds of multimedia objects. One problem with XHTML’s [class] attribute, however, is the lack of constraints on its possible values. That is to say, there are no limits to what authors may put in a [class] attribute. In the procedure above, one author might choose “result” to describe the consequences of user actions while another might opt for “system feedback,” which, of course, may result in inconsistent markup and render it partially or completely useless in subsequent processing.
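
The consistency problem can be reduced by checking [class] values against an agreed vocabulary before further processing. The following is a minimal sketch, assuming a small controlled vocabulary of schematic roles; the vocabulary, the function name, and the sample fragment are illustrative assumptions rather than part of XHTML itself.

```python
import xml.etree.ElementTree as ET

# Assumed controlled vocabulary of schematic roles (illustrative only).
ALLOWED_ROLES = {"procedure", "introduction", "step", "action", "result", "tip"}


def find_unknown_roles(xhtml_source):
    """Return (tag, class value) pairs whose class is not in the vocabulary."""
    root = ET.fromstring(xhtml_source)
    problems = []
    for element in root.iter():
        role = element.get("class")
        # Elements without a class attribute are simply skipped.
        if role is not None and role not in ALLOWED_ROLES:
            problems.append((element.tag, role))
    return problems


sample = """<div class="procedure">
  <h3 class="introduction">Copying text</h3>
  <ol>
    <li class="step">
      <p class="action">Choose Copy from the Edit menu.</p>
      <p class="system feedback">The text is now copied onto the clipboard.</p>
    </li>
  </ol>
</div>"""

print(find_unknown_roles(sample))  # [('p', 'system feedback')]
```

A check of this kind only reports deviations; deciding whether "system feedback" should be mapped to "result" remains an editorial decision.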

Toward Rhetorical Markup

To encode a document’s genre-defined semantic structure is intuitively appealing and often makes a lot of sense in relation to typical document processing, such as rendering. But markup based on schematic structure alone does have its limitations and makes some forms of manipulation difficult.

One problem is the lack of explicit text relations in this markup model. Individual text elements are assigned specific communicative roles, but the markup does not specify exactly how the individual elements are related. In the procedure above, the second step appears to consist of an action and a result, but there is nothing in the markup as such to suggest that the result is actually a direct consequence of the action.

Another problem concerns the relative weight of document constituents. In a schematic or formal structure, all elements are communicatively equal, so to speak, even though documents normally contain both central and more peripheral information. In the procedure above, for instance, one may argue that “action” elements are more communicatively pertinent than “result” elements: results may be deleted without the text becoming incoherent, whereas actions may not.

The third problem, which is in fact related to the first, is that the model applies mainly to documents with sequential, fairly stable semantic structures, such as procedures and press releases. It applies less well to:

  • broadly defined document types, such as reports
  • non-sequential text, such as hypertext
  • documents created according to modular writing methods
  • virtual text, that is, compound documents assembled on-the-fly from databases

To address these problems, we need to adopt a more relational approach to communicative structure, for example Rhetorical Structure Theory or RST (Mann & Thompson 1988).

One of the main tenets of RST is that running text often consists of hierarchically organized structures in which individual constituents are related to each other functionally. Constituents in a text structure may be either nuclei, text elements with a prominent communicative role, or satellites, text constituents containing supportive or supplementary information. Constituents are connected by relations manifesting a certain communicative function. These relations may be either explicit or unsignalled. Consider the examples:

PowerProduct 2.0 may now be purchased from our outlets. With PowerProduct 2.0, you can save your files in XML, SGML, or HTML.

Our new Web site is a tremendous success. It has been visited more than 5000 times.

Both these examples may be analyzed as a text structure consisting of a nuclear constituent (the first sentence) followed by a satellite. In the first example, the communicative function of the satellite is to motivate the reader to carry out the action presented in the nucleus, namely to buy the product in question. In the second example, the satellite’s main function is to provide evidence for the nuclear information, namely that the Web site is a success.

Text segments may play different rhetorical roles in different communicative contexts. Consider a third example:

Our new release enables you to save files in several formats. With PowerProduct 2.0, you can save your files in XML, SGML, or HTML.

Here, the second sentence is the same as in the first example and has the same status, namely that of a satellite, but its primary function no longer seems to be to motivate but rather to elaborate on what has been stated in the nucleus.

On the basis of empirical analyses of a number of authentic texts representing various genres, more than 20 different rhetorical relations have been identified and defined in RST. But although the inventory of relations in RST is empirically founded, the class of relations is, in principle, open and extendable. Table 1 presents a sample of RST relations and the characteristics of associated nuclei and satellites (taken from Mann 1999):


Using RST, we may now introduce a third type of structural markup in addition to the two already mentioned above:

rhetorical markup: the representation of a document’s nuclear and satellite constituents and their relations

Rhetorical markup resembles schematic markup in that it fundamentally describes the way authors structure their intentions in text. But while schematic markup identifies genre-based meaning structures, rhetorical markup may be employed to signify communicative relations anywhere, even across document boundaries.

Rhetorical markup may prove useful in several ways. For example, it will allow us to specify how physically dispersed content items are communicatively related. Such information might be of value to users navigating in vast hypertexts or virtual spaces. It will also facilitate information presentation. Marking up what is rhetorically more or less pertinent in a text makes it possible to condense it to varying degrees. In fact, a rhetorically marked up document may be conceived of, and used, as a kind of folding structure whose parts may be successively folded in and out depending on who the reader is, what his or her information needs are, and where the document is to be rendered. On a mobile phone, condensation might result in all satellites being suppressed and only nuclear statements being shown.
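
The condensation idea can be sketched in a few lines of code. The example below assumes, purely for illustration, that each constituent carries a `role` attribute with the value "nucleus" or "satellite"; the element names and the `condense` function are hypothetical and do not belong to any existing standard.

```python
import xml.etree.ElementTree as ET


def condense(xml_source):
    """Return the parsed document with all satellite elements removed."""
    root = ET.fromstring(xml_source)
    # Take snapshots with list() so that removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("role") == "satellite":
                parent.remove(child)
    return root


sample = """<announcement>
  <para role="nucleus">Our new Web site is a tremendous success.</para>
  <para role="satellite">It has been visited more than 5000 times.</para>
</announcement>"""

condensed = condense(sample)
print([para.text for para in condensed.iter("para")])
# ['Our new Web site is a tremendous success.']
```

Graded condensation, as opposed to this all-or-nothing version, would additionally need some measure of how deeply a satellite is nested or how important its relation is.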

Rhetorical Markup: But How?

In document representation languages, such as XHTML and DocBook, there is no way of directly representing rhetorical relations. It may be done indirectly, though, using elements for linking, such as cross-references or hypertext links. In DocBook, there are several elements of this type, notably <link> and <ulink>, and in XHTML, we find the famous anchor (<a>). The reason such elements can do the job is once again that they can be filled with an attribute value indicating the rhetorical relation that they realize. In XHTML, the ubiquitous [class] attribute can be used for this purpose whereas DocBook has a [type] attribute reserved for linking elements.
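
As a rough illustration of the linking approach, the sketch below collects XHTML anchors whose [class] value names a rhetorical relation. The set of relation names, the file names, and the function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed set of relation names usable as class values on links.
RST_RELATIONS = {"motivation", "evidence", "elaboration", "solutionhood"}


def rhetorical_links(xhtml_source):
    """Return (relation, target) pairs for anchors typed with an RST relation."""
    root = ET.fromstring(xhtml_source)
    return [
        (anchor.get("class"), anchor.get("href"))
        for anchor in root.iter("a")
        if anchor.get("class") in RST_RELATIONS
    ]


sample = """<p>Our new Web site is a tremendous success
(<a class="evidence" href="visits.html">visitor statistics</a>).
See also the <a href="index.html">home page</a>.</p>"""

print(rhetorical_links(sample))  # [('evidence', 'visits.html')]
```

Untyped links, such as the second anchor, are simply ignored, so ordinary navigation links and rhetorical links can coexist in the same document.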

A perhaps less obvious, but potentially powerful, method of encoding rhetorical structure is to employ what we might call superimposed markup, that is, markup pointing to, but independent of, information resources (text segments, files, graphics, video clips, and so on).

Creating an overlay of rhetorical metadata on technical content may be done using XML Topic Maps (XTM). XTM is a language for creating maps of knowledge structures in and among digital resources. The central building block in a topic map is the topic. Topics are objects that we for some reason want to store information about. Topics might be anything that can be conceptualized: things, concepts, persons, events, and so on. In technical documentation, topics could be products, spare parts, users, and publications. When an object is made into a topic it is reified, made into an addressable computer-based entity, which may be processed in some way.

Topics may be given names and may be grouped into classes. Moreover, a topic may have occurrences, resources that are pertinent to this topic. Normally, these occurrences will be pointers to files or Web addresses. Occurrences may also be classified. Last but not least, topics may be joined by so-called associations and the roles of individual topics in these associations may be specified. Names, occurrences, and associations may be given scopes, contexts in which they are valid.

Now, how can we apply XTM to signal rhetorical relations between (disparate) information resources? One answer is to reify resources and connect them using typed associations. An example:

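<topic id="date-is-not-set">
  <instanceOf><topicRef xlink:href="#error-message"/></instanceOf>
  <subjectIdentity><resourceRef xlink:href="datenotset.xml"/></subjectIdentity>
</topic>
<topic id="how-to-set-date">
  <instanceOf><topicRef xlink:href="#procedure"/></instanceOf>
  <subjectIdentity><resourceRef xlink:href="setdate.xml"/></subjectIdentity>
</topic>
<association id="solution-to-date-not-set-problem">
  <instanceOf><topicRef xlink:href="#solutionhood"/></instanceOf>
  <member>
    <roleSpec><topicRef xlink:href="#nucleus"/></roleSpec>
    <topicRef xlink:href="#how-to-set-date"/>
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#satellite"/></roleSpec>
    <topicRef xlink:href="#date-is-not-set"/>
  </member>
</association>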

This bit of code can be interpreted as follows: the first topic, “date-is-not-set,” is a kind of error message whose subject is the external XML file “datenotset.xml.” The second topic, “how-to-set-date,” belongs to the class of procedures and reifies the XML file “setdate.xml.” The association that follows connects the two topics in a “solutionhood” structure and defines their roles in it: the procedure is the nucleus and the error message is the satellite.
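
To show how such an overlay might be consumed, here is a minimal sketch that reads typed associations from a topic map. It assumes the association is written out in full XTM 1.0 syntax, with <instanceOf>, <member>, and <roleSpec> wrappers and the XTM 1.0 and XLink namespaces declared on a <topicMap> root; the function name and the returned data shape are illustrative choices.

```python
import xml.etree.ElementTree as ET

NS = {"xtm": "http://www.topicmaps.org/xtm/1.0/"}
HREF = "{http://www.w3.org/1999/xlink}href"

TOPIC_MAP = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <association id="solution-to-date-not-set-problem">
    <instanceOf><topicRef xlink:href="#solutionhood"/></instanceOf>
    <member>
      <roleSpec><topicRef xlink:href="#nucleus"/></roleSpec>
      <topicRef xlink:href="#how-to-set-date"/>
    </member>
    <member>
      <roleSpec><topicRef xlink:href="#satellite"/></roleSpec>
      <topicRef xlink:href="#date-is-not-set"/>
    </member>
  </association>
</topicMap>"""


def ref(element):
    """Return the topic id that a topicRef points to, without the leading '#'."""
    return element.get(HREF).lstrip("#")


def read_associations(xtm_source):
    """Return a list of (relation, {role: topic id}) tuples."""
    root = ET.fromstring(xtm_source)
    result = []
    for assoc in root.findall("xtm:association", NS):
        relation = ref(assoc.find("xtm:instanceOf/xtm:topicRef", NS))
        roles = {}
        for member in assoc.findall("xtm:member", NS):
            role = ref(member.find("xtm:roleSpec/xtm:topicRef", NS))
            # A bare tag path matches direct children only, so this finds
            # the role player and not the topicRef nested inside roleSpec.
            roles[role] = ref(member.find("xtm:topicRef", NS))
        result.append((relation, roles))
    return result


print(read_associations(TOPIC_MAP))
# [('solutionhood', {'nucleus': 'how-to-set-date', 'satellite': 'date-is-not-set'})]
```

With the roles available as data, a renderer could, for example, offer the nucleus procedure whenever the satellite error message is displayed.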

Concluding Remarks

Although XTM may seem “verbose” at first sight, I think it is worth considering for a number of reasons:

  • We are able to specify rhetorical relations between all kinds of information resources, not only those that are marked up with XML. For instance, we can describe how the content of a graphics file in GIF format is rhetorically related to a given paragraph or even another picture.
  • Information resources may have several rhetorical functions at the same time. A screen dump showing, say, a menu may in one hypertext network support a verbal description of a user interface and in another visualize the result of a user action in a procedure.
  • Content, or native, markup becomes less important. Authors can use the XML language they are most comfortable with because rhetorical information is superimposed and not part of the inherent markup as such.
  • Rhetorical markup with XTM supports a modular or object-oriented approach to information development. Reusable information objects may be designed as standalone units and may subsequently be linked together in explicit coherent structures. This in turn suggests a division of labour between authors and information architects: authors write and mark up information modules while information architects build publications.


[1] The term “rhetorical markup” was originally coined by Hendry (1995).