August 1999

Generation X … ML


By now, most of us have at least a cursory understanding of Extensible Markup Language (XML). But do we really know what XML can do for us as technical communicators? Jon Bosak and Tim Bray, both important contributors to the development of XML, offer a clear answer in “XML and the Second-Generation Web” (Scientific American, May 1999).

Currently, most Web publishing is done using HTML. Like all markup languages, HTML is made up of a number of tags. Because HTML offers only a limited tag set, it is very easy to use. However, nearly all of the HTML tags describe formatting. Thus, when you download an HTML-published Web page onto your computer, no functional information is available to you, only text. Any interaction with the information must go through a distant server.

For example, to buy something from a Web catalog site, you must first download the pages you need to make your selection. Then you fill out the necessary forms, which are transmitted to the server. Then you download more pages to confirm your selection. To make even a small change to your order, you must go through an entirely new upload/download cycle. The same information may be transmitted to you many times, making the transaction agonizingly slow.

XML uses tags to label the type of information rather than the format of the information. Instead of labeling the parts of an order as boldface, paragraph, row, and column as in HTML, XML labels information as price, size, quantity, color, and so on. A program local to the buyer’s computer can then organize this information without further processing by the server. Because the markup defines the type of information, not its format, different computers can each format the same information in whatever way is most meaningful to them.
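
As a rough illustration of the paragraph above, here is a minimal Python sketch of an order marked up by type of information and totaled on the buyer's own machine. The tag names (order, item, quantity, price) and the sample values are invented for this example; they do not come from the Scientific American article or from any real catalog.

```python
import xml.etree.ElementTree as ET

# Hypothetical order markup: the tags say what each value IS,
# not how it should look on the page.
ORDER_XML = """
<order>
  <item>
    <name>Sweater</name>
    <color>blue</color>
    <size>M</size>
    <quantity>2</quantity>
    <price>24.50</price>
  </item>
  <item>
    <name>Scarf</name>
    <color>red</color>
    <size>one-size</size>
    <quantity>1</quantity>
    <price>12.00</price>
  </item>
</order>
"""


def order_total(xml_text):
    """Sum quantity * price over every item, with no server round trip."""
    root = ET.fromstring(xml_text)
    total = 0.0
    for item in root.findall("item"):
        quantity = int(item.findtext("quantity"))
        price = float(item.findtext("price"))
        total += quantity * price
    return total


print(order_total(ORDER_XML))
```

Because the quantity and price are labeled as such, a local program can recompute the total after every change to the order instead of requesting a new page.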

Bosak and Bray give several examples of XML advantages. Currently, to make an airline reservation from an HTML-based site, you must communicate with the server to narrow down your choices. If you ask for a flight from London to New York, you may get a long list of choices. If you want to narrow that list by departure time or price, you must send a request to the server, which must then send you an entire selection again. With XML, all the data about the flights can be downloaded the first time as flight data (not just text) along with a program to manipulate the flight information. All subsequent selections can be made without further involving the server.
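
The flight scenario above might look something like the following Python sketch, in which the flight list is downloaded once and every later selection happens locally. The element names, flight numbers, times, and prices are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical flight data, received from the server a single time.
FLIGHTS_XML = """
<flights from="London" to="New York">
  <flight number="XA101"><departure>08:15</departure><price>420</price></flight>
  <flight number="XA205"><departure>13:40</departure><price>365</price></flight>
  <flight number="XA309"><departure>19:05</departure><price>510</price></flight>
</flights>
"""


def load_flights(xml_text):
    """Turn the downloaded XML into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "number": flight.get("number"),
            "departure": flight.findtext("departure"),
            "price": int(flight.findtext("price")),
        }
        for flight in root.findall("flight")
    ]


def priced_at_most(flights, max_price):
    """Keep flights whose price does not exceed max_price."""
    return [f for f in flights if f["price"] <= max_price]


def departing_from(flights, hhmm):
    """Keep flights leaving at or after hhmm.

    Zero-padded HH:MM strings compare in chronological order.
    """
    return [f for f in flights if f["departure"] >= hhmm]


flights = load_flights(FLIGHTS_XML)
print([f["number"] for f in priced_at_most(flights, 450)])
print([f["number"] for f in departing_from(flights, "12:00")])
```

Each call to a filter function narrows the list already held in memory, so changing the price limit or departure time involves no further request to the server.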

With XML it is possible to define tags for just about any kind of data in a DTD (Document Type Definition) and thereby create a new markup language. However, there is a price to pay for this power. Just as a browser is necessary to interpret HTML tags and format your Web page, XML must have accompanying software to interpret its tags. Additionally, a stylesheet must be provided for each format in which the XML is displayed. The benefit is that, for a given markup language, multiple stylesheets can display the same data in multiple formats. It’s possible, for example, to have one stylesheet for a conventional computer, another for a handheld display, and even an audible speech stylesheet for the blind.
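
To make the idea above concrete, the following Python sketch declares a tiny markup language in a DTD and then presents the same data three ways. Two assumptions are worth stating plainly: the catalog vocabulary is invented for this example, and the three Python functions are only stand-ins for real stylesheets, which would normally be written in a stylesheet language rather than in program code. Note also that Python's standard-library parser reads the DTD but does not validate the document against it.

```python
import xml.etree.ElementTree as ET

# A hypothetical "catalog" language, declared in an internal DTD.
CATALOG_XML = """<!DOCTYPE catalog [
  <!ELEMENT catalog (item+)>
  <!ELEMENT item (name, price)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<catalog>
  <item><name>Sweater</name><price>24.50</price></item>
  <item><name>Scarf</name><price>12.00</price></item>
</catalog>
"""


def read_items(xml_text):
    """Return (name, price) pairs; the parser does not validate the DTD."""
    root = ET.fromstring(xml_text)
    return [(i.findtext("name"), i.findtext("price")) for i in root.findall("item")]


def desktop_style(items):
    """Wide, column-aligned layout for a conventional screen."""
    return "\n".join(f"{name:<12}{price:>8}" for name, price in items)


def handheld_style(items):
    """Compact layout: names cut to six characters for a small display."""
    return "\n".join(f"{name[:6]} {price}" for name, price in items)


def speech_style(items):
    """Sentence-like text suitable for handing to a speech synthesizer."""
    return " ".join(f"{name}, price {price}." for name, price in items)


items = read_items(CATALOG_XML)
for style in (desktop_style, handheld_style, speech_style):
    print(style(items))
    print()
```

The data is parsed once, and each presentation function reads the same (name, price) pairs, which is the separation of content from format that the paragraph describes.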

The authors point out that XML makes computers and the Web much easier to use, but that XML content is much more demanding to create than HTML. They conclude with a challenge to technical communicators:

Tomorrow’s Web designers will need to be versed not just in the production of words and graphics but also in the construction of multilayered, interdependent systems of DTDs, data trees, hyperlink structures, metadata, and stylesheets, a more robust infrastructure for the Web’s second generation.