Outgrowing Desktop Publishing
Do you remember what the world was like before desktop publishing? College students had to struggle with typewriters to produce term papers. If you needed to produce a professional-looking resume, you had to take a trip to the local typesetting shop. Business administrators took dictation and used IBM “Selectric” machines to type the resulting documents. The authors of these documents marked up additions and comments with a red pencil.
But revolutionary change came quickly. Starting with the introduction of the IBM PC and Apple Macintosh in the early 1980s, and fueled by innovations like MacWrite and Microsoft Word, it became possible to produce your own documents. When Adobe appeared with PostScript in 1984, it became possible to make these documents look good. In 1985, Aldus PageMaker was introduced, and the new world of desktop publishing had officially begun.
Chuck Geschke, who founded Adobe along with John Warnock, remembers:
“John Sculley, a young fellow at Apple, got three groups together—Aldus, Adobe, and Apple—and out of that came the concept of desktop publishing. Paul Brainerd of Aldus is probably the person who first uttered the phrase. All three companies then took everybody who could tie a tie and speak two sentences in a row and put them on the road, meeting with people in the printing and publishing industry and selling them on this concept. The net result was that it turned around not only the laser printer but, candidly, Apple Computer. It really turned around that whole business.”1
To be sure, desktop publishing was a revolutionary breakthrough in its time, especially when you consider what it replaced. In addition to mere term papers and resumes, it has been extraordinarily useful for producing newsletters, catalogs, brochures, and advertising layouts by offering all of the following advantages and more:
- Shorter production schedules compared to traditional print cycles
- Greater control over the final product
- Unlimited creativity in designing page layouts, color schemes, and font usage
- Flexible style sheets that let you reuse the same content with different print formats
Twenty years later, desktop publishing is still with us. However, the publishing world has changed, and there are now a host of additional needs. While most organizations have made a large investment in desktop publishing and have reaped its benefits, the Internet, political and economic shifts, and a new computer-savvy generation have fundamentally transformed the industry. As a result, many new requirements have been left unmet.
These 21st-century needs, which reflect what we will refer to as “enterprise publishing,” are the subject of this article. In particular, we will look at the reasons why today’s organizations are outgrowing desktop publishing and examine how XML provides the key to effective enterprise publishing. Also, since desktop publishing vendors have caused a bit of confusion by adding XML support, we will look more closely at these tools as well.
Desktop Publishing vs. Enterprise Publishing
Desktop publishing is just what it sounds like: a way of transforming a previously centralized service into something you can do on your own desktop. It has therefore always been about personal productivity and creativity, freeing the individual to create whatever he or she wants, in any form desired.
Enterprise publishing is also just what it sounds like: a means of interconnecting functional areas and sharing content across the enterprise. It therefore focuses on how content can be reused across different products, outputs, and delivery channels, freeing both individuals and the enterprise from redundant processes and repetitive publication cycles.
Specifically, enterprise publishing focuses on four key needs shared by organizations in nearly every industry:
- The need to decrease authoring and maintenance time. While desktop publishing offers the freedom to produce your own documents, it still requires a separate editorial cycle for each document. This has created multiple, redundant, and time-consuming processes for authors, reviewers, and production people.
- The need to extend content sharing and reuse. Although style sheets make it possible to reuse the same content with new formats, they do not facilitate reuse across products and publications. This means that the same content gets maintained in many places and is very difficult to keep in sync.
- The need to automate multichannel publishing. While desktop publishing tools focus on print, most organizations are feeling the need to break out of a print-centric paradigm and, instead, focus on timely and flexible online delivery.
- The need to publish customized information on demand. Although desktop publishing tools offer unlimited creativity for producing individual documents, they do not support the automatic creation of customized documents.
Figure 1 shows the typical situation in most organizations today. While desktop publishing has been effective at creating individual documents, each delivery channel is supported by a different desktop publishing tool. Even though there is much overlapping content, content is shared via cut-and-paste, re-keying, or not at all. The result is multiple, independent cycles for authoring, reviewing, and publication. This not only ties up a tremendous amount of time and effort but also makes it nearly impossible to keep information up-to-date and in sync between documents and delivery channels.
Enterprise publishing offers a completely different approach. Rather than using a different tool and process for each type of content, it uses a centralized approach in which all output is derived from a “single source.” As shown in Figure 2, enterprise publishing rests on seven specific approaches:
- Replacing proprietary desktop publishing formats with XML. Unlike the file formats used by individual publishing tools, XML is format-, product-, and output-neutral. The same XML file can be readily used to produce different outputs across multiple delivery channels, using whatever format specifications are appropriate.
- Breaking up content into smaller, reusable building blocks. To maximize the ability to reuse content, content is broken up into smaller building blocks.2 These building blocks, which typically represent specific topics and sub-topics, can be “mixed and matched” into a variety of different output publications.
- Maintaining only one copy of each logical building block. Usually called “single sourcing,” this refers to the idea of collapsing all the currently redundant versions of content into a single, authoritative source. Then, when a change is made to the single-sourced content, it will be automatically reflected in all the outputs that use this building block.
- Assigning the appropriate authors and SMEs to maintain each building block. Unlike desktop publishing, where an author is typically responsible for an entire publication, enterprise publishing assigns the appropriate person and functional area to each single-source building block. This creates an efficient collaborative environment where each area maintains the information it knows best.
- Optimizing editorial, review/approval, and publishing workflows. Using a content management system and related electronic workflows, previously redundant processes can be streamlined and combined.
- Using XML to assemble building blocks into output publications. While desktop publishing deals with one document at a time, enterprise publishing leverages XML’s ability to assemble smaller building blocks into higher-level “master documents.” These higher-level XML documents represent a specific use of the content (e.g., installation information for a particular product) but are still readily transformed into multiple formats (e.g., both web and print outputs).
- Using a multichannel publishing engine to produce final output. Desktop publishing is focused on generating printed output and uses a WYSIWYG (“What You See Is What You Get”) interface to show the author exactly how the print will look. In enterprise publishing, the publishing engine must handle not only print formats but also simultaneous publication to web pages, Help systems, and CD and DVD formats.
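To make the single-source idea concrete, here is a minimal sketch in Python, using only the standard library. The element names (`manual`, `include`, `topic`), the block identifiers, and both rendering functions are hypothetical and invented for illustration; a real enterprise publishing system would use a content management system and a full publishing engine rather than an in-memory dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source repository: exactly one copy of each building block.
BLOCKS = {
    "install-intro": "<topic id='install-intro'><title>Before You Begin</title>"
                     "<p>Unpack the unit.</p></topic>",
    "safety-caution": "<topic id='safety-caution'><title>Caution</title>"
                      "<p>Disconnect power first.</p></topic>",
}

# Hypothetical master document: it references blocks instead of copying them.
MASTER = ("<manual title='Installation Guide'>"
          "<include ref='install-intro'/><include ref='safety-caution'/>"
          "</manual>")


def assemble(master_xml, blocks):
    """Replace each <include ref='...'/> with the referenced building block."""
    master = ET.fromstring(master_xml)
    for position, child in enumerate(list(master)):
        if child.tag == "include":
            master[position] = ET.fromstring(blocks[child.get("ref")])
    return master


def to_html(doc):
    """Web channel (sketch only: no escaping, no styling)."""
    parts = [f"<h1>{doc.get('title')}</h1>"]
    for topic in doc.findall("topic"):
        parts.append(f"<h2>{topic.findtext('title')}</h2>")
        parts.extend(f"<p>{p.text}</p>" for p in topic.findall("p"))
    return "\n".join(parts)


def to_text(doc):
    """Plain-text channel, standing in for a print format."""
    lines = [doc.get("title").upper(), ""]
    for topic in doc.findall("topic"):
        lines.append(topic.findtext("title"))
        lines.extend("  " + p.text for p in topic.findall("p"))
    return "\n".join(lines)


doc = assemble(MASTER, BLOCKS)
html_output = to_html(doc)
text_output = to_text(doc)
```

Because both outputs are derived from the same assembled document, a change to the one copy of `safety-caution` would appear in every channel and in every master document that references it.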
Symptoms of Outgrowing Desktop Publishing
How can you tell if you are outgrowing desktop publishing? Speaking metaphorically, the symptoms include:
- Chronic impatience. You are probably feeling an intense desire to push information to the web, without having to wait for the print product. Unfortunately, your current processes do not let you get to the web until the print version is approved. This may be causing customer dissatisfaction or even the inability to effectively compete in a web-based publishing world.
- Sluggish heartbeat. Your organization may be slowed down by perpetually long cycle times and difficulty in meeting deadlines, which is usually associated with too much time and effort spent on editing and reviewing. This is probably the biggest source of high costs and low productivity.
- Excess weight. This problem may manifest itself in several ways: either in the form of large, unwieldy documents that aren’t optimized for a specific audience or as the burden of maintaining too many individual documents optimized for specific audiences. In the first case, customer satisfaction is probably suffering; in the second, costs and workload may be skyrocketing.
- Split personality. Your organization probably has the problem of different documents saying different things or the web perpetually being out of sync with print. If so, it is most likely associated with many variations of the same content that are all edited and published independently. This may be causing customers to doubt not only the credibility of your documentation, but also your products. If you’re a commercial publisher, this directly affects the quality and credibility of your products.
So what do you do if you recognize these conditions in your organization? Well, it is human nature to treat the symptoms first, rather than deal directly with root causes. Therefore, it would not be surprising if your first reaction is to employ one or more of the following tactics:
- Add more people, increase their work hours, or both. While this can help temporarily, it is not a scalable solution and will tend to escalate into an increasingly impossible workload.
- Outsource editorial and production functions to reduce cost. While outsourcing can potentially reduce the cost per hour, it makes redundant content even more isolated in “silos.” This reduces consistency, often increases the total hours spent, and may make it even harder to meet deadlines.
- Upgrade to new versions of the tools. Solving these issues usually requires a paradigm shift rather than incremental changes. An unscalable architecture can’t be made scalable through upgrades alone.
- Increase the level of discipline used with current tools. Realizing the importance of structure, some organizations have mandated a more standardized use of style guides and style sheets. This may help with cut-and-paste and cross-referencing but doesn’t change the fact that documents are still produced one at a time.
Unfortunately, these tactics are not likely to work. But why is it that desktop publishing can’t scale to handle the needs of enterprise publishing? We will explore this more deeply in the next section.
Why Desktop Publishing Will Not Scale to the Enterprise
Before we tackle this question, we need to state a very important point: there is nothing wrong with desktop publishing. It does what it was designed to do, and does it very well, even after twenty years.
The problem, simply stated, is that it was not designed to support enterprise publishing.
Let’s say that again in another way. Like any technology, desktop publishing was designed with a specific viewpoint in mind. The problem is not that this viewpoint is wrong, but rather that it is a different viewpoint than needed for enterprise publishing.
Desktop publishing’s point of view
So what was desktop publishing designed for? And why is this inconsistent with enterprise publishing? To answer this, you have to go back to the original thinking behind the desktop publishing paradigm. We can summarize this in five points:
- Desktop publishing was designed specifically for visual impact. Desktop publishing popularized the term WYSIWYG. By design, what matters is what the content looks like, not how it is structured underneath. In fact, you shouldn’t have to worry about how it’s structured underneath.
- Desktop publishing was designed for personal productivity. Growing up in the age of PCs and Macs, desktop publishing was designed to free the user from mainframe computing and centralized control. It was explicitly not designed for centralized operation and enterprise scalability.
- Desktop publishing was designed for personal creativity. The good news is that desktop publishing lets you do anything you want. The bad news is that it doesn’t stop you from doing anything you want. Desktop publishing is inherently unstructured, and deliberately so. While style sheets can be used to promote consistency, they are not mandatory. If clicking the “bold” and “font” buttons is easier than clicking the style sheet buttons, that’s fine so long as the content looks right.
- Desktop publishing was designed to produce one document at a time. The WYSIWYG interface works because the page formatting engine is built right into the authoring tool. As you add content, the formatting engine not only determines fonts and spacing, but also dynamically performs hyphenation, cross-referencing, and page numbering. This only makes sense if you’re working with the entire document at one time, not with an individual building block.
- Desktop publishing was designed specifically for print output. The WYSIWYG view is oriented to print, where hyphenation, margins, page breaks, and page numbering make sense. The WYSIWYG paradigm also forces you to focus on a particular print format. While a new style sheet can be applied to the same content, the very nature of WYSIWYG implies that there is one primary (print) format.
How enterprise publishing is different
To be fair, desktop publishing has evolved since 1985 and can handle a bit of what enterprise publishing requires. For example, style sheets can allow reuse across different print formats. In many systems, files can be split into chapters that can be recombined into books. Some even have the ability to use “conditional text” or “hidden text” to handle simple variations based on audience.
But enterprise publishing requires much more than this:
- Content must be in a fully neutral format. This is necessary so that everyone in the enterprise can access it, and the content can be freely reused and recombined. XML was specifically designed for this purpose. This will not work with desktop publishing formats because each tool is designed differently, and their file formats reflect these differences. Therefore, there is no fully reliable, automatic means to convert files between tools.
- Content must be split into building blocks of an appropriate size. To support a flexible and scalable enterprise publishing environment, building blocks must be allowed at any level of granularity. In some cases, chapter-level building blocks make sense. In other cases, a single paragraph (e.g., a reusable “caution”) may be appropriate as a building block. Even when an entire chapter is reused, it must be possible to change the chapter title and introductory text to fit the context.
- Building blocks must be reusable across different delivery channels. Building blocks must be flexible enough to be reused across very different output types, typically including multiple printed documents and multiple web pages, and sometimes including Help and CD/DVD formats.
- Building blocks must be reusable in different contexts. In an enterprise publishing solution, not all content can be directly reused. In many cases, the content is almost the same but has variations based on intended audience. These variations are handled in XML through a technique known as “profiling” in which certain XML elements are marked with special “applicability” attributes. For example, profiling allows the same building block to be used in multiple installation manuals. When published, the common text is output as usual, but the profiled text is filtered so that only the applicable content appears in each context. Real-life examples may simultaneously profile on product, region, user level, technical specifications, and many other dimensions.
- Building blocks must be reusable at different levels. If content is to be freely reused in different publications, it cannot be assumed to occur at the same level. For example, a chapter in one publication might be relegated to a sub-topic in another, depending on the emphasis and level of coverage of each usage. This kind of transformation is exactly what XML was designed to handle but is not how desktop publishing works.
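The profiling technique described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute names `product` and `region`, the sample topic, and the matching rule (an element is dropped only when the publishing context names a value that the element does not list) are all hypothetical, not taken from any particular XML standard or tool.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical building block with "applicability" attributes on some paragraphs.
BLOCK = """<topic>
  <title>Connecting the Cable</title>
  <p>Plug the cable into the rear port.</p>
  <p product="A100">Use the blue adapter.</p>
  <p product="B200" region="EU">Use the grey adapter with the EU plug.</p>
</topic>"""

PROFILE_ATTRS = ("product", "region")  # assumed profiling dimensions


def applies(element, context):
    """An element applies unless a profiling attribute excludes this context."""
    for attr in PROFILE_ATTRS:
        value = element.get(attr)
        if value is not None and attr in context:
            if context[attr] not in value.split():
                return False
    return True


def profile(root, context):
    """Return a filtered copy; the single-source block itself is left untouched."""
    result = copy.deepcopy(root)

    def prune(parent):
        for child in list(parent):
            if applies(child, context):
                prune(child)
            else:
                parent.remove(child)

    prune(result)
    return result


source = ET.fromstring(BLOCK)
a100 = profile(source, {"product": "A100"})
b200_eu = profile(source, {"product": "B200", "region": "EU"})
```

The common paragraph appears in both results, while each profiled paragraph appears only where it applies. Filtering a copy rather than the original is a deliberate choice here, since the same source block has to serve every context.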
Adding XML to Desktop Publishing
Suppose you are convinced that XML is critical to enterprise publishing, but you would really like to stay with your existing desktop publishing tools. Then the question is whether it is possible to get the “best of both worlds” by using a desktop publishing tool that supports XML. In fact, many of the most popular desktop publishing tools (e.g., Microsoft Word, FrameMaker, QuarkXPress, and Adobe InDesign) now offer some level of XML support.
The problem here is one of understanding how such XML support actually works, and then examining what is and is not possible within those design constraints. Let’s start by seeing what “XML support” means in a desktop publishing context.
How desktop publishing supports XML
The first thing to understand is that no desktop publishing system actually supports XML internally; only a native XML editor can provide that level of support.
Instead, they support a mapping between XML and internal proprietary structures, typically to simple paragraph and character style names and to standard constructs like tables, figures, footnotes, and cross-references. XML support consists of importing XML into the internal proprietary format using a set of import mapping rules and then exporting the internal format back into XML with similar methods.
The best of these implementations also provide some real-time validation of the XML structure while editing to discourage the author from violating the structural rules set by the XML DTD or schema. However, no desktop publishing system directly uses a DTD or schema to do this; instead they transform the DTD/schema rules into a proprietary framework that controls the way internal components can be used.
How XML is mapped to desktop publishing components
The second point to understand is that XML mapping is actually a very complex problem and is subject to error.
The idea is to make a one-to-one mapping of XML elements to corresponding desktop publishing style elements. For example, an XML element like “Section” can be mapped to a style sheet element like “Head2.” Every time we have a “Head2,” it’s the same as a “Section” in XML. Where XML elements occur within the text of a style sheet element, then we can map to a field or character style inside the style sheet element.
Of course, it is not actually this simple. When we say that a “Section” maps to a “Head2,” we really mean that a Head2 indicates the beginning of a new section, and that the Head2 content is the section’s “Title.” We also have to ask what happens if we’re already in a previous Section; in XML we need to explicitly end the previous Section before starting the next. And the rules are not even that simple. For example, what if this is the first Section? Then we would need to start a Chapter before we can start the Section.
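A small Python sketch shows how quickly even this one rule grows. It assumes a hypothetical three-style document ("Head1", "Head2", "Body") and a hypothetical target structure (Book, Chapter, Section, Title, Para); no vendor's actual export logic is implied.

```python
import xml.etree.ElementTree as ET

# Flat, sequential (style, text) pairs, as a style-based tool might store them.
PARAGRAPHS = [
    ("Head2", "Unpacking"),
    ("Body", "Remove the unit from the box."),
    ("Head2", "Placement"),
    ("Body", "Set the unit on a flat surface."),
    ("Head1", "Operation"),
    ("Body", "Press the power button."),
]


def export(paragraphs):
    """Turn a flat run of styled paragraphs into nested XML elements."""
    book = ET.Element("Book")
    chapter = None
    section = None
    for style, text in paragraphs:
        if style == "Head1":
            chapter = ET.SubElement(book, "Chapter")
            ET.SubElement(chapter, "Title").text = text
            section = None  # a new Chapter implicitly ends any open Section
        elif style == "Head2":
            if chapter is None:
                # First Section with no Chapter yet: a Chapter must be opened first.
                chapter = ET.SubElement(book, "Chapter")
            # Creating a new Section implicitly closes the previous one.
            section = ET.SubElement(chapter, "Section")
            ET.SubElement(section, "Title").text = text
        elif style == "Body":
            if chapter is None:
                chapter = ET.SubElement(book, "Chapter")
            parent = section if section is not None else chapter
            ET.SubElement(parent, "Para").text = text
        else:
            raise ValueError(f"no mapping rule for style {style!r}")
    return book


book = export(PARAGRAPHS)
```

Note that the first Chapter ends up with no Title at all, because the flat document never supplied one. Whether that is acceptable depends on the DTD, which is exactly the kind of decision each mapping rule has to encode.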
Desktop publishing vendors have found ways to optimize this kind of logic, but there is no way to avoid its inherent complexity and tendency for error. In reality, mapping between XML and style elements requires a careful matching between the capabilities and limitations of the desktop publishing environment and the intentions of the XML structure. The more the XML structure matches the desktop publishing structure, the better the mapping will work. However, the more you change the XML to match the desktop publishing structure, the more limitations you put on the way the document can be processed.
Figure 3 is intended to give you an appreciation for the kind of complexity involved by illustrating just one of potentially hundreds of rules that might be required.
What if the vendor has its own XML standard?
If a desktop publishing tool offers its own “internal” XML standard (such as Microsoft’s WordML), then it is important to understand that this is actually a direct expression of internal proprietary formats exported in the form of XML tags. Since XML can be used to describe virtually any data format, this is not hard to do. But mapping this to other XML applications still requires the equivalent of import/export rules.
While a “Head2” might be easier to manipulate in XML form (“<Head2>”), it’s really no different than the original style element within the desktop publishing system. In other words, mapping a “<Head2>” to a “<Section>” is no different than mapping a “Head2” to a “<Section>.”
Why This Approach Does Not Work
Just as desktop publishing was not designed to support enterprise publishing, it was also not designed to support XML-based publishing. Again, the reasons for this can be traced to basic architectural differences:
- Desktop publishing components are visual (e.g., paragraph styles and character styles) whereas XML thinks in terms of information objects (e.g., abstracts, descriptions, part numbers, and prices).
- Desktop publishing components are sequential (e.g., “head1” followed by a “head2”) whereas XML thinks in terms of nested objects (e.g., a “section” within a “chapter”).
- Desktop publishing components have built-in limitations based on the visual formatting engine (e.g., no tables allowed within table cells) whereas XML supports any structure allowed by a DTD or schema.
- Desktop publishing is designed for self-contained documents whereas XML is designed to flexibly link smaller objects into larger objects.
As we saw in the previous section, desktop publishing applications use a mapping approach to support XML. If this approach is to work, four key assumptions must be met:
- It must be possible to establish a complete, one-to-one mapping between the XML and internal desktop publishing components.
- No XML information can be lost during XML import.
- No authoring information can be lost during XML export.
- Authors must not be able to corrupt the XML mapping relationships while editing.
This section examines each of these assumptions in detail, and shows why they don’t actually work in practice.
Assumption: It is possible to define a one-to-one mapping
While this assumption might be met in very simple applications, in most cases it works only if both the DTD and style sheet can be adjusted to force alignment. Making these kinds of adjustments can be problematic when the DTD and/or style sheets have been standardized beyond a particular application. Also, if a variety of desktop publishing applications are involved, a different type of adjustment will probably be required for each one. Even then, there may be some things that simply can’t be mapped.
What kinds of adjustments are we talking about? First, the DTD may contain heavily nested objects that go beyond the capabilities of the desktop publishing system to track. For example, what if our XML allowed a part number inside a bolded phrase inside a table cell? If the desktop publishing tool doesn’t allow nested character styles inside a table cell, which is quite likely, then we would have to change the DTD and give up on tracking this level of information.
It is also possible that the order of tags in the DTD may be different than the way they lay out in the style sheet. For example, copyright information may be tagged at the beginning of the document but printed on the back page. This kind of relationship may not be possible to map, so the DTD might have to be changed to place copyright elements at the end, or the style sheet might have to be changed to place copyright notices at the front.
In another example, the DTD may require the figure title to be stored in an attribute of the containing “Figure” tag, but the title text might actually be placed below the image in the document. This kind of relationship may not be possible to map without either changing the DTD or placing figure titles above rather than below the images.
Assumption: XML information will not be lost on import
While the major information components will not be lost on import, many of the more subtle XML features can be. This is true because mapping is done to the desktop publishing style sheet elements, not on a character-to-character basis. For example:
Assumption: Authored content will not be lost on export
When authors add new content or change formatting information, it is assumed that this information will be properly exported using the XML mapping rules. However, most desktop publishing systems do not prevent you from adding things not covered by these rules. Why not? The reason, as we saw in the previous section, is that these rules are only loosely integrated with the internal desktop publishing engine.
Therefore, it’s possible to put in extra formatting that will get lost when you save as XML. You can also create new style elements without updating the mapping rules, which means the new content may not get exported to XML or may be exported improperly.
By trying to match visual WYSIWYG formats to XML information elements, authors can also easily get confused as to what’s being mapped. For example, they may know that a “Head2” is always used to start a Section. But what if they make a “Normal” style look like a “Head2” by clicking the bold and font buttons? They may never realize that their element is not a “Head2.” Remember, what matters in desktop publishing is what the content looks like, not how it is structured underneath.
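The following Python sketch illustrates both failure modes under simplified assumptions. The export rules, the "Callout" style, and the paragraph records are hypothetical; the only point is that an export driven by style names cannot see what the author saw on screen.

```python
import xml.etree.ElementTree as ET

# Hypothetical export rules: style name -> XML element name.
EXPORT_RULES = {"Head2": "Title", "Normal": "Para"}

# Each paragraph carries a style name, its text, and any direct
# (button-click) formatting the author applied on top of the style.
paragraphs = [
    {"style": "Head2", "text": "Unpacking", "direct": {}},
    # Looks like a Head2 on screen, but is still a "Normal" paragraph.
    {"style": "Normal", "text": "Placement", "direct": {"bold": True, "size": 14}},
    # A style the author created without updating the mapping rules.
    {"style": "Callout", "text": "New in this release.", "direct": {}},
]


def export(paragraphs, rules):
    """Export by style name only; return the XML and any text that was dropped."""
    root = ET.Element("Section")
    dropped = []
    for para in paragraphs:
        tag = rules.get(para["style"])
        if tag is None:
            dropped.append(para["text"])  # no rule, so the content never reaches the XML
            continue
        # Direct formatting is ignored: only the style name drives the mapping.
        ET.SubElement(root, tag).text = para["text"]
    return root, dropped


xml_root, dropped = export(paragraphs, EXPORT_RULES)
```

In this sketch the lookalike heading is exported as an ordinary Para, and the "Callout" text is silently left out of the XML, even though the page looked correct to the author.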
Assumption: Authors will not corrupt the XML while editing
Even when a desktop publishing system attempts to control the use of styles while editing, it is usually possible to break the mapping relationships, both purposely and without realizing it. When this happens, these issues will not be caught until XML is being created for export, at which point the document is declared in error, and the author is forced to debug the content using “XML parser” error messages.
Why can’t the desktop tool prevent these errors from happening when an XML editor can? The answer goes back to the basic XML mapping approach. While an XML editor directly tracks XML elements against the rules in an XML DTD or schema, the desktop publishing tool can only track the use of style elements. The relationship between style elements and XML elements is complex enough to make it virtually impossible to track all the implications. This means that even a well-trained, well-intentioned author can make mistakes, and the system simply can’t be smart enough to catch all of them.
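To show what "directly tracking XML elements against the rules" means, here is a deliberately simplified Python sketch. Python's standard ElementTree module does not validate against a DTD, so the content models below are hand-written regular expressions over child element names; they are hypothetical stand-ins for DTD rules, and no real XML editor is claimed to work this way.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models: each is a regular expression over the
# space-terminated names of an element's children.
CONTENT_MODELS = {
    "Chapter": r"Title (Para )*(Section )*",
    "Section": r"Title (Para )+",
    "Title": r"",
    "Para": r"",
}


def validate(element, path=""):
    """Check every element's children against its content model."""
    here = f"{path}/{element.tag}"
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return [f"{here}: element not declared"]
    children = "".join(child.tag + " " for child in element)
    errors = []
    if not re.fullmatch(model, children):
        errors.append(f"{here}: children [{children.strip()}] do not match model")
    for child in element:
        errors.extend(validate(child, here))
    return errors


good = ET.fromstring(
    "<Chapter><Title>A</Title><Section><Title>B</Title><Para>x</Para></Section></Chapter>"
)
bad = ET.fromstring("<Chapter><Section><Para>x</Para></Section></Chapter>")

good_errors = validate(good)
bad_errors = validate(bad)
```

Because the check runs on the elements themselves, each error can be reported at the exact element where the rule is broken. A style-based tool has no equivalent structure to check until the export rules have already run.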
So, are you feeling impatient to get information to the web more quickly? Is your organizational heartbeat running slow due to long publication cycles? Does your redundant document content need to lose weight? Or is it developing an increasingly split personality, saying different things on the web than it does on paper?
If so, you are exhibiting the symptoms of outgrowing desktop publishing.
In this article we have introduced the concept of enterprise publishing as the cure for these symptoms. Enterprise publishing focuses on four key needs shared by organizations in nearly every industry:
- The need to decrease authoring and maintenance time.
- The need to extend content sharing and reuse.
- The need to automate multichannel publishing.
- The need to publish customized information on demand.
Desktop publishing was not designed to meet these needs and was not designed to scale to the enterprise, even when “XML support” has been added. Only XML-based enterprise publishing provides the right kind of scalable and effective solution.
Enterprise publishing is a new way of looking at things. Rather than publishing documents independently in parallel processes, it uses a centralized approach in which all output is derived from a single source. Rather than focusing on WYSIWYG views of the printed page, it focuses on shared information components that can be flexibly recombined and delivered across multiple business units and multiple delivery channels.
Without doubt, desktop publishing has had long-standing and well-deserved popularity. However, the publishing world has become a very different place since 1985. Today’s businesses are outgrowing the personal creativity advantages of desktop publishing and are focused, instead, on leveraging enterprise publishing to extend their competitive reach, control their cost structures, and achieve an unprecedented level of flexibility.
1. Quoted by The Computer History Museum on www.computerhistory.org.
2. I will sometimes call these building blocks “information objects” or simply “objects.” But if you want to sound like a seasoned professional, you should refer to them as “chunks.”