Considerations for On-Demand, Web-Based Publishing
As in many other industries, web-based publishing is beginning to creep into technical publications. When it is the right fit, it can deliver significant and conspicuous benefits.
Defining “On-Demand” Web-Based Publishing
Perhaps the best way to begin a definition of on-demand web-based publishing is to first clarify what it is not. It is not the traditional model where published material takes the form of static files accessed locally from a file system, such as PDFs, compressed/compiled help files, Flash videos, and so on. Still dominant in the industry, static systems such as these are typically characterized by a single user’s interaction with a local information source whose content does not change following its generation.
A web-based system is quite different, represented by a single, centralized information source that is accessible to many users over a network. The technical content lives on a networked web server, where it is managed by software that listens for requests and distributes information as appropriate. Typically, this type of system is based on HTTP exchanges over TCP/IP, where a consumer uses a web browser to initiate requests and render the responses. The publication process is fully realized only once the content appears in the browser, hence the concept of “on-demand” publication as shown in Figure 1.
Figure 1: General Web-Based Publishing Diagram
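To make the request-and-response cycle concrete, here is a minimal sketch using only Python's standard library. The `CONTENT` table, the `respond` function, and the page paths are hypothetical names chosen for illustration; a real system would draw content from files, a database, or a CMS rather than an in-memory dictionary.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory content store, standing in for the
# "centralized information source" described above.
CONTENT = {
    "/": "<h1>Widget Manual</h1><p>Welcome.</p>",
    "/install": "<h1>Installation</h1><p>Steps go here.</p>",
}


def respond(path):
    """Return (status_code, html_body) for a requested path."""
    body = CONTENT.get(path)
    if body is None:
        return 404, "<h1>Not found</h1>"
    return 200, body


class DocHandler(BaseHTTPRequestHandler):
    """Answers each browser GET request with the matching page."""

    def do_GET(self):
        status, body = respond(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Listen for requests on the given port; blocks until interrupted."""
    HTTPServer(("", port), DocHandler).serve_forever()
```

Calling `serve()` and opening `http://localhost:8000/install` in a browser would complete the cycle. Nothing is "published" until that request arrives, which is the sense of "on-demand" used here.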
If this architecture reminds you of the internet, then you are on the right track. The underlying technology is hardly new. However, aside from placing existing static content on public websites, the use of this technology in the technical communication field is still somewhat new. To address curiosities related to this growing field, I present considerations for taking advantage of web-based technologies, with a special focus on the type of information typically delivered by technical communicators.
Before proceeding, it is important to note that a web-based architecture is not always appropriate. Static systems have their place and likely will remain relevant for the long term. Where web-based publishing is applicable, the benefits can be enormous and immediately recognizable. Where it is not, it could be a very costly mistake.
When discussing benefits, it is useful to group them into two categories: consumer access and publication flexibility. In this discussion, I also consider how the scope of a project affects both the need for web-based publishing and its implementation.
Access—More of It and Better Control
When you are handing out content from a web server, the obvious prerequisite is that any consumer must be connected to the network. The network could be anything from a small local area network (LAN) up to the public internet, depending upon who and where the target audience is. Once that requirement is met, though, the possibilities of enhancing content access are virtually limitless. Consider the following:
- Easy access. If you stay with mainstream technologies such as TCP/IP, HTTP, and HTML for the exchange of data, any computer with a web browser becomes a candidate for access. Today that effectively means any computer. Depending on the complexity of your system, you may find that browser compatibilities require consideration; however, the cost of any such considerations may be vastly outweighed by the improvement in access. Imagine… you turn on your system, send out a link, and everyone can immediately read your documentation. No one has to download or install anything. In some situations, that benefit alone may be worth the effort.
- Secure access. When you control the gateway to the information (the web server application in this case), you control who gets in and who sees what. While a technical discussion of authentication methods is beyond the scope of this article, it should be easy to understand that it is much more difficult and complex to control user access with a distributed file-based format, such as a PDF or a local help system.
- Interactive access. Somewhat related to secure access, you can tailor the information that is distributed based on access requests. For example, perhaps a user only wants to see the technical information on Widget A which is interspersed throughout your 20-widget manual. A web-based system is much more amenable to satisfying this type of request than a static system.
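As a rough illustration of secure and interactive access working together, the sketch below filters a manual down to the sections for one widget that a given user is allowed to see. The `Section` fields, the `audience` labels, and the sample manual are assumptions made up for this example, and real authentication is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    title: str
    widget: str    # product the section describes
    audience: str  # "public" or "internal" (hypothetical labels)
    text: str


# Hypothetical manual in which Widget A content is interspersed
# with content for other widgets.
MANUAL = [
    Section("Widget A overview", "A", "public", "What Widget A does."),
    Section("Widget B overview", "B", "public", "What Widget B does."),
    Section("Widget A service notes", "A", "internal", "Field repair steps."),
    Section("Widget B service notes", "B", "internal", "Field repair steps."),
]


def select_sections(manual, widget, user_is_internal):
    """Return the sections for one widget that this user may see."""
    return [
        section
        for section in manual
        if section.widget == widget
        and (section.audience == "public" or user_is_internal)
    ]
```

A server would run a selection like this before rendering a response, so an external reader asking about Widget A receives only the public Widget A sections, while an internal reader also receives the service notes.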
Flexibility and Accountability—Stretch Your Possibilities
For a static system, your flexibility ends once you hand content to the consumer. After that, you can’t change anything, other than to deliver a new publication. With a web-based system, however, your flexibility is ongoing. Consider the following possible advantages:
- Instant publishing of updates. If you set up your system to draw information from an active, centralized content source, your consumers will see changes as soon as you update that source. For example, if you design a system where authors work with an XML source and the web server generates HTML pages from those files on demand, you need only update the XML and all subsequent pages will reflect those updates. And, if you want older information to remain available, that also becomes your choice. The web server will distribute what you tell it to—nothing more, nothing less.
- Dynamic processing upon request. Even if your system is based on the delivery of HTML pages, you do not necessarily have to store your content in HTML format. When information is requested, your web server can dynamically create and deliver the content. Perhaps you want your content to live in XML format and to have your web server apply XSLT stylesheets to generate the HTML output on-demand. The key point is that the server is a computer that can do whatever somebody programs it to do. It will deliver the content as instructed, and the end user never has to know anything about your XML (or anything else on your web server).
- Usage statistics (haven’t you always wanted to know?). As the intelligent hub of your publishing system, a web server application can keep track of what information is requested and when.
- Potential for real interaction with users. Web-based communication is a two-way street. Most notably, the lane that heads toward your web server can carry much more than simple page requests. You might want something basic, such as a simple commenting facility or an advanced collaborative authoring environment such as a wiki. In either case, web server architecture opens the door for such interaction.
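The dynamic-processing and usage-statistics points can be sketched together. The example below builds HTML from an XML source at request time and counts each request. It uses Python's built-in ElementTree rather than XSLT, because the standard library has no XSLT processor; the element names (`manual`, `topic`, `title`, `para`) and the `render_topic` function are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from html import escape

# Hypothetical XML source; the element names are invented for this sketch.
SOURCE_XML = """
<manual>
  <topic id="install"><title>Installation</title><para>Run the installer.</para></topic>
  <topic id="config"><title>Configuration</title><para>Edit settings &amp; restart.</para></topic>
</manual>
"""

# Usage statistics: topic id -> number of times it was rendered.
hits = Counter()


def render_topic(xml_text, topic_id):
    """Build an HTML fragment for one topic, or return None if it is missing."""
    root = ET.fromstring(xml_text)
    topic = root.find(f"./topic[@id='{topic_id}']")
    if topic is None:
        return None
    hits[topic_id] += 1
    title = escape(topic.findtext("title", default=""))
    paragraphs = "".join(
        f"<p>{escape(para.text or '')}</p>" for para in topic.findall("para")
    )
    return f"<h1>{title}</h1>{paragraphs}"
```

Because the HTML is built from the source on every call, an edit to the XML shows up in the very next response, and `hits` records which topics readers actually ask for. A production system would likely cache rendered pages and persist the counts, both of which are omitted here.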
The potential problems with adopting web-based publishing are similar to those of adopting any new technology. For example, you might discover that:
- You never really needed it. Who can say how much time and money the gee-whiz vampire has drained through the ages?
- You don’t have the skills to plan, build, or maintain it. Dynamic publishing may be the beginning of a paradigm change. It could test your skills, the skills of your staff, or your budget for hiring people who have those skills.
As always, careful planning should allow you to avoid these problems. Naturally, any major change will involve some risk, so don’t expect a bump-free road. The idea is to ensure that the cost of mistakes is eventually outweighed by the payoff in benefits.
Tools and Technologies
To implement a web-based system, you will need some new toys. I broadly address some technical considerations, mainly as a starting point for further research.
Obviously, you’ll need a physical web server. A web server is nothing more than a computer on a network that has special software to handle web exchanges. This software, discussed later, listens for requests and responds as programmed.
If you are deploying a small system on your office LAN, the web server could conceivably be your workstation PC (local IT policies notwithstanding). Conversely, if you are publishing to the world, perhaps you need a hefty third-party computer on some giant internet server farm. In either case, considerations include:
- Host network type. Your consumers must have network access to your server. Typically, the smallest possible network is the most appropriate, as it usually reduces management and maintenance overhead. For example, if all your consumers have internet access but are also on the same LAN, a local server might be more cost-effective and manageable than an internet-based deployment.
- Operating system (OS) platform. The OS is related to the type of web software you intend to run. Any mainstream OS is a viable alternative if the intended software will run on it. In heavy-use production environments, Linux and UNIX are often favored for their stability and, in the case of Linux, for being open source.
- Capacity. The basic distribution of HTML web pages requires few resources. For a simple web-based system, you should be able to share an existing computer that is already dedicated to other tasks. On the other hand, if your system performs heavy dynamic processing (XSLT, database queries, and so on), you will require more system resources. Naturally, the burden on the system increases proportionally with the amount of use, especially simultaneous use. If you plan to deploy a heavyweight publishing system, you should consult an IT expert first.
The web server software is likely to represent the greatest challenge for planning and development. You must choose or build software that generates what users see when they request your content. The software is the intelligence that drives the entire process, which, in the end, tends to reflect the intelligence of those who designed it (for better or for worse).
Complicating the planning process is the fact that the permutations of software are endless. In fact, you may choose to develop your own entirely, in which case the planning process is completely open-ended. More than likely, though, you will use an off-the-shelf (OTS) product, whether as the main application(s) or as a framework on which to build your own customizations.
If you are looking for a (virtually) turnkey OTS solution, your options are limited. Furthermore, you likely will be restricted in your options for data storage and output format. A primary example is a TWiki® system, designed not only for web publishing but also for collaborative web-based authoring. For the most part, it is ready to deploy; however, without customization, you will be stuck with its current functional model. TWiki is very good at what it was designed for, but it may not be appropriate for general technical publications.
The more likely candidate is a “generic” web server that you can customize into a system that is distinctly yours. This type of software has built-in functionality to handle the low-level tasks associated with the receipt of requests and the packaging/transmission of response data, but virtually everything in between is your responsibility to develop. This means that you design all source and output formats, giving you full control over the appearance of published content. For simple web-delivery systems, you may only need to generate a few stylesheets. For more complex efforts, you may need to develop your own software code. The exact requirements in any given case are directly related to the functionality desired.
Options for OTS web server software include, but are not limited to, the following:
- Microsoft Internet Information Services (IIS). IIS is a mainstream commercial package with a strong track record and rich feature set. Advantages include its support structure and integration with other Microsoft products and technologies. Disadvantages may include its price tag (especially when compared to other options that are open source) and the fact that it operates only on the Windows platform.
- The Apache Web Server and the Apache Tomcat “servlet container.” Both from the Apache Software Foundation, these packages are open source, extremely robust, and currently in heavy production use around the world. For the enterprising do-it-yourselfer, one of these servers may be the best option, due to their ready availability and very capable feature set. In addition, these servers can run on many different operating systems, including Windows, UNIX, and Linux. Disadvantages include the need for substantial technical expertise and the fact that open-source software tends to come with less formal support, such as detailed documentation and dedicated support staff.
Please note that I provide a very high-level overview of potential options, as a starting point only. The actual deployment of any web server is no trivial task. With proper planning and perseverance, though, an enterprising individual or team with moderate technical skills should eventually be able to pull it off.
Considerations of Scope and Scaling
While predicting the future is never possible, you should always make an effort to implement a system that will scale to future needs, especially if you expect use to grow. The following are some questions you might ask yourself during this process:
- Will the software and hardware I’ve chosen support the anticipated demand from users?
- Will the system allow me to readily expand the amount of content published, especially with respect to automation? For example, if I am planning a very simple system where I externally generate HTML pages and then post them on the server, will I be able to handle this workflow if the number of pages doubles? If the answer is no, do I have options to automate part or all of the page generation process? Even better, can I move toward dynamic generation of the HTML by the web server itself?
- Will the overall architecture fit with other processes and workflows? For example, if I expect a movement to a large-scale CMS in the next few years, will the fancy new web server remain compatible or relevant afterwards?
- Is the structure of my source content adequate to meet present and future requirements? For example, if I am considering XML, should I prefer a standard like DITA for its built-in reuse capabilities and community support? Or, would a custom-built structure definition be the better choice?
- Do I have the staff to maintain this system going forward? If not, do I have the resources to pay for outside help?
While it is difficult to fully answer any of these questions, considering issues such as these should at least improve your planning and decision-making process.
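The question about automating page generation can be explored with a small batch converter: if every page is produced by the same function, doubling the number of source files adds no manual work. This is a sketch only. The one-topic-per-file layout, the element names, and the functions `topic_to_html`, `build_pages`, and `write_pages` are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape
from pathlib import Path


def topic_to_html(xml_text):
    """Convert one hypothetical <topic> document to a complete HTML page."""
    root = ET.fromstring(xml_text)
    title = escape(root.findtext("title", default="Untitled"))
    body = "".join(
        f"<p>{escape(para.text or '')}</p>" for para in root.findall("para")
    )
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1>{body}</body></html>"
    )


def build_pages(sources):
    """Map {'name.xml': xml_text} to {'name.html': html_text}."""
    return {
        Path(name).with_suffix(".html").name: topic_to_html(text)
        for name, text in sources.items()
    }


def write_pages(pages, out_dir):
    """Write each generated page into out_dir, ready to post on the server."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, html in pages.items():
        (out / name).write_text(html, encoding="utf-8")
```

The same `topic_to_html` function could later be called by the web server itself at request time, which is one possible path from batch generation toward fully dynamic generation.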
A Real-World Case Study
Spirent Communications of Rockville, MD manufactures hardware and software products for broadband service providers who use them to test their networks and diagnose problems. Often, a complete solution is very complex, with many different components in different locations interacting in a carefully orchestrated manner. Because of this interaction, an employee working on any particular component frequently requires the absolute latest technical data on other components. Additionally, following an official release, the final technical data is required by internal and external personnel alike.
Historically, though, the Technical Publications department had a problem—its publishing model did not support the immediate needs of internal consumers. It did provide comprehensive documentation at a product or solution release, but by that time it was too late to meet the internal demand during the development cycle.
The result was predictable: heavy redundancy. Because internal engineers and testers couldn’t get the information they needed from TechPubs, they maintained and shared their own “personal” documentation. Over the years, the cost of this duplicated effort became substantial, especially when coupled with the confusion that often erupted due to multiple information sources. A less tangible cost was the effect on the reputation of TechPubs, which seemed to only partially fulfill the mandate inherent in its title.
To fix this problem, TechPubs had to figure out a way to serve both internal and external consumers, both during and after the release cycle. The answer took shape with the adoption of XML as a source format and the help of a web server.
- TechPubs took control of all technical content and compiled it into a single XML-based data source.
- The active data source was integrated with a web server on the corporate LAN, from which internal consumers could retrieve the absolute latest content on demand at any time. The choice of XML as a data source format was deliberate in anticipation of this architecture, as XML provides a straightforward path to HTML via XSLT.
- The new data source was also developed to remain compatible with the existing static publication workflow, such that traditional publications could still be generated at product release. Again, XML was chosen specifically for its ability to produce PDF and other static formats.
Consider the diagram in Figure 2. With the implementation of this system, TechPubs was able to take complete control of the information source yet still provide immediate access to all consumers who needed the latest content on demand. With the new availability of content, engineers were able (and quite happy) to abandon their private documentation sources and focus on engineering. Furthermore, the system had the unexpected side benefit of allowing technical content to move through the development process like the rest of the product features. With all parties using (and refining) the same documentation set during development, the content is heavily reviewed and effectively ready for final, static publication as soon as the product release date arrives. Through the benefits provided by this system, Spirent has realized dramatic cost savings within TechPubs and throughout the organization. What’s more, the system has also led to an increase in quality, both in content accuracy and user experience. Through it all, the web server was and remains the key. Without a move towards centralization and on-demand access, none of these successes would have been possible.
Figure 2: Spirent Web-Based Publishing Solution
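The single-source idea at the heart of this case study can be illustrated in a few lines: one XML source feeds both an on-demand HTML view and a static rendering. This is not Spirent's implementation, whose details are not given here. The element names and functions are hypothetical, and plain text stands in for PDF or other static formats.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical topic; the element names are invented for this sketch.
SOURCE = (
    "<topic><title>Widget A Setup</title>"
    "<para>Connect the cable.</para>"
    "<para>Power on the unit.</para></topic>"
)


def parse_topic(xml_text):
    """Return (title, [paragraph texts]) from the shared source."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    paragraphs = [para.text or "" for para in root.findall("para")]
    return title, paragraphs


def to_html(xml_text):
    """On-demand output: an HTML fragment for the web server to return."""
    title, paragraphs = parse_topic(xml_text)
    body = "".join(f"<p>{escape(text)}</p>" for text in paragraphs)
    return f"<h1>{escape(title)}</h1>{body}"


def to_plain_text(xml_text):
    """Static output: plain text, standing in for a release-time format."""
    title, paragraphs = parse_topic(xml_text)
    return "\n\n".join([f"{title}\n{'=' * len(title)}"] + paragraphs)
```

Both functions read the same source, so a correction made once appears in the internal web view immediately and in the static publication at the next release build.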
Even if you don’t see an immediate need in your organization, you should at least consider staying up-to-date with related industry trends in this area. Some major players have already adopted web-based methodologies on a large scale; for example, have you noticed what happens lately when you launch the Help for a Microsoft or Adobe application? Perhaps you have also heard about the movement toward “cloud computing,” another web-based environment? Web-based technologies show no sign of decline, so it is likely that they will (and should) cross your path eventually.
Russ Ward is an experienced technical writer and structured technologies developer. He has spent many years working with structured content to maximize efficiency in the techcomm environment, both as an employee and as an independent consultant. He is also an experienced trainer and speaks frequently at conferences and other peer events.