How Do Content Management Systems Work?

CIDM

December 2000



Tina Hedlund, Senior Consultant, Comtech Services

Introduction

Single sourcing, or content management, is the current hot topic in technical publications. You want to manage content, increase efficiency, reduce costs, and deliver information more effectively to customers. But single sourcing also means becoming familiar with an entirely new range of technologies: XML authoring tools, database repositories, and Web server publishing environments. It’s enough to give a senior manager a migraine.

The complexity of content management systems means that you need to delve into unfamiliar territory, even if you rely on the techies on the staff to do the in-depth investigations. You are still responsible for the decision-making. Is component management better than document management for your needs? Should you author in XML? Do you need a database? What kind of database is best: relational or object-oriented?

“Has someone written Database Publishing for Dummies?” you cry.

In this discussion, we hope to shed some managerial light on the technology issues. It’s sort of Content Management Systems for Managers (who aren’t dummies but have too little time for a major education into database technology).

A complete content management system consists of three essential parts (see figure below):

  • an authoring front-end
  • a repository or database to store the content
  • a back-end publishing system

You can purchase one end-to-end solution from a single vendor, or you can mix and match three or more different solutions to create a unique tool to meet your requirements. Let’s look at each of the parts in turn.

The Authoring Front-end

We have become accustomed to selecting authoring tools to accommodate both ease of input and ease of output. A typical tool like Adobe’s FrameMaker provides a sound word processing system with strong support for print publication. Given a few add-on products, we have used FrameMaker to produce HTML for browser access in addition to the ubiquitous PDF for Web and CD-ROM delivery.

FrameMaker is a typical desktop publishing system. It is based on a book/page paradigm and closely links authoring with publishing. We are now looking at authoring tools that allow writers to structure their documents easily and create modular information using standardized XML templates (DTDs or Document Type Definitions). We need tools that support adding attributes to components and providing links to related information. Even though we may not be using the authoring environment for final production any longer, we still want systems that have usable on-screen formatting so that our writers don’t have to author in code. This interim on-screen formatting is often referred to as a rendering, a temporary formatted view that makes writing and editing easier.

In addition to the pure authoring environment, we want authors to have easy access to the middle ground: the document repository or database. In most cases, authors should be able to access the content repository through plug-in software that is fully integrated into the authoring tool. The plug-in is transparent to the authors; they simply check content into and out of the repository by clicking File/Open and File/Save. The plug-in connects to the component-management system and allows the authors to access and view content in ways that are already very familiar to them. Usually the plug-ins are created by the content management system vendor, but they can also be created and maintained by the authoring tool vendor.

When the authors check their documents into the component-management system, the documents are converted into reusable components based on a set of rules that you have established either within the authoring tool or the component-management system. The rules determine how components are stored. You may want to choose a component size based on a heading level, or you may want to make every tagged item into a component. Some component-management systems allow you to determine the rules yourself and some have a set system that you are required to use.
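
To make the idea of a component rule concrete, here is a minimal sketch in Python. It is not how any particular component-management system works; the sample document, the element names (guide, section, para), and the function split_into_components are all invented for illustration. The rule is simply the name of the element at which the document is cut.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the element names are assumptions for this sketch.
SOURCE = """<guide>
  <section id="intro"><title>Introduction</title><para>This User Guide will help you.</para></section>
  <section id="replace-board"><title>Replacing a memory board</title><para>To replace a memory board, power down first.</para></section>
</guide>"""


def split_into_components(xml_text, component_tag):
    """Return {component id: serialized XML} for every element named component_tag."""
    root = ET.fromstring(xml_text)
    components = {}
    for index, element in enumerate(root.iter(component_tag)):
        # Fall back to a positional id when the element carries no id attribute.
        component_id = element.get("id", f"{component_tag}-{index}")
        components[component_id] = ET.tostring(element, encoding="unicode").strip()
    return components


# Rule 1: one component per section (roughly "split at a heading level").
by_section = split_into_components(SOURCE, "section")
# Rule 2: a finer-grained rule that makes every paragraph its own component.
by_para = split_into_components(SOURCE, "para")

print(sorted(by_section))
print(len(by_para))
```

The same document yields two section-sized components under the first rule and two paragraph-sized components under the second. Choosing the rule is a decision about how small a unit you expect to reuse.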

Management System

The second part of the three-part structure is the component-management system, often referred to as a content-management system. The content-management system (CMS) is simply a software layer that controls access to the database, whether relational or object-oriented. Key file-management functionality is usually built into the CMS, including

  • check in/check out so that only one person can update a file at a time
  • password security to restrict access to files
  • versioning to maintain a history of all changes to the content
  • the ability to integrate with a variety of different workflow-management software packages
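
The first and third functions in the list above, check in/check out and versioning, can be sketched in a few lines of Python. This is an illustration only, under the assumption of a single in-memory store; the class names Repository and CheckoutError and the user names are hypothetical, and a real CMS would add password security, persistence, and workflow hooks.

```python
class CheckoutError(Exception):
    """Raised when a check-out or check-in breaks the one-editor-at-a-time rule."""


class Repository:
    def __init__(self):
        self._versions = {}  # component id -> list of saved contents, oldest first
        self._locks = {}     # component id -> user who currently has it checked out

    def add(self, component_id, content):
        self._versions[component_id] = [content]

    def check_out(self, component_id, user):
        holder = self._locks.get(component_id)
        if holder is not None and holder != user:
            raise CheckoutError(f"{component_id} is checked out to {holder}")
        self._locks[component_id] = user
        return self._versions[component_id][-1]  # latest version

    def check_in(self, component_id, user, new_content):
        if self._locks.get(component_id) != user:
            raise CheckoutError(f"{user} has not checked out {component_id}")
        # Versioning: append rather than overwrite, so history is kept.
        self._versions[component_id].append(new_content)
        del self._locks[component_id]

    def history(self, component_id):
        return list(self._versions[component_id])


repo = Repository()
repo.add("intro", "This User Guide will help you.")
repo.check_out("intro", "jane")
try:
    repo.check_out("intro", "john")  # a second author is refused
    blocked = False
except CheckoutError:
    blocked = True
repo.check_in("intro", "jane", "This User Guide helps you replace parts.")
print(blocked, len(repo.history("intro")))
```

The lock ensures that only one person can update a component at a time, and every check-in adds a new entry to the history instead of replacing the old content.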

The CMS vendor provides software that allows you to connect your authoring tool to different database technologies. This software allows the author to access the information in the database through the CMS and then store the content after it has been updated or changed.

Database Technologies

The database/repository tier of the three-part solution is often the most difficult to understand. Database technology has long been the realm of IT departments and far removed from anything information-development managers have needed to work with.

As you and your staff begin to investigate single-source technology, you’ll quickly learn that you can choose CMSs that work with either relational or object-oriented databases. Today, you’re more likely to be offered a solution that is a hybrid of both.

Relational databases
The relational model was introduced in 1970, and relational databases have been the standard in database technology since the 1980s. The relational database model made it possible to maintain data integrity and to store reliable, consistent, and accurate data. Data is stored in tables much like a spreadsheet, and the database locates the content of fields, records, reports, or any combination of the data by working out where the data exists within a table. The example below illustrates how data is stored in a relational database.

 

ISBN         Book Title            Publisher         Price
1571691618   C Primer Plus         Sams Publishing   29.99
1566090156   Windows 3.1 Bible     Peachpit Press    28.00
1565298268   Using the Macintosh   Que               34.99

The ISBN, Book Title, Publisher, and Price are fields that can be thought of as columns in a spreadsheet. Each “book” is a record, or a collection of fields describing an object; each book can be thought of as a row in a spreadsheet. Multiple tables can exist, and the relationships between the data are maintained through something called keys.

 

ISBN         Checked Out to   Due Date
1571691618   John Smith       February 3
1566090156   Jane Doe         January 10
1565298268   Cindy Black      March 1

In this example, the key that establishes the relationship between the two tables is the ISBN field, which contains the same data in both tables. By establishing a relationship between the two tables, it is possible to access data from both tables. Every time data is requested, the database goes out and individually locates each data request and assembles the content to be viewed.
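
For readers who want to see the key relationship at work, the sketch below loads the two example tables into an in-memory SQLite database using Python's standard sqlite3 module and joins them on the ISBN. The table and column names (books, loans, checked_out_to, and so on) are our own choices for illustration, and the ISBNs are used as they appear in the second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the example
conn.executescript("""
CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, publisher TEXT, price REAL);
CREATE TABLE loans (isbn TEXT REFERENCES books(isbn), checked_out_to TEXT, due_date TEXT);
""")
conn.executemany("INSERT INTO books VALUES (?, ?, ?, ?)", [
    ("1571691618", "C Primer Plus", "Sams Publishing", 29.99),
    ("1566090156", "Windows 3.1 Bible", "Peachpit Press", 28.00),
    ("1565298268", "Using the Macintosh", "Que", 34.99),
])
conn.executemany("INSERT INTO loans VALUES (?, ?, ?)", [
    ("1571691618", "John Smith", "February 3"),
    ("1566090156", "Jane Doe", "January 10"),
    ("1565298268", "Cindy Black", "March 1"),
])

# The ISBN key ties each loan record to its book record.
rows = conn.execute("""
    SELECT books.title, loans.checked_out_to, loans.due_date
    FROM books JOIN loans ON books.isbn = loans.isbn
    ORDER BY books.title
""").fetchall()

for title, borrower, due in rows:
    print(f"{title}: {borrower}, due {due}")
```

Neither table alone can answer the question "who has which title, and when is it due?" The join assembles the answer at request time, which is the behavior described above.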

Relational content-management repositories or databases work in the same way but with more data in the fields than is typically used in financial database applications. For example, if your department produces content for a User’s Guide that contains an introduction, conceptual information, and a procedure, the content would be represented in a relational database table like this:

 

Introduction                     Conceptual Information   Procedure
This User Guide will help you…   The theory of…           To replace a memory board…

It’s nice to know a little about the database structures behind a relational database, but this understanding is not necessary for you to use the database. The CMS you purchase will create the table structures for you. You won’t need a Database Administrator (DBA) to create the data table structures like you might if you were developing a typical financial database application. The storage and retrieval of all data is handled through the CMS.

What you might need to know is that retrieving all the data from storage spots in relational tables is sometimes a slow process. That means that a pure relational database may be slower in terms of search and retrieval than a similarly sized object-oriented database. The speed of search may be important for your authors when they look for reusable content. However, it is likely to be more important in a Web-based publishing environment in which content is being retrieved from the database when the users need it.

Despite a possible speed disadvantage, relational databases have many advantages in a CMS environment because you can spread the databases over multiple servers. If you have a large staff or your staff is spread out across many geographies, a relational database may be the best solution.

Object-oriented databases
The other primary player in the database world is the object-oriented database. Object-oriented databases have their roots in applications that manipulate large, complex data, such as geospatial and financial trading applications. One of the most obvious differences between relational and object-oriented databases is the ability of object-oriented databases to store larger pieces of data. Relational databases typically have character or file length restrictions that object-oriented systems do not. The absence of character/file restrictions allows you to store large data files in the object-oriented database, especially complex graphics and multimedia files.

Additionally, object-oriented databases, which are typically programmed in object-oriented programming languages like C++, work more efficiently with large objects. The object-oriented programming layer that sits on top of the database can be less complex to create since it integrates more naturally with the database design. Operations can be performed directly on the objects without translation. Putting large objects into relational tables also requires about 25% more time and programming resources, because the complex data structures have to be mapped to the less complex relational table structures.

Because they store objects rather than small pieces of data, object-oriented databases do not store data in the same table structures as relational databases. Bundled with the objects are object identifiers (OIDs) generated by the database system, which make the data accessible for quick manipulation by applications. Relational databases rely on the key values to identify data and lack the built-in data identity that object-oriented systems offer.

This object-oriented approach to storing items also helps when delivering modular content on the Internet. Individual modules, that is, introductions, concepts, or procedures, can be stored individually within an object-oriented database, just as they can in a relational database. However, the strength of an object-oriented database is the ability to mix and match those modules in a way that persists in the database. When a larger object, made up of smaller objects or information content units, is created once, it does not have to be rebuilt every time a request for the larger object is made. By storing collections of smaller objects, the database can deliver information faster and more efficiently.
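
The following Python sketch illustrates the two ideas just described: system-generated object identifiers, and larger objects that persist as collections of smaller ones. It is a toy in-memory model, not the design of any real object-oriented database; the class ObjectStore and its methods are hypothetical names chosen for this example.

```python
import itertools


class ObjectStore:
    """Toy object store: every stored object gets a system-generated OID."""

    def __init__(self):
        self._objects = {}
        self._next_oid = itertools.count(1)

    def put(self, value):
        oid = next(self._next_oid)  # identity comes from the store, not from the data
        self._objects[oid] = value
        return oid

    def get(self, oid):
        return self._objects[oid]

    def put_composite(self, member_oids):
        # A composite is stored once, as an ordered tuple of member OIDs.
        return self.put(tuple(member_oids))

    def assemble(self, oid):
        value = self.get(oid)
        if isinstance(value, tuple):  # composite: follow the stored OIDs
            return "\n".join(self.assemble(member) for member in value)
        return value


store = ObjectStore()
intro = store.put("This User Guide will help you...")
concept = store.put("The theory of...")
procedure = store.put("To replace a memory board...")

# Two deliverables reuse the same modules; each composition persists in the store.
guide = store.put_composite([intro, concept, procedure])
quick_ref = store.put_composite([procedure])

print(store.assemble(guide))
```

The guide's composition is saved once under its own OID, so a later request only follows the stored identifiers; nothing has to be searched for or matched by key value. The procedure module is stored a single time even though both deliverables include it.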

Another characteristic may increase database speed and efficiency in the object-oriented world. With relational databases, searches are always processed on the server. Object-oriented database searches can also be processed on the client machine. By doing work on the client machine, object-oriented databases free up memory and hard disk resources on the server, allowing for efficient processing of data on the server.

Critics of object-oriented databases cite their relatively new arrival in the database world as the biggest reason not to consider this technology. Relational databases have years of development dollars behind them, making them very robust and reliable. What you learn, as information-development managers, is that the decision between a relational or an object-oriented database for a CMS is not obvious. Often, you will make the decision based on other factors such as the familiarity of the IT organization with relational but not object-oriented technology. Or, you may already have a relational database that you can use for the CMS.

There is still another option, however. Because many object-oriented features can be recreated in relational databases in a less integrated, but not necessarily less effective way, we can choose from relational/object-oriented database hybrids.

Hybrid database solutions
Many database vendors have begun delivering hybrid solutions that they hope combine the strengths of both models. These systems store data in relational database table structures, but they index and search data based on an object model. Software extensions are added to relational database systems to circumvent the data or character limits usually associated with relational systems, thus allowing for the storage of larger pieces of data. Many of the primary database vendors like Informix, IBM, Hewlett-Packard, Unisys, Oracle, and UniSQL offer object-oriented databases that work in this way but still translate the application language structures into database structures.

Because pure object-oriented databases still make up a small but growing portion of the database industry, many relational database vendors hoped that this “update” strategy would help them compete for the object-oriented database market. However, those who need to manipulate large pieces of data tend to buy object-oriented systems, and those who do not need to manipulate large pieces of data are staying with the tried and true relational database systems.

Publishing tools

The first part of our three-part structure is the authoring system, the second part is the database and the CMS, and the third part is the publishing system. In many cases, a complete solution will include tools to process content into print, Web, CD-ROM, or any number of output formats, but these tools do not necessarily need to be purchased as part of the CMS. Many vendors, such as Arbortext, sell an XML authoring tool and publishing tools, but do not provide CMS or database products.

To deliver to the Web, most CMSs integrate with Web servers so that the most up-to-date content can be delivered dynamically to the Web from the content repository. The content is usually translated on-the-fly for presentation in a browser.

The step between the repository and the publishing system requires the participation of the information developers, however. You will need to define the mapping between the way the content is stored, perhaps as XML objects, and the way you want it to be formatted in a variety of output environments. On output, the XML tags are translated into PostScript, HTML, WML, HTML Help, or any other format that your users need from you.
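
A mapping of this kind can be as simple as a lookup table from your XML element names to the tags of the output format. The Python sketch below translates a small XML topic into HTML. The element names (topic, title, steps, step, para) and the mapping itself are assumptions made up for this example; production systems typically use a stylesheet language or the publishing tool's own mapping facility rather than hand-written code.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from stored XML element names to HTML tags.
HTML_MAP = {"topic": "div", "title": "h1", "para": "p", "steps": "ol", "step": "li"}


def to_html(element):
    """Recursively translate one XML element and its children into HTML."""
    tag = HTML_MAP.get(element.tag, "span")  # unmapped elements fall back to <span>
    parts = [f"<{tag}>", escape(element.text or "")]
    for child in element:
        parts.append(to_html(child))
        parts.append(escape(child.tail or ""))  # text that follows the child element
    parts.append(f"</{tag}>")
    return "".join(parts)


SOURCE = (
    "<topic><title>Replacing a memory board</title>"
    "<steps><step>Power down.</step><step>Remove the cover.</step></steps></topic>"
)

html_out = to_html(ET.fromstring(SOURCE))
print(html_out)
```

Supporting a second output format would mean supplying a second mapping, while the stored content stays untouched.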

Conclusion

The comprehensive solution you select is determined by the output you want to produce and the customers you support. The authoring tool, content management system, database technology, and publishing tools are all determined by the kind of publishing you want to do. Some tools work better together; for example, a CMS written in an object-oriented programming language will work more efficiently with an object-oriented database. Many Web-based publishing systems rely upon dynamic delivery of content to support users, which may require the type of fast access that you can achieve through an object-oriented choice. But your choice needs to be balanced with the recognition that relational databases are often more stable, more reliable, and easier for a traditional IT department to maintain. Once your information model is in place and the information is structured for optimal reuse, it’s time to turn your focus to the authoring tools, content management system, and publishing tools you need. Remember that the real work will be to create a new development process and design new information products. The technology is the easy part. Just ensure that the technology you select really does meet your needs and your user requirements.

Remember that many vendors do not provide complete end-to-end solutions but have agreements and partnerships with other vendors. The agreements allow them to integrate easily with other software applications to produce the solution you are looking for.
