DCLNews Editorial January 2010
Republished with permission from Data Conversion Laboratory, Inc.

XML and SGML standards are growing in popularity as a way to manage large sets of documentation. But calculating just how much it will cost to convert your documents to XML or SGML can be confusing; a multitude of factors interact to determine the per-page price of any conversion project.

As if this weren’t complicated enough, common myths and misconceptions about document conversion prices make it even more difficult to arrive at a realistic estimate.

To give an extreme example, one DCL representative at a military conference heard conversion costs being estimated at $300 per page. True, military documents must be ITAR compliant and are therefore considerably more expensive to convert than documents that can be handled offshore. But even for an ITAR-compliant military conversion, $300 per page greatly exaggerates the typical per-page price.

XML/SGML Standards
Air Force MIL-STD-38784
Army MIL-STD-2361
Marine Corps USMC V1
NAVSEA Class 2 (C2)
Structured Product Labeling (SPL)
TEI and TEI Lite

The misinformation runs both ways, it seems; it’s also not unusual to find those who think that automated conversion tools are magic bullets that allow for perfect conversions to be performed in-house, at the push of a button, and for only the cost of the software. This too is misleading.

These myths surrounding conversion costs do a disservice to everyone who stands to benefit from more efficient, more functional data. The conversion that costs $300 per page is a rare case indeed. In fact, even the more expensive ITAR-compliant conversions required for military documentation rarely cost the client more than $10–20 a page. (ITAR-regulated conversions always cost more because the documents and their data must not leave U.S. soil.) For materials that can be handled offshore, prices can drop to as low as $2–8 a page.
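
To put the per-page figures above in project terms, here is a minimal Python sketch that multiplies a page count by the ranges quoted in this article. The function name `estimate_cost`, the `PRICE_RANGES` table, and the 5,000-page project size are illustrative assumptions, not DCL pricing tools, and the ranges are rough typical figures rather than quotes.

```python
# Per-page price ranges in USD, taken from the figures quoted in the article.
# These are rough typical ranges, not a price list.
PRICE_RANGES = {
    "itar": (10.0, 20.0),     # ITAR-compliant work that must stay on U.S. soil
    "offshore": (2.0, 8.0),   # materials that can be handled offshore
}


def estimate_cost(pages, handling):
    """Return a (low, high) project cost estimate in USD.

    `handling` must be a key of PRICE_RANGES (hypothetical labels).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    low, high = PRICE_RANGES[handling]
    return pages * low, pages * high


if __name__ == "__main__":
    # Hypothetical 5,000-page documentation set.
    for handling in PRICE_RANGES:
        low, high = estimate_cost(5000, handling)
        print(f"{handling}: ${low:,.0f} to ${high:,.0f}")
```

For the assumed 5,000-page set, this gives $50,000 to $100,000 for ITAR-compliant handling and $10,000 to $40,000 offshore. The other variables described in this article would move a real project within or outside these ranges.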


Even once the price difference between military, ITAR-compliant conversions and regular, non-ITAR regulated conversions is taken into account, there are still many other variables that affect the price of any given conversion project.

To make the topic more manageable, DCL has assembled the following summary of conversion cost variables and their effects. Although these variables and their effects are fairly consistent across a wide range of conversion projects, it is also true that different industries have unique conversion needs and their own corresponding cost factors.

If you would like a detailed account of conversion cost factors specific to the military and commercial technical documentation community, please download our free white paper on conversion costs. For more information on how your specific conversion needs will affect your project’s per-page price, please contact us and we will be happy to provide you with pricing information based on your project’s unique requirements.

Source material

As a general rule, the more sophisticated the source format, the less work needs to be performed in a conversion, and the lower the per-page conversion price will be. Paper and image-only PDF are the most expensive source formats to convert, PDF Normal files are a little cheaper, and documents created by word processors and publishing systems are the cheapest. If your source is already in XML/SGML and you are converting to another XML/SGML format, your conversion might be done very cheaply, provided your project is large enough (thousands of pages) and your source documents contain all the information required by your target format.

Target format

Some target formats are more labor-intensive than others. Structure-based DTDs or schemas identify chunks of information by their structure, so the conversion process is fairly straightforward and conversions to these formats are cheaper. Content-based DTDs or schemas cost more because they are more complicated to convert: tagging in a content-based DTD must take into account the actual roles played by given chunks of data.

Manual types

Since each manual type must be tagged in its own way (necessitating a sort of “mini-conversion”), the more manual types you are converting, the more your conversion will cost. Similarly, the more target DTDs or schemas, the higher the per-page price. Some specific types of manuals cost more to convert than others due to their different tagging requirements. The degree to which the source manual type conforms to the target specification also has a bearing on conversion cost; the greater the conformity, the lower the per-page cost.

Technical content

If you are converting to a content-based DTD or schema and your documentation set includes highly technical or subject-specific material, a review by experts in the field may be necessary in order to ensure that the content is correctly interpreted and tagged. Those performing quality assurance may also need to be familiar with the documentation subject so that they can notice and correct any errors that may have occurred during conversion.

The services of content experts are more expensive than the services of those with more general knowledge. As a result, conversions that require content expertise will cost more, and those that do not require any subject-specific expertise will cost less; however, the additional cost of content experts can often be greatly reduced by separating the content into portions that require expert review and portions that do not.
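
The saving from separating expert-review content from the rest can be illustrated with a small calculation. The following Python sketch is only an illustration: the function `blended_review_cost`, the per-page review rates, and the 25% expert share are hypothetical values chosen for the example, not figures from DCL.

```python
def blended_review_cost(pages, expert_fraction, expert_rate, general_rate):
    """Review cost in USD when only part of the content needs expert review.

    expert_fraction: share of pages (0.0 to 1.0) routed to content experts.
    expert_rate / general_rate: hypothetical per-page review rates in USD.
    """
    if not 0.0 <= expert_fraction <= 1.0:
        raise ValueError("expert_fraction must be between 0 and 1")
    expert_pages = pages * expert_fraction
    general_pages = pages - expert_pages
    return expert_pages * expert_rate + general_pages * general_rate


if __name__ == "__main__":
    # Hypothetical 1,000-page set with assumed rates of $4 (expert)
    # and $1 (general) per page.
    everything_to_experts = blended_review_cost(1000, 1.0, 4.0, 1.0)
    split_review = blended_review_cost(1000, 0.25, 4.0, 1.0)
    print(f"All pages reviewed by experts: ${everything_to_experts:,.0f}")
    print(f"Only 25% reviewed by experts:  ${split_review:,.0f}")
```

Under these assumed rates, routing only a quarter of the pages to experts brings the review cost from $4,000 down to $1,750. The actual saving depends on how cleanly the content can be partitioned.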

Graphic conversion

Your cost for graphic conversion will vary depending on the type of graphics required by the target format. For example, vector graphics and composite graphics are among the most costly graphic types to convert; simple raster conversions are less expensive.

Content reauthoring

As with graphic conversion, reauthoring some or all of your documentation content could add hundreds of dollars to the per-page price. While content reauthoring produces content that is fully compliant with your target standard, its price (often as much as authoring a whole new manual) means that considerable thought should go into evaluating this option.

Automated conversion tools

Automated conversion tools can be an appealing option to those looking to cut conversion costs. However, these tools can be risky if you’re not careful; it is critical to be aware of the limitations of such tools, as well as their accompanying costs. No conversion can be completely automated, and most automated tools require significant investment in other resources before they can produce quality conversion results.

There are many situations in which executing your conversion in-house with the help of automated conversion tools is a cost-effective solution. But a failure to adequately take into account ancillary expenses—such as the cost of engineering the software to suit your needs, the cost of training personnel to use the equipment, and the cost of performing quality assurance on the converted documents—could lead to a much higher overall price tag than the one found on the software box.
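
The gap between the software's sticker price and the full in-house cost can be made concrete by adding up the ancillary expenses named above. In the Python sketch below, the `InHouseCosts` class and every dollar amount are hypothetical placeholders. Substitute your own estimates.

```python
from dataclasses import dataclass


@dataclass
class InHouseCosts:
    """Hypothetical cost components (USD) of an in-house automated conversion."""
    software: float           # the price "on the software box"
    engineering: float        # adapting the software to your needs
    training: float           # training personnel to use the tools
    quality_assurance: float  # checking the converted documents

    def total(self):
        return (self.software + self.engineering
                + self.training + self.quality_assurance)

    def per_page(self, pages):
        if pages <= 0:
            raise ValueError("pages must be positive")
        return self.total() / pages


if __name__ == "__main__":
    # All figures are assumed for illustration only.
    costs = InHouseCosts(software=20_000, engineering=30_000,
                         training=10_000, quality_assurance=15_000)
    pages = 10_000
    print(f"Software only: ${costs.software / pages:.2f} per page")
    print(f"All-in:        ${costs.per_page(pages):.2f} per page")
```

With these assumed figures, the software alone works out to $2.00 per page, while the all-in cost is $7.50 per page. That is the kind of difference to check before comparing an in-house approach against an outsourced per-page price.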

Other cost factors

It’s important to keep in mind that some costs are unavoidable in any conversion project. These include costs associated with quality assurance (to ensure the fidelity of the converted data; these costs decrease as the accuracy of your conversion increases), infrastructure development (in the form of an XML/SGML authoring and rendering environment), and training (so that your team can implement and sustain an XML/SGML publishing environment).

In conclusion

There is no one-size-fits-all cost for conversion; an oversimplification of the issues that determine conversion cost could just as easily land you “in over your head” as make a perfectly reasonable conversion seem out of reach.
