Content Management on the Other Side of the Fence: Copyright and Licensing Controls
The purpose of this article is to assist you and your organization with managing some of the issues and emotions associated with publishing electronically. If you are accustomed to publishing on paper, you need to make a number of psychological adjustments before you can embrace electronic publishing wholeheartedly.
I will work through some of the issues associated with protecting copyright for publications distributed in electronic formats both by download and on physical media. Hopefully, you will then feel more comfortable with distributing your publications electronically.
As a publisher, your organization may be deriving an income stream from the sale of your publications.
Most publications which are sold today are sold in paper format. An electronic format is cheaper to produce and distribute and could potentially offer greater functionality to the reader. Cheaper means less energy consumption and less stress on the economy and the environment.
However, publishers are extraordinarily hesitant to provide an electronic format for fear of piracy and a dislike of change. There is also an unwillingness to trade something which can be touched for something intangible. The love of a publisher for a stack of books is something which should not be underestimated. Perhaps it has something to do with the glue in the binding.
Distributing a publication on physical media like CD-ROMs can help minimize the withdrawal symptoms. More recently, distribution on USB memory sticks seems to be capturing the imagination of publishers and consumers.
For the electronic format to be successful, it must be more attractive to readers than the paper version. I would have to say that with the electronic formats in widespread deployment today, this is far from true.
For those of you who don’t sell your publications directly, I guess you have already made the transition to an electronic format to save money on distribution. Since you are not selling your electronic format, there is perhaps less incentive to ensure its usability than for someone who is. Nevertheless, I would encourage you to at least incorporate the usability of your outputs into your ROI calculation so that better electronic formats can be delivered.
Before we discuss licensing, we need to consider what, in a world of wikis and blogs, is publishing anyway?
Publishing is associated with the distribution of content. More importantly, it implies that there is a distinction between unpublished and published content. We expect published content to be
- free of typographical and grammatical errors
- free of factual errors
- representative of the opinion of the organization that publishes it
- clearly distinguished from previous editions
If the publication is of a legal nature, these are obvious requirements. When the publication is a reference document used to construct expensive, mission-critical systems such as aircraft and hospital information systems, it is also true. These requirements mandate some separation between the free-for-all editing in the authoring environment and the controlled release cycles of a reference publication.
Component Content Management Systems
In the authoring environment, a component CMS manages content but, in a way, it also manages people.
A CMS manages content for a moderate number of users. For these users, it provides collaborative creation, editing, review, searching, selecting, and publishing. All of these users belong to some organization (business or community). All of these users are subject to rules set up by that organization and so abide by the rules that the CMS (and its administrators) impose on them.
When content is published, it is usually provided to a much wider community of people, mostly outside the organization. Whereas the number of people served by the CMS might be 10-1,000, the number of readers of the publication is much greater, perhaps 100-10,000,000. These readers are not subject to the organization’s management policy. They may even be anonymous. For instance, if the publication is distributed through a third party, like a retail chain, their identities are unknown to the publisher.
The publishing process is like a fence between the organization and its external user community.
There are two ways of distributing the content electronically: web publishing and download.
When the publication is distributed via a web server, a release of a publication is rendered as a large number of HTML and image files. The publisher or distributor has to supply computer resources for web servers servicing millions of users. Every time a user navigates to a new page, the distributor has to manage the download of an HTML page and its associated image files.
Packaged electronic formats
In a packaged electronic format, a release of a publication is packaged as a single file in a format like PDF, eComPress, CHM, and so on. Users download the package and access it on their own computer. After the download, users supply all of the computer resources required to navigate and search the publication.
Integrity and authenticity
Now, the component CMS used to prepare the publication goes to a great deal of trouble to provide consistent releases of the content for publication. The authoring history of each component is recorded, and the release can be reconstructed at a later date just from the released version. When the publication is distributed, its integrity and authenticity need to be maintained and should be verifiable by the user.
This verification requires the publication to have the digital signature of the publisher with a timestamp included in the signature. Achieving this for every file in a web publication would be quite a challenge. One reason is that obtaining the time stamp from a trusted time server takes a few seconds and this multiplied by thousands of files would make for several hours of time stamping. You would also not be very popular with the operator of the time stamp server.
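The arithmetic behind that estimate, and the reason a single packaged file sidesteps it, can be sketched as follows. The figures used here (3 seconds per timestamp, 5,000 files) are illustrative assumptions rather than measurements, and the function names are hypothetical. The digest is only the value a publisher would sign and have timestamped; the signing step itself is left out.

```python
import hashlib


def timestamping_hours(file_count: int, seconds_per_stamp: float) -> float:
    """Hours spent waiting on a timestamp server if each file is stamped separately."""
    return file_count * seconds_per_stamp / 3600


def package_digest(package_bytes: bytes) -> str:
    """SHA-256 digest of one packaged release: the single value to sign and timestamp."""
    return hashlib.sha256(package_bytes).hexdigest()


# Assumed figures: a web release of 5,000 files at 3 seconds per timestamp.
web_hours = timestamping_hours(5000, 3.0)   # about 4.2 hours
package_hours = timestamping_hours(1, 3.0)  # one request, a few seconds

# A packaged release needs only one digest, however many pages it contains.
digest = package_digest(b"contents of the packaged release")
```

With one package per release, the timestamp server sees one request per release instead of one per file.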
The publisher will usually want to control who has access to the publication. With web publishing, this is easy to achieve. However, it is not so easy to control what users do with the publication once they have access to it, something the publisher may also be interested in. The text and graphics in the publication are provided to the user’s browser in HTML and JPEG files. These are unencrypted, and the users are able to copy, print, and share this content as they desire.
If the publisher wishes to charge a fee for accessing the content, doing so is also difficult to achieve in the web environment where content is generally expected to be free.
For this reason, web publishers seeking reimbursement usually resort to advertising. The use of advertising in a publication to provide revenue is often not appropriate. Some examples of where it is out of place are educational, commercial, and reference publications. Instead of a one-time upfront fee, advertising imposes an ongoing distraction cost on the user and wastes space on an electronic desktop.
For these reasons, packaged electronic formats downloaded for access on the user’s computer should be the preferred means for electronic delivery.
The usual term for publications delivered in electronic format is eBook. I am somewhat reluctant to use this term since it has significant negative connotations associated with current and previous implementations. In particular, these implementations lack features which I consider to be essential (see Figure 1 for a good attempt at migration to an electronic format).
In an electronic format, you have to reproduce the good characteristics of books, such as rapid access, and also provide features which go beyond what is available in a book and so motivate a user to prefer an electronic format over a paper one.
For this reason, I prefer to use the term smartbook to describe a publication delivered in electronic format with advanced functionality.
A smartbook should have excellent navigation and search functionality and at least reasonable print capability.
- It is good if you can copy material to the clipboard and then to other desktop applications with formatting preserved.
- It is also useful if you can make notes in the publication.
- It is even better if the publication supports citation and transcription as described below.
What is the first thing you do after copying a quote from a source? If you are going to reference the quote in a formal work, you have to record the source. A smartbook will perform this step for you, appending the citation as a footnote. You can see some of this functionality becoming mainstream in Microsoft OneNote, where extracts from web pages are followed by the URL for the source. Unfortunately, you still have to type in the retrieval date.
A smartbook may also format the footnote to suit the destination. For example, when copying XML from an XML code sample, you can format the footnote as an XML comment so that it can sit after the XML code indicating where it came from, yet still preserving the validity of the XML.
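A minimal sketch of that idea follows. The function names and the wording of the citation are hypothetical, not taken from any particular smartbook product. The one real constraint is that an XML comment may not contain a double hyphen, so the sketch breaks up any that appear in the citation text.

```python
import xml.etree.ElementTree as ET


def citation_as_xml_comment(title: str, section: str, retrieved: str) -> str:
    """Format a source citation as an XML comment."""
    text = f"Source: {title}, {section}. Retrieved {retrieved}."
    # "--" is not allowed inside an XML comment, so separate adjacent hyphens.
    while "--" in text:
        text = text.replace("--", "- -")
    return f"<!-- {text} -->"


def copy_with_citation(xml_snippet: str, title: str, section: str, retrieved: str) -> str:
    """Return the copied XML followed by its citation comment."""
    return xml_snippet + "\n" + citation_as_xml_comment(title, section, retrieved)


# Hypothetical publication details, for illustration only.
pasted = copy_with_citation(
    '<item id="1">example</item>',
    "Example Reference Guide -- 2nd edition",
    "section 4.2",
    "2009-03-01",
)

# The pasted text still parses as XML, because a comment may follow the root element.
root = ET.fromstring(pasted)
```

Because the citation sits in a comment after the copied element, the destination document stays valid while still recording where the code came from.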
In some eBook technologies, annotations are tied to a particular edition of a publication. Unfortunately, this connection means that when the users receive an updated publication, their notes remain with the old version. In a smartbook, they are transcribed to the new edition. If the section an annotation was attached to has been deleted, the annotation is relocated in some sensible manner to ensure that the user’s intellectual property is preserved.
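One possible relocation rule can be sketched as follows, assuming that sections carry identifiers which are stable across editions and that the old edition's section hierarchy is known. The data layout, the identifiers, and the `front-matter` fallback are all hypothetical. An annotation whose section has been deleted moves to the nearest surviving ancestor and keeps a record of where it was originally attached.

```python
FALLBACK_SECTION = "front-matter"  # hypothetical home for notes with no surviving ancestor


def transcribe_annotations(annotations, new_sections, old_parents):
    """Carry annotations from an old edition over to a new one.

    annotations:  dict of old section id -> list of note strings
    new_sections: set of section ids present in the new edition
    old_parents:  dict of old section id -> parent section id (None at the top)
    Returns a dict of new section id -> list of (note, original section id).
    """
    transcribed = {}
    for section_id, notes in annotations.items():
        target = section_id
        # Walk up the old hierarchy until a section that still exists is found.
        while target is not None and target not in new_sections:
            target = old_parents.get(target)
        if target is None:
            target = FALLBACK_SECTION
        for note in notes:
            transcribed.setdefault(target, []).append((note, section_id))
    return transcribed


# Illustrative data: section "ch2.s1" and its child were deleted in the new edition.
old_parents = {"ch2": None, "ch2.s1": "ch2", "ch2.s1.a": "ch2.s1"}
new_sections = {"ch2", "ch2.s2"}
annotations = {"ch2.s1.a": ["check this figure"], "ch2": ["good summary"]}

result = transcribe_annotations(annotations, new_sections, old_parents)
```

Keeping the original section identifier alongside each note lets the reader see that a note has been moved, and from where.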
If the publisher requires it, a smartbook can also contain a facility for enforcing the license agreement, called digital rights management, DRM.
When you implement a DRM system, you need to consider the cost of setting it up, running it, and shutting it down. Running costs accrue from basic issues like dealing with users who have lost their access keys. These are pretty much unavoidable.
However, if you adopt a strategy which enforces the maximum number of installations, you will require a system which interacts with the user after purchase which will be a significant extra burden on you and the user.
You also need to consider what will happen if and when the installation-counting system is shut down. Users who lose access to their downloaded publications because you decide you can no longer afford to run the DRM server will be unpleasantly surprised. Search the web for “Unfortunately, due to a reassessment of priorities in the current economic environment” for an example of this surprise.
The publisher needs to decide how much money and effort to put into the DRM system, especially since the effort may impact the provision of other facilities which directly benefit the user like citation, transcription, and so on.
Of course, a DRM system which ensures that the user automatically adheres to the software license agreement and stops a corporation getting sued for copyright infringement is also beneficial to the user, but in a more roundabout way.
So how strong does a DRM system need to be?
To make a sensible choice, we need to compare the vulnerabilities of a potential system with those of a printed publication and not go overboard. You need to consider two points.
1. Any DRM system can be cracked
The most secure DRM systems use dedicated hardware with software which is not under the user’s control. However, if the content is interesting or valuable enough, smart but naughty people will be motivated to grind and probe their way into the system.
The Australian vernacular has an expression, “dream on”, used to tell someone that they are being unrealistic. It perhaps describes the situation of anyone expecting an uncrackable system.
2. The existing paper version can easily be copied
Every time you worry about duplication of the electronic format, remember that electronic copies can also be made of the paper version with moderate effort.
Guillotine the spine off and run the stack of sheets through a Xerox machine or scanner. Add some OCR, and you’ll get a searchable PDF.
These are some things to ponder before forcing your readers to read the publication on specialized hardware divorced from their usual computing platform or subjecting them to a draconian DRM system.
So I suggest accepting that the typical user is basically honest and not a criminal. You can then apply controls which either enforce the license conditions or approximate them by being a little less restrictive, in the interests of a more cost-effective implementation. I call this approximation fuzzy DRM (see Figure 2).
If you find the concept of fuzzy DRM alarming, think about the security model that you apply to your house. You probably lock the door when you leave but the windows are made of glass and easily broken. Before you cover the windows with steel bars, you make a risk assessment balancing aesthetics, safety, risk of robbery, and costs.
You need to do the same when choosing and configuring a DRM system.
Acceptable License Conditions
When the users have licensed some content, they may expect to access it on all of their electronic devices.
These might include desktop, laptop, and pocket devices including mobile phones. Providing advanced functionality on all of these platforms is something of a challenge for smartbooks.
Fortunately for smartbook developers, the novelty of reading on a very small screen is something that wears off quickly.
Another challenge is managing the number of installations, that is, deciding whether concurrent usage is to be allowed and how it is to be controlled.
If you decide to issue a concurrent user license to an organization, you need to take into account the characteristics of the electronic format. While a single paper copy might be thrown from desk to desk in a single office, that won’t suffice for the whole building.
However, an electronic copy can be passed from one user to the next almost instantaneously. To make a concurrent user license viable, you need to stipulate a minimum usage time. You can also think of a concurrent user license as a library with a minimum loan period. You might make it one day or one week.
Once you have defined what concurrent means in a practical way, the organization can estimate how many concurrent users it requires.
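The library analogy can be sketched in code. This is one possible model under stated assumptions, not a description of any existing licensing server: the class and method names are hypothetical, and times are passed in as plain seconds so that the behavior is easy to follow. A seat stays occupied until its reader has returned it and the minimum loan period has elapsed.

```python
class ConcurrentLicense:
    """A pool of seats where each loan lasts at least a minimum period."""

    def __init__(self, seats: int, min_loan_seconds: float):
        self.seats = seats
        self.min_loan_seconds = min_loan_seconds
        self._loans = {}  # user -> [loan start time, returned flag]

    def _free_expired(self, now: float) -> None:
        # A seat is reusable only once returned AND past the minimum loan period.
        for user in list(self._loans):
            start, returned = self._loans[user]
            if returned and now >= start + self.min_loan_seconds:
                del self._loans[user]

    def checkout(self, user: str, now: float) -> bool:
        """Try to give the user a seat; return True on success."""
        self._free_expired(now)
        if user in self._loans:
            self._loans[user][1] = False  # the user resumes an existing loan
            return True
        if len(self._loans) < self.seats:
            self._loans[user] = [now, False]
            return True
        return False

    def release(self, user: str, now: float) -> None:
        """Mark the user's seat as returned."""
        if user in self._loans:
            self._loans[user][1] = True
        self._free_expired(now)


# One seat with a one-day minimum loan (86,400 seconds).
one_day = 24 * 60 * 60
license_pool = ConcurrentLicense(seats=1, min_loan_seconds=one_day)
first = license_pool.checkout("ann", now=0)          # seat granted
license_pool.release("ann", now=60)                  # returned after a minute
too_soon = license_pool.checkout("bob", now=120)     # refused: loan period not over
next_day = license_pool.checkout("bob", now=one_day) # granted once the day has passed
```

A longer minimum loan period makes each seat behave more like a physical copy, so the organization needs more seats for the same number of readers.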
The operations the user can perform on the publication can be controlled by a smartbook so that only authorized users have full access.
We can divide these operations into broad categories such as display, search, annotate, print, and copy. Of these, the right to copy is the most contentious one.
A publisher in the USA has the right to control its publications nicely enshrined in the United States Constitution, Article I, Section 8, Clause 8: “To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries;”
As a publisher, your rights are somewhat moderated by the doctrine of fair use.
In the United States fair use is described in the United States Copyright Act of 1976, which permits republication of excerpts without permission under certain circumstances.
To ensure customer satisfaction, the publisher has to provide a smartbook which supports fair use by allowing copying to some extent.
However, the smartbook has no way of knowing whether the exported content might be republished in violation of the publisher’s copyright. The publisher simply has to trust the user not to do so.
Generally, the more information the users provide about their identity and the computers the smartbook is used on, the more the publisher feels able to trust the user.
When no usage information is provided, the smartbook may disable the copy function entirely or make exporting of material an onerous task.
Two examples of the information the user might provide are
- a user identifier
- a computer identifier
Provision of this information allows a smartbook to restrict access to specified people and/or computers, in exchange for enabling the copy function.
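One way to express this trade is sketched below. The policy is an assumption made for illustration: the `License` structure, the identifiers, and the rule itself are hypothetical. The copy function is off when the user supplies no identifying information and on only when whatever the license restricts (people, computers, or both) matches what the user has supplied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class License:
    """Hypothetical license record; an empty set means 'not restricted'."""
    allowed_users: frozenset = frozenset()
    allowed_computers: frozenset = frozenset()


def copy_enabled(lic: License, user_id: Optional[str], computer_id: Optional[str]) -> bool:
    """Decide whether the smartbook should enable its copy function."""
    # No identifying information at all: copying stays disabled.
    if user_id is None and computer_id is None:
        return False
    # If the license names specific people, the user must be one of them.
    if lic.allowed_users and user_id not in lic.allowed_users:
        return False
    # If the license names specific computers, this must be one of them.
    if lic.allowed_computers and computer_id not in lic.allowed_computers:
        return False
    return True


# Illustrative identifiers only.
lic = License(allowed_users=frozenset({"user-17"}),
              allowed_computers=frozenset({"laptop-a"}))
anonymous = copy_enabled(lic, None, None)
registered = copy_enabled(lic, "user-17", "laptop-a")
wrong_machine = copy_enabled(lic, "user-17", "desktop-b")
```

A fuzzier variant might allow limited copying on an unlisted computer rather than refusing outright; where to draw that line is the publisher's risk assessment.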
Moving from paper distribution to electronic format is desirable for many reasons, especially environmental ones.
Moving from a legacy electronic format to a smartbook will increase the viability of the electronic format and improve the efficiency of your users.
So, ask yourselves:
- Do I have printed publications for sale, which could be sold in an electronic format?
- Do I have electronic publications given away free which could be sold if they were in a better format?
About the Author
Eurofield Information Solutions
Robert Minard is currently the Software Development Manager at Eurofield Information Solutions (EIS). He is responsible for the technical aspects of EIS’s electronic publishing tools and services.
After receiving a BSc. (Hons) in Physics in 1980, Robert was awarded a PhD in Electrical and Electronic Engineering in 1984 by the University of Canterbury for the development of an ultrasonic imaging algorithm to be used to detect breast cancer. At the University of Sydney, Robert lectured in physics and designed and developed control systems for the Sydney University Stellar Interferometer and an adaptive optics subsystem for the Anglo-Australian Telescope. After joining EIS as a senior software engineer in 1996, he led the design and development of eComPress, a compressed and indexed electronic publishing system for reference publications.