Using SharePoint as a Platform for DITA Solutions
Steffen Frederiksen, Content Technologies
DITA is rapidly being adopted by companies around the globe as the standard for technical documentation. Simultaneously, Microsoft SharePoint 2007 is gaining acceptance as the enterprise collaboration platform of choice. So the obvious question is: Can you successfully combine the two “standards”?
This article tries to help you answer that question. It presents two DITA scenarios: one in which the technical documentation department uses DITA inside its “silo,” and one in which the silo has been broken down. It then shows how SharePoint can help break down silos and bring you closer to the full potential of DITA, with a good return on investment.
This article assumes that you have a basic understanding of what DITA is and how it can be used. If not, please have a look at <http://dita.xml.org> before reading on.
Note: I am not trying to sell you SharePoint, but if your company has chosen SharePoint and you have chosen DITA, I hope this article will show you that combining the two is an option that you should give serious consideration.
Scenario A: A Quite Common DITA Scenario
In my work with companies around the world, I have often been faced with a scenario like this:
Figure 1: A Common DITA Scenario
The technical documentation department of company X has been testing the DITA standard for some time, with good results:
- Content is published and distributed to customers as both PDF and as HTML Help through the DITA Open Toolkit (DITA-OT).
- A number of topics are reused for multiple products in the same product family. This reuse creates cost savings, and translation costs have gone down.
- The people in the department use a first-rate DITA-capable XML authoring tool to create and edit DITA topics.
- So far, no CMS solution has been used. Topics and maps are stored on a network file server.
- When trying to involve other departments (legal, engineering, production, marketing, sales, customer service, and so on) in the DITA project, a wall of resistance against the switch to special XML editors is encountered.
- Input for technical documentation is received in many different formats, including Word, Excel, and PDF (and let us not forget handwritten notes).
- Revision cycles (still an average of 10+ cycles per publication) are handled by emailing draft PDF files to reviewers. Often, feedback is received as handwritten notes.
- The IT department (busy doing a large SharePoint implementation) has not been involved as they regard the whole DITA thing as a limited, specialist application that does not relate directly to any line-of-business (LOB) applications.
To Sum It All Up
Good local results have been achieved, but the technical documentation department is (still) working inside a silo. Collaboration with other departments is largely done through email and PDF files. The technical writers are proud of their results but disappointed with the missing parts in their DITA vision. Management and other departments are impressed but also slightly alienated and mystified by “this DITA XML thing.”
Figure 2: The DITA “Black Box”
Scenario B: A Dramatically Different DITA Scenario
Now imagine something else. This scenario is not something I have met yet—at least not in a single company. However, all the components of the scenario are underway in a number of different companies that we are currently working with.
Figure 3: The Broken-Down Technical
- Anyone inside or even outside the company has direct access to the DITA content, controlled by a powerful and highly granular authentication system.
- Documentation project planning and management is completely integrated in the corporate CMS, based on the DITA map architecture. Topic writing tasks are assigned to writers, and the assignments are integrated with their Outlook task lists.
- All aspects of the DITA solution are completely integrated with corporate CMS workflows.
- The basic input for technical documentation arrives from three sources:
  - Fact sheet topics and technical specification topics are auto-generated from the engineering databases through a connection to the company-wide CMS/collaboration platform.
  - Engineers and Research and Development employees create the basic, simple topics using either browser-based fill-in-the-blank forms or special Word templates (no DITA XML tags are visible although the DITA architecture is used).
  - Component manufacturers and sub-contractors provide DITA input directly into the corporate CMS, some through their own DITA tools, some using browser-based forms provided by the company’s CMS.
- DITA topics can be created and/or edited using any DITA-compliant tool, including Arbortext, XMetaL, FrameMaker (DITA), a browser-based form, or Word, none of which requires any XML or DITA knowledge.
- DITA topics, maps, and images are stored in the corporate CMS, with:
  - DITA metadata directly available for both filtering and search
  - Full versioning and check-in/check-out features available
  - Records management and data vault archival completely integrated
- Any knowledge worker in the organization can easily find and reuse any DITA topic in a Word document, an Excel spreadsheet, an Outlook email, or in a PowerPoint presentation.
- Translators work directly (fully authenticated) in the corporate CMS, and translation cost estimates can be calculated on the fly, using the corporate translation memory.
- The review process now has two parts: detailed topic reviews and map reviews (for structure and completeness only). Topic reviewers can use the tool they prefer to work directly on topics. Tools include browser-based forms, Word, or traditional XML editors for those who prefer them.
- XML-based differencing (a.k.a. “diffing”) is used to give authors a true picture of the changes from one version to the next, without relying on manual change mark-up that may or may not have been turned on. Diffing is a file comparison that outputs the exact differences between two files, or between the current version of a file and a former version of the same file. For text files, the changes are displayed line by line.
- Maps (and topics) are published in many different ways:
  - Dynamically (no pre-rendering) on customer portals, intranet portals, and LOB systems
  - As stand-alone HTML packages for field engineers
  - As PDF, print, and help files (various formats as needed)
  - As Word files, ready for post-processing
  - Directly integrated into the customer’s own DITA implementations (direct feeds of DITA XML source and resource files)
- Apart from the “normal” revision cycles, end-user topic ratings and comments are captured directly and integrated into the CMS.
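To make the auto-generated fact sheet topics in the list above a little more concrete, here is a minimal Python sketch of how a single engineering record could be turned into a DITA reference topic. The record layout, the `build_fact_sheet` function, and the sample pump data are all hypothetical; a real connection to an engineering database would look different, and this is not how any particular CMS does it.

```python
import xml.etree.ElementTree as ET


def build_fact_sheet(record: dict) -> str:
    """Turn one engineering record into a DITA reference topic (XML string).

    The record layout (id, name, specs) is an assumption made for this sketch.
    """
    topic = ET.Element("reference", id=record["id"])
    ET.SubElement(topic, "title").text = record["name"]
    refbody = ET.SubElement(topic, "refbody")
    # A DITA <properties> table holds one <property> row per specification.
    props = ET.SubElement(refbody, "properties")
    for name, value in record["specs"].items():
        prop = ET.SubElement(props, "property")
        ET.SubElement(prop, "proptype").text = name
        ET.SubElement(prop, "propvalue").text = str(value)
    return ET.tostring(topic, encoding="unicode")


# Hypothetical record, standing in for a row from an engineering database.
pump = {
    "id": "pump_x200",
    "name": "X200 Pump",
    "specs": {"Voltage": "230 V", "Weight": "12 kg"},
}
fact_sheet_xml = build_fact_sheet(pump)
print(fact_sheet_xml)
```

The point of the sketch is that nobody has to type XML: the topic is regenerated whenever the engineering data changes.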
To Sum It All Up
In this scenario (Figure 4), the full potential power of using the DITA global content standard is clearly coming through! DITA has broken out of the technical documentation silo, into the entire enterprise space—and even outside the company, with vendors and customers directly integrated.
Figure 4: The Full DITA Potential
After all, technical documentation is not really born within the techdoc department and it is not consumed there either. Many other people are involved, and this scenario clearly takes that into account.
For many participants, the DITA XML basis is now the “invisible hand” that makes it all work—and because it is invisible, it is no longer a “blocking issue.”
The auto-generation of topics from LOB systems, coupled with the direct topic feeds from component manufacturers, further reduces costs. Fewer revision cycles are needed because of the improved (form-editor/Word-produced DITA) source materials received, and because of the much tighter and more direct revision workflows. The end-users of technical documentation are back in the loop, with a managed feedback mechanism in place.
Obviously, the DITA ROI has increased but, even more dramatically, time-to-market has improved. Think about it: If you could implement a similar scenario in your company, give or take a few elements, do you think it would be worth your while?
Microsoft SharePoint: Helping You Move From A To B
What is SharePoint?
First of all, we are dealing with SharePoint 2007 (or more precisely, “Microsoft Office SharePoint Server 2007,” a.k.a. MOSS). This version introduced many important new features, enabling my company to develop a DITA product on top of MOSS. You should quickly forget everything you know or have heard about the previous versions (well, maybe you should even forget most of the things you have heard about SharePoint 2007 too…).
SharePoint is a beast of many faces and many names. It is
- a portal system
- an enterprise collaboration solution
- a web content management system
- a social networking system
- a LOB dashboard/scoreboard engine
- an incredible mess (In most cases, this “tag” should be attached to the way SharePoint is being used by some companies rather than to the product itself.)
But first of all, it is an extremely open, scalable, and flexible enterprise content management solution.
I do not, however, intend to drag you through a long and tedious listing of all SharePoint features. If you need that, have a look at the SharePoint home page <http://office.microsoft.com/en-us/sharepointserver/FX100492001033.aspx>.
What I do want you to understand is that SharePoint provides all the features needed to develop the solution described in scenario B.
Just a few highlights of SharePoint:
- It is scalable from a local 2-person solution to a global 100,000-person solution.
- It has a wealth of out-of-the-box “combine-and-configure” features plus a rich set of development tools for developing your own special plug-ins.
- It has a powerful workflow engine.
- It can be connected to and integrated with just about any other system, on any other platform.
- It is open—even though Microsoft Office (as could be expected) offers built-in integration, you can actually use any tool you want with SharePoint.
DITA Exchange™: An Example of a SharePoint-Based DITA Product
In this section, I offer some examples from a sample DITA product developed on SharePoint: DITA Exchange.
When we talk about DITA Exchange to other people, we actually refer to it as a “complete SharePoint-based DITA content collaboration platform.”
Note the words “complete” and “platform.” On the one hand, the DITA Exchange product is a complete DITA solution that has tools and features covering the entire process, from planning and managing a documentation project, through creating and editing topics and maps, to publishing the finished information products in many different formats. On the other hand, it is also a platform that allows all kinds of custom enhancements and extensions, including using multiple DITA editors to create and maintain your topics.
Planning and Managing DITA Documentation Projects
Let us start by looking at planning and managing DITA documentation projects:
Figure 5: Planning and Managing DITA Projects in SharePoint
As illustrated (Figure 5), DITA Exchange lets you plan and manage a DITA documentation project through SharePoint. This includes setting up individual (topic) actions and letting the project participants report progress, as well as synchronizing the tasks with Outlook. Workflows can be attached to any task/topic as well.
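Because the planning is based on the DITA map, a task list can in principle be derived directly from the map's topic references. The following Python sketch illustrates the idea only; the sample map, the `tasks_from_map` function, and the task fields are hypothetical and do not reflect how DITA Exchange or SharePoint store tasks.

```python
import xml.etree.ElementTree as ET

# A small, made-up DITA map used only for this illustration.
MAP_XML = """<map>
  <title>X200 Pump User Guide</title>
  <topicref href="overview.dita">
    <topicref href="installing.dita"/>
    <topicref href="maintenance.dita"/>
  </topicref>
</map>"""


def tasks_from_map(map_xml: str, default_writer: str = "unassigned") -> list[dict]:
    """Create one writing task per topic reference, in document order."""
    root = ET.fromstring(map_xml)
    return [
        {"topic": ref.get("href"), "assigned_to": default_writer, "status": "Not started"}
        for ref in root.iter("topicref")
        if ref.get("href")  # skip grouping topicrefs that have no target
    ]


tasks = tasks_from_map(MAP_XML)
for task in tasks:
    print(task)
```

Each task could then be assigned to a writer and tracked until the topic is done.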
We are currently working with CIDM to make the planning directly integrated with the browser-based DITA map editor (Figure 6), to make the process even smoother.
Figure 6: Detailed Task List
Writing and Editing Topics
As already noted, each user can actually use his or her favorite DITA editor for creating and editing topics. However, DITA Exchange provides two built-in editors:
- A browser-based forms editor (Figure 7), based on SharePoint’s Forms Services. Only a browser is needed, and no prior knowledge of XML or even DITA is required to create a basic, sensible topic. This feature is particularly relevant for engineers or others creating an initial draft topic, as well as for reviews by non-DITA, non-XML people.
- Word 2007: Word users can choose to create and/or edit DITA topics using Word (Figure 8). The Word implementation includes a basic view (no XML visible) or an advanced XML editor view in Word.
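The idea behind a forms editor can be sketched in a few lines of Python: the contributor fills in plain fields, and the DITA mark-up is produced behind the scenes. This is only an illustration under my own assumptions; the field names and the `form_to_concept` function are hypothetical, and the actual editor is built on SharePoint's Forms Services, not on code like this.

```python
import xml.etree.ElementTree as ET


def form_to_concept(form: dict) -> str:
    """Convert fill-in-the-blank form fields into a DITA concept topic.

    The field names (id, title, summary, body) are assumptions for this sketch.
    """
    missing = [name for name in ("id", "title", "body") if not form.get(name, "").strip()]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")

    concept = ET.Element("concept", id=form["id"])
    ET.SubElement(concept, "title").text = form["title"].strip()
    if form.get("summary"):
        ET.SubElement(concept, "shortdesc").text = form["summary"].strip()
    conbody = ET.SubElement(concept, "conbody")
    # Blank lines in the body field separate paragraphs.
    for paragraph in form["body"].split("\n\n"):
        if paragraph.strip():
            ET.SubElement(conbody, "p").text = paragraph.strip()
    return ET.tostring(concept, encoding="unicode")


draft = form_to_concept({
    "id": "filter_cleaning",
    "title": "Why the filter needs cleaning",
    "body": "Dust reduces the air flow.\n\nA clogged filter shortens pump life.",
})
print(draft)
```

The contributor never sees a tag, yet the result is a structured topic that a technical writer can refine in a full XML editor.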
Figure 7: Browser-based DITA Form Editor
Figure 8: Creating and Editing DITA with Word
All DITA topics are managed in one SharePoint list (Figure 9), with plenty of room for hundreds of thousands of topics. No “physical” folder structure is used because this always complicates issues when a topic is reused in multiple places. However, based on the DITA metadata, users can dynamically build their own personal “virtual” folder structure on the spot. Note that DITA metadata is integrated in the list, which enables users to “filter” the list quickly on one or more metadata fields.
Figure 9: Topics List in SharePoint
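The difference between a flat list with metadata and a fixed folder structure can be illustrated with a small Python sketch. The topic records, the metadata fields (`product`, `audience`), and both functions are hypothetical; SharePoint provides this kind of filtering and grouping through its list views, not through code like this.

```python
from collections import defaultdict

# A flat list of topics, each carrying metadata (made-up sample data).
topics = [
    {"id": "t1", "title": "Installing the pump", "product": "X200", "audience": "installer"},
    {"id": "t2", "title": "Cleaning the filter", "product": "X200", "audience": "user"},
    {"id": "t3", "title": "Installing the valve", "product": "V10", "audience": "installer"},
]


def filter_topics(items, **criteria):
    """Keep the topics whose metadata matches every given field/value pair."""
    return [t for t in items if all(t.get(k) == v for k, v in criteria.items())]


def virtual_folders(items, *fields):
    """Group topic ids into folder-like paths built from the chosen metadata fields."""
    folders = defaultdict(list)
    for topic in items:
        path = "/".join(topic.get(field, "unknown") for field in fields)
        folders[path].append(topic["id"])
    return dict(folders)


print(filter_topics(topics, product="X200", audience="installer"))
print(virtual_folders(topics, "product", "audience"))
print(virtual_folders(topics, "audience"))  # same topics, a different "folder" view
```

Since the folders are computed rather than stored, one user can browse by product and another by audience without any topic being moved or copied.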
If you select a particular topic, the drop-down action menu in Figure 10 appears.
It is worth noting that DITA Exchange automatically assigns a GUID (Globally Unique Identifier) to each topic. This ID makes it easy to exchange DITA topics with third parties because it is virtually impossible to get duplicate file names.
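As a rough illustration of GUID-based naming, the Python sketch below generates a random GUID and derives a file name and a topic ID from it. The function names and the `GUID-` prefix are my own assumptions for the example, not the naming scheme DITA Exchange actually uses.

```python
import uuid


def new_topic_guid() -> str:
    """Return a random (version 4) GUID as a string."""
    return str(uuid.uuid4())


def topic_file_name(guid: str) -> str:
    """Derive a file name from the GUID (the .dita extension is an assumption)."""
    return f"{guid}.dita"


def topic_id(guid: str) -> str:
    """Derive a value usable as an XML ID.

    XML IDs may not start with a digit, so a letter prefix is added.
    The "GUID-" prefix is a hypothetical choice for this sketch.
    """
    return f"GUID-{guid}"


guid = new_topic_guid()
print(topic_file_name(guid))
print(topic_id(guid))
```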
Note that you can view version differences (the only totally reliable way of seeing what has been changed) as well as start workflows for a topic.
Figure 10: Topics List, Drop-Down Menu
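To show what a version comparison on XML content can look like, here is a minimal Python sketch that compares the text of two versions of a topic, element by element. It is a simplification under stated assumptions: the sample topics and function names are made up, text that follows an inline element (mixed content) is ignored, and real XML differencing tools are considerably more thorough.

```python
import difflib
import xml.etree.ElementTree as ET

# Two made-up versions of the same topic.
OLD = """<concept id="c1"><title>Cleaning the filter</title>
<conbody><p>Clean the filter monthly.</p></conbody></concept>"""

NEW = """<concept id="c1"><title>Cleaning the filter</title>
<conbody><p>Clean the filter weekly.</p><p>Wear gloves.</p></conbody></concept>"""


def text_lines(topic_xml: str) -> list[str]:
    """List each element's leading text as '<tag> text', in document order."""
    lines = []
    for elem in ET.fromstring(topic_xml).iter():
        text = (elem.text or "").strip()
        if text:
            lines.append(f"<{elem.tag}> {text}")
    return lines


def diff_versions(old_xml: str, new_xml: str) -> list[str]:
    """Return only the removed ('- ') and added ('+ ') lines between two versions."""
    delta = difflib.ndiff(text_lines(old_xml), text_lines(new_xml))
    return [line for line in delta if line.startswith(("- ", "+ "))]


for change in diff_versions(OLD, NEW):
    print(change)
```

Because the comparison is computed from the two stored versions, it does not depend on anyone having remembered to switch on change tracking.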
Composing and Editing a DITA Map
Inside SharePoint, using only a browser, you can create and edit a DITA map file, with drag-and-drop editing (Figure 11).
Figure 11: Browser-based Editing of a DITA Map
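Under the hood, a drag-and-drop operation in a map editor amounts to moving a topic reference from one place in the map hierarchy to another. The Python sketch below shows that idea on a small in-memory map. The function names and file names are hypothetical, and the sketch does not guard against moving a topic underneath itself; it assumes a map with a `title` element, as allowed from DITA 1.2 onward.

```python
import xml.etree.ElementTree as ET


def build_map(title, outline):
    """Build a DITA map from a nested outline of (href, children) pairs."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = title

    def add(parent, entries):
        for href, children in entries:
            ref = ET.SubElement(parent, "topicref", href=href)
            add(ref, children)

    add(root, outline)
    return root


def move_topicref(root, href, new_parent_href, position=0):
    """Move the topicref with the given href under another topicref."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = next(r for r in root.iter("topicref") if r.get("href") == href)
    target = next(r for r in root.iter("topicref") if r.get("href") == new_parent_href)
    parent_of[node].remove(node)
    target.insert(position, node)


guide = build_map("X200 Pump User Guide", [
    ("overview.dita", [("installing.dita", [])]),
    ("maintenance.dita", []),
])
# "Drag" the maintenance topic and "drop" it below the installing topic.
move_topicref(guide, "maintenance.dita", "overview.dita", position=1)
print(ET.tostring(guide, encoding="unicode"))
```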
Finally, there is publishing (Figure 12). SharePoint makes it possible to create multiple “document converters” (publishing engines) for any content type, for example, DITA maps. In DITA Exchange, we currently have implemented four publishing engines in a pluggable architecture:
- DITA Open Toolkit: Use the Open Toolkit to publish to PDF, HTML, Eclipse, and so on.
- Word 2007: Publish DITA maps through the Open XML format to any kind of Word-supported document format.
- Custom: Publish as a stand-alone HTML site.
- Translation: Generate a complete translation package.
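The general shape of a pluggable publishing architecture can be sketched in Python as a registry in which each engine is looked up by name. Everything here is an illustration under my own assumptions: the registry, the engine names, and the output paths are hypothetical, and the two sample engines only build (they do not run) a command line for the `dita` command found in recent DITA Open Toolkit releases. SharePoint's own document converter mechanism is not shown.

```python
from typing import Callable

# Registry of publishing engines: name -> function that plans the job.
PUBLISHERS: dict[str, Callable[[str], list[str]]] = {}


def publisher(name: str):
    """Decorator that plugs a publishing engine into the registry."""
    def register(func):
        PUBLISHERS[name] = func
        return func
    return register


@publisher("pdf")
def dita_ot_pdf(map_path: str) -> list[str]:
    # Command line for a recent DITA-OT installation; built, not executed.
    return ["dita", f"--input={map_path}", "--format=pdf", "--output=out/pdf"]


@publisher("html-site")
def standalone_html(map_path: str) -> list[str]:
    return ["dita", f"--input={map_path}", "--format=html5", "--output=out/html"]


def publish(map_path: str, target: str) -> list[str]:
    """Look up the engine for the target and let it plan the job."""
    try:
        engine = PUBLISHERS[target]
    except KeyError:
        raise ValueError(f"No publishing engine registered for {target!r}") from None
    return engine(map_path)


print(publish("guide.ditamap", "pdf"))
```

Adding a new output format then means registering one more function; the code that calls `publish` does not change.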
Figure 12: Publishing Through a Selection of
Apart from the traditional DITA publishing process, DITA Exchange includes a Microsoft Office-based tool, the DITA Exchange TopicPicker™, that allows any user, anywhere in the organization, to search, filter, insert, and reuse DITA topics in any ad-hoc Word document, PowerPoint presentation, or Outlook email (Figure 13).
Figure 13: Reusing DITA in ad-hoc Documents
This truly enables DITA to serve anybody in the organization!
In short: I hope to have demonstrated to you that SharePoint 2007 has all the features needed to be used as the platform for an extremely powerful DITA solution, like the one described in the “B” scenario.
Leonardo da Vinci once stated that “simplicity is the ultimate sophistication.” For DITA, this is now more true than ever. Only an open DITA solution that can present both sophistication for the advanced user and a simple, easy-to-use interface for non-DITA, non-XML contributors will enable you to realize the full DITA potential.
SharePoint has both the “depth” of features and the friendly face to make this possible.
If you have chosen DITA, and if your organization has chosen SharePoint, you are in luck: You have a great starting point for something useful and important.
Based in Denmark, Steffen Frederiksen (M.Sc. Economics) is a co-founder and director of Content Technologies. Currently, he is heading the development of the next version of the SharePoint-based DITA Exchange product. He has worked internationally with XML-based content management solutions, software tools, and topic-based writing methodologies for more than 10 years (yes, topic-based XML content before DITA). He is a frequent speaker at XML conferences around the world.