Sharing Content Across Departments using DITA
Many teams produce content similar to, or even the same as, content produced by teams in other departments within the same company. The promise of the Darwin Information Typing Architecture (DITA) as outlined in the DITA Maturity Model is that you’ll be able to reuse XML content to avoid redundant work and improve efficiency. However, to realize this goal, you must address four key considerations covered in this article.
As with any initiative, success depends on knowing what you want to achieve. It is not enough to identify goals; they must also be clear and quantifiable, realistically achievable, and agreed to by all participants.
Goals: Clear and Quantifiable
How many projects have you seen fail because of vague goals, such as “We want to save money” or “We need to be more efficient”? Although these goals state intentions, they do not define success.
To define success, the goals must be clear and measurable. To clarify the goal to save money, you must specify an amount of money or a percentage change in costs and indicate the time period in which the cost will be reduced. Figure 1 shows an example of how to turn a general intention into a measurable goal.
Figure 1: Clear, Measurable Goal
This goal is measurable because it clearly identifies what is being measured (cost of producing online content), the teams involved (technical support and technical publication departments), the anticipated change (10 percent cost reduction), and when the change must occur (Q2 2008).
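One way to keep such a goal checkable is to record it as data and test it against real numbers. The following is a minimal sketch: the class name, field names, and cost figures are invented for illustration, and only the 10 percent target, the teams, and the Q2 2008 deadline come from the example above.

```python
from dataclasses import dataclass


@dataclass
class CostGoal:
    """A measurable cost-reduction goal (all field names are illustrative)."""
    metric: str
    teams: tuple
    target_reduction_pct: float
    deadline: str

    def is_met(self, baseline_cost: float, actual_cost: float) -> bool:
        """Return True when the measured reduction reaches the target."""
        reduction_pct = (baseline_cost - actual_cost) / baseline_cost * 100
        return reduction_pct >= self.target_reduction_pct


goal = CostGoal(
    metric="cost of producing online content",
    teams=("technical support", "technical publications"),
    target_reduction_pct=10.0,
    deadline="Q2 2008",
)

# Hypothetical figures: dropping from 200,000 to 176,000 is a 12 percent
# reduction, which satisfies the 10 percent target.
print(goal.is_met(baseline_cost=200_000, actual_cost=176_000))
# Dropping only to 190,000 is a 5 percent reduction, which does not.
print(goal.is_met(baseline_cost=200_000, actual_cost=190_000))
```

A vague goal such as “We want to save money” cannot be expressed this way, because it supplies neither a target nor a deadline.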
After you clearly define the goal, verify that it is realistic. A clear but unachievable goal does not position you for success; instead, it sets you up for failure. Goals can be unrealistic in various ways. For example, a goal to do 10 times as much work with half the resources in the next quarter is certainly measurable. However, it is unrealistic because it sets the criteria for success too high and does not provide enough time to realize significant change. Figure 2 shows an example of making an unachievable goal realistic. This goal is realistic because it specifies a reasonable amount of measurable change.
Figure 2: Realistic Goal
People reuse content for a variety of reasons. For example, some teams want to produce more deliverables in multiple formats; others want to consolidate redundant information and reduce the number of deliverables. Do not assume that all the stakeholders on the project share the same goals. Take the time to state the goals and validate that all project participants want to reuse content for the same reasons. If the goal for your team is to reduce the number of online deliverables by 25 percent in Q3, make sure that the other team does not have a goal of increasing the number of deliverables for the same time period.
A process is a series of actions, changes, or functions that bring about a result. In the context of content creation, consistent process implementation helps authors create consistent source content.
Process: Documented and Followed
If your content authoring process is not documented, the authors are not following it; they do whatever seems to work best for them to deliver the end result. A documented process, such as one written for ISO 9000 compliance, does not count either if the content authors do not follow it. Without a clearly documented and easy-to-use process, authors will create their own ad-hoc processes.
Content reuse is dependent on consistent content creation. Consider these key process milestones when trying to share content:
- When is structure ready to share?
- When is content ready to share?
- How will you communicate about milestones?
- How will you share the structure or content?
Stating key milestones does not mean that all content authors must follow exactly the same process. Rather, you must identify the milestones where the different processes intersect and define the criteria required for moving to the next phase in the development lifecycle.
One of the key dependencies that content authors must understand when creating DITA content is that they must write the content with reuse in mind. They cannot include contextual references, such as “in the previous chapter,” because the topic may have different neighbors in different deliverables.
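A simple automated check can catch many of these contextual references before a topic is shared. The sketch below assumes a short, hypothetical list of phrases; a real style guide would define its own list, and the function name is illustrative.

```python
import re

# Illustrative phrase patterns only; extend these to match your style guide.
CONTEXTUAL_PATTERNS = [
    r"\bin the (previous|next|following|preceding) (chapter|section|topic)\b",
    r"\bas (described|mentioned|shown) (above|below|earlier)\b",
    r"\bearlier in this (chapter|book|guide)\b",
]


def find_contextual_references(text: str) -> list:
    """Return each phrase in the text that assumes a fixed reading order."""
    hits = []
    for pattern in CONTEXTUAL_PATTERNS:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append(match.group(0))
    return hits


sample = (
    "As described above, configure the server. "
    "In the previous chapter, you installed it."
)
# Results are grouped by pattern, not by position in the text.
print(find_contextual_references(sample))
print(find_contextual_references("Click OK to save the file."))
```

A check like this only flags candidates; an editor still decides whether each phrase really depends on neighboring topics.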
In addition, authors must use the appropriate elements in the content to produce properly formatted output. Because XML separates content from format, a corporate style such as bold text for window titles depends on the author’s use of the <wintitle> element wherever a window title is mentioned. Without proper use of XML tags, the proper formatting is not applied when you generate the deliverables.
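The dependency between tagging and formatting can be illustrated with a few lines of Python. This is a minimal sketch, not how a DITA publishing pipeline is implemented: the style map and function name are hypothetical, and the renaming step merely stands in for the stylesheet that a real transform would apply.

```python
import xml.etree.ElementTree as ET

# Hypothetical corporate style: window titles render in bold.
STYLE_MAP = {"wintitle": "b"}


def apply_style(dita_xml: str) -> str:
    """Rename semantic elements to the presentational tags in STYLE_MAP."""
    root = ET.fromstring(dita_xml)
    for element in root.iter():
        if element.tag in STYLE_MAP:
            element.tag = STYLE_MAP[element.tag]
    return ET.tostring(root, encoding="unicode")


# The author tagged the window title, so the style is applied.
tagged = "<p>Open the <wintitle>Preferences</wintitle> window.</p>"
print(apply_style(tagged))

# The author typed the window title as plain text, so nothing marks it
# for formatting and the output is unchanged.
untagged = "<p>Open the Preferences window.</p>"
print(apply_style(untagged))
```

The second call shows the failure mode: the output is valid, but the window title silently loses its corporate formatting.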
Lastly, content must be structured for reuse, which means that topics must be written at the appropriate level and collections of topics that are candidates for reuse must be organized into maps. If you have a topic that contains ten lengthy sections and you are storing topics in a content management system (CMS) that restricts content referencing to entire files, you cannot easily reuse content from that large topic.
Process: Stakeholders and Participants
Stakeholders are people who have an investment in making a process work. In contrast, participants are the people who have a direct investment in the process and actually do the work. In the scenario of reusing content, stakeholders include the manager calculating the percentage of savings her team achieves by reusing content and the information architect who specifies the reuse strategy. The authors who create and share the content, as well as the deliverable specialist who creates the deliverable maps, are participants.
To successfully share content, it is critical that stakeholders and participants understand their roles in content creation and reuse processes. For example, if the process states that authors must reuse product names from a master directory of product names, then someone must create and maintain that directory as well as communicate to authors best practices for name reuse.
The technical criteria for reusing content are simple. The content must be valid XML, it must meet the agreed-upon quality standards, and authors must be able to find it.
Criteria: Valid XML
DITA is a topic-oriented architecture, and each topic type has a specific XML structure. For example, a task topic does not allow you to have a step without including a command. To easily guarantee that the XML is valid for each topic type and map, use an XML editor that does not allow authors to create invalid content.
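To make the step rule concrete, the following sketch checks a task topic for steps that lack a command. It is a simplified structural check written for illustration, not a substitute for validating against the DITA DTDs or schemas; Python's standard ElementTree parser checks only well-formedness, and the function name and sample task are hypothetical.

```python
import xml.etree.ElementTree as ET


def steps_missing_cmd(task_xml: str) -> list:
    """Return the 1-based positions of <step> elements with no <cmd> child."""
    root = ET.fromstring(task_xml)
    missing = []
    for position, step in enumerate(root.iter("step"), start=1):
        if step.find("cmd") is None:
            missing.append(position)
    return missing


TASK = """<task id="install">
  <title>Installing the product</title>
  <taskbody>
    <steps>
      <step><cmd>Run the installer.</cmd></step>
      <step><info>Restart when prompted.</info></step>
    </steps>
  </taskbody>
</task>"""

# The second step has supporting information but no command.
print(steps_missing_cmd(TASK))
```

A validating XML editor prevents this kind of error at authoring time, which is less costly than finding it when content is shared.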
In an ideal situation, all the parties who are to share the content agree upon a consistent architecture. If all the parties are using DITA, then you know that the structure for each topic type is consistent. If you cannot all use the same architecture, agree upon a limited list of architectures. Then define a mapping from the elements of each differing structure to their corresponding DITA elements, and apply a transform to create valid DITA topics and maps.
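An element mapping of this kind can be sketched as a lookup table plus a small transform. The source vocabulary below (article, heading, body, para) is invented for illustration, and a production transform would more likely be written in XSLT and would also handle attributes, nesting rules, and validation of the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a non-DITA vocabulary to DITA concept elements.
ELEMENT_MAP = {
    "article": "concept",
    "heading": "title",
    "body": "conbody",
    "para": "p",
}


def to_dita(source_xml: str, topic_id: str) -> str:
    """Rename mapped elements and fail loudly on anything unmapped."""
    root = ET.fromstring(source_xml)
    for element in root.iter():
        if element.tag not in ELEMENT_MAP:
            raise ValueError(f"No DITA mapping for <{element.tag}>")
        element.tag = ELEMENT_MAP[element.tag]
    root.set("id", topic_id)  # DITA topics require an id attribute
    return ET.tostring(root, encoding="unicode")


SOURCE = (
    "<article><heading>About widgets</heading>"
    "<body><para>Widgets are small.</para></body></article>"
)
print(to_dita(SOURCE, "about_widgets"))
```

Raising an error for unmapped elements is a deliberate choice in this sketch: it surfaces gaps in the agreed mapping instead of quietly producing invalid DITA.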
Criteria: Quality Standards
To create content at a consistent level of quality, all the parties must agree on quality standards. In particular, it is important that all project participants measure quality the same way. For example, if one team measures quality by checking for spelling and another team requires that the content be edited for conformance to minimalism standards, then they cannot easily share content. As with processes, you do not need to have exactly the same standards, but you must agree on what is acceptable and what is not, as well as how you can measure it.
The primary rule for storing content is that authors cannot reuse content that they cannot find. If all the content is in the same repository, such as a source control system, but the system does not have appropriate search functionality to aid in content retrieval, then the authors cannot easily share content.
If the content is stored in multiple repositories, but you have mechanisms in place such as federated keyword or metadata search, then authors are able to easily find the content to reuse. When you share content from multiple repositories, ensure that authors have appropriate access permissions for each one.
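The interaction between federated search and access permissions can be sketched as follows. Everything here is hypothetical: the repository names, users, and keyword metadata are invented, and a real system would query each repository through its own search interface.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """A content store; names, readers, and topics are hypothetical."""
    name: str
    readers: set
    topics: dict = field(default_factory=dict)  # topic path -> keyword set


def federated_search(repositories, keyword, user):
    """Search every repository the user can read; report the ones skipped."""
    found, no_access = [], []
    for repo in repositories:
        if user not in repo.readers:
            no_access.append(repo.name)
            continue
        for path, keywords in repo.topics.items():
            if keyword in keywords:
                found.append((repo.name, path))
    return found, no_access


techpubs = Repository(
    "techpubs",
    {"ana", "raj"},
    {"install.dita": {"installation", "setup"}, "backup.dita": {"backup"}},
)
support = Repository(
    "support",
    {"raj"},
    {"kb_install_errors.dita": {"installation", "troubleshooting"}},
)

# raj can read both repositories; ana cannot read the support repository.
print(federated_search([techpubs, support], "installation", "raj"))
print(federated_search([techpubs, support], "installation", "ana"))
```

Reporting the skipped repositories makes a permissions gap visible; otherwise an author could conclude that reusable content does not exist when it is merely out of reach.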
Information architecture for reuse means identifying the level at which to reuse content, determining when to reuse content exactly as written and when to process it conditionally, and identifying content that should not be shared at all.
Architecture: Reuse Level
DITA enables reuse at every level: map, topic, and element. Consequently, your reuse strategy must address best practices for successfully reusing content at each hierarchical level.
To reuse content at the map level, you must understand and communicate the role that maps play in organizing content. To maximize the potential for reuse, create small maps containing sub-collections of topics that logically travel together. You can then easily include these maps in multiple deliverables and apply conditional processing if you need to produce variations of the output.
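The following sketch shows one small map reused by two deliverable maps, along with a helper that expands the references into a flat reading order. The file names and topics are hypothetical, the maps are held in memory instead of on disk, and the helper omits safeguards such as cycle detection that a real processor would need.

```python
import xml.etree.ElementTree as ET

# In-memory stand-ins for map files; all names are hypothetical.
MAP_FILES = {
    "install.ditamap": """<map>
  <topicref href="prereqs.dita"/>
  <topicref href="run_installer.dita"/>
</map>""",
    "user_guide.ditamap": """<map>
  <topicref href="overview.dita"/>
  <topicref href="install.ditamap" format="ditamap"/>
</map>""",
    "admin_guide.ditamap": """<map>
  <topicref href="install.ditamap" format="ditamap"/>
  <topicref href="backup.dita"/>
</map>""",
}


def resolve_topics(map_name: str) -> list:
    """Return topic hrefs in reading order, expanding referenced submaps."""
    topics = []
    root = ET.fromstring(MAP_FILES[map_name])
    for ref in root.iter("topicref"):
        href = ref.get("href")
        if ref.get("format") == "ditamap":
            topics.extend(resolve_topics(href))  # reuse the whole submap
        else:
            topics.append(href)
    return topics


print(resolve_topics("user_guide.ditamap"))
print(resolve_topics("admin_guide.ditamap"))
```

Both deliverables pick up the installation topics from the same small map, so a change to that map is made once.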
To reuse content at the topic level, create a plan by topic type. For example, you may determine that it is appropriate to reuse concept topics that cover common technologies, but that all the task topics are too specific for easy reuse.
To reuse content at the element level, identify which elements will be available for maximum reuse. For example, if you want to reuse product names from a master directory, then you must agree on the element to use for product names, such as the <ph> element. DITA supports element reuse with the conref attribute, which allows you to specify an element’s content by reference to the same type of element somewhere else. If you are using a CMS, be aware that some repositories do not support referencing elements from within a topic. In that case, you must store each reusable element as a separate file.
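As an illustration of element-level reuse, the sketch below resolves conref attributes against a master directory of product names. The conref value uses the file#topicid/elementid form; the file name, ids, and product name are hypothetical, and the resolver is deliberately simplified compared with the rules a DITA processor applies (for example, it ignores attribute merging and same-file references).

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical master directory of product names, held in memory.
FILES = {
    "product_names.dita": """<topic id="names">
  <title>Product names</title>
  <body>
    <p><ph id="flagship">Acme Content Server</ph></p>
  </body>
</topic>""",
}


def resolve_conrefs(topic_xml: str) -> str:
    """Replace the content of each conref element with its target's content."""
    root = ET.fromstring(topic_xml)
    for element in list(root.iter()):  # snapshot: the tree changes below
        conref = element.get("conref")
        if conref is None:
            continue
        file_name, _, fragment = conref.partition("#")
        topic_id, _, element_id = fragment.partition("/")
        source = ET.fromstring(FILES[file_name])
        target = None
        if source.get("id") == topic_id:
            # Only an element of the same type is an acceptable target.
            for candidate in source.iter(element.tag):
                if candidate.get("id") == element_id:
                    target = candidate
                    break
        if target is None:
            raise LookupError(f"Cannot resolve conref {conref!r}")
        element.text = target.text
        element.extend([copy.deepcopy(child) for child in target])
        del element.attrib["conref"]
    return ET.tostring(root, encoding="unicode")


TOPIC = '<p>Install <ph conref="product_names.dita#names/flagship"/> first.</p>'
print(resolve_conrefs(TOPIC))
```

If the product is renamed, only the entry in the master directory changes; every topic that references it picks up the new name the next time output is generated.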
Architecture: Conditional Processing
After identifying information that can be reused without change, identify information that can be used in part or with minor changes. To use a subset of the content, you can identify elements using conditional attributes during authoring and set the appropriate conditions when you generate output. You can apply conditions at the element, topic, and map level.
For example, in a project, you may have a map that includes a submap that is specific to Product A, a topic within another map that applies only to Product B, and a paragraph in a topic that applies only to Product C. If you apply the product attribute values appropriately, when you generate the output for Product A, the generated deliverable contains only the content applicable to Product A. DITA provides conditional processing with default attributes such as audience, platform, and product, and DITA 1.1 supports specializing the “props” attribute to add custom conditional attributes.
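The effect of filtering on the product attribute can be sketched in a few lines. This is a simplified model, not a DITA processor: real toolchains typically drive filtering from a DITAVAL file with explicit include and exclude rules, whereas this hypothetical helper simply drops any element whose product attribute does not list the requested product. The product names are invented.

```python
import xml.etree.ElementTree as ET


def _prune(parent, product):
    """Recursively remove children that are tagged for other products."""
    for child in list(parent):  # copy, because children are removed below
        values = child.get("product")
        # The product attribute holds a space-delimited list of values.
        if values is not None and product not in values.split():
            # Removing an element also drops any text that follows it inside
            # the parent, so this sketch suits element-only content.
            parent.remove(child)
        else:
            _prune(child, product)


def filter_by_product(xml_text: str, product: str) -> str:
    """Return the XML with content for other products filtered out."""
    root = ET.fromstring(xml_text)
    _prune(root, product)
    return ET.tostring(root, encoding="unicode")


TOPIC = (
    '<topic id="setup"><title>Setup</title><body>'
    "<p>Common step.</p>"
    '<p product="ProductA">Only for A.</p>'
    '<p product="ProductB ProductC">For B and C.</p>'
    "</body></topic>"
)

print(filter_by_product(TOPIC, "ProductA"))
print(filter_by_product(TOPIC, "ProductC"))
```

Untagged content passes through for every product, which is why only the exceptions need conditional attributes.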
Architecture: Un-sharable Content
The old adage “just because you can, doesn’t mean you should” applies to reusing content. Although you may have access to all available content, not all content has the potential for successful reuse. For example, content created to provide flow for linear deliverables, such as books, may include context-specific information not applicable outside of the deliverable.
When all the parties who want to share content can agree upon the same goals, consistent processes, reuse criteria, and a scalable, reliable architecture, then you can share content. The good news is that DITA can provide the initial architecture for content reuse and scale to become the common semantic currency for content interchange across your entire enterprise. As the DITA Maturity Model explains, you can leverage DITA at all levels of your company to reuse content, starting with topics.
Initially, when you share content across departments, you start with a subset of your entire content set. However, as the amount of content authored in DITA grows and more teams specialize DITA to meet their authoring needs, the amount of content that you share and the manner in which you share it changes.
Amber Swope is a Principal Consultant at JustSystems, where she applies her information architecture and DITA experience to help clients address their content-related business challenges. Amber is an experienced information architect with almost 20 years in the information development field. At IBM, she led the first HTML to DITA migration project for the Rational division and implemented DITA in a production environment. Amber is a member of the OASIS DITA Technical Committee and participates on the Learning and Training Specialization subcommittee. Amber has authored numerous papers and articles on information design, development, and architecture and has presented at leading industry conferences.
Amber Swope and Michael Priestley; DITA Maturity Model Whitepaper; JustSystems, Inc. and IBM Corporation, 2008.