Round Table Highlights: Taxonomy Development

Sabine Ocker, Comtech Services

We share 2.5 quintillion bytes of data on the internet daily. According to the IBM Marketing Cloud Study, 90% of the existing content on the internet has been created since 2016. We have heard the sheer magnitude of new information available to us every day referred to as the “content tsunami.” Emerging paradigms such as the Internet of Things, and interactive content access such as chatbots, will only accelerate the volume and velocity of this giant wave of new information. Given this reality, it becomes steadily harder for users to find and filter information quickly and effectively, no matter the context of the information they are seeking. Because taxonomies are integral to finding and filtering content effectively, their significance is expected to grow as well.

During the December CIDM managers roundtable, members talked to us about how taxonomy usage and development fit into their existing and future content strategy.

Only a small number of managers spoke during the session, but they represented companies with a breadth and depth of experience implementing and managing taxonomies in their organizations. Of the 15 participants on the call, it appears that many joined to “listen” and “learn.”

Since each of the member companies whose managers contributed to the conversation already has an ontology strategy in place, they spoke about the business drivers that compelled them to adopt a taxonomy, the challenges they encountered during design and implementation, and their tooling environment.

Business Drivers

One CIDM member company has both a corporate product taxonomy and business unit-specific taxonomies. After the company acquired three additional companies, its manager had to revisit the classification deployed to the new documentation web site to align the disparate functional areas and types of documents. It took her nearly a year to get agreement among the business units (BUs) on how they would define a product and to address the difference in granularity (some included the product release value in the taxonomy, while her organization does not). Although the discussions are still ongoing, the BUs have agreed to map alignments between the products and releases.

Another member company has a full ontology and thesaurus strategy deployed to their documentation site, which they implemented over a year ago. Because of the thesaurus capability of their ontology management tool, they are able to create alignment between disparate product names (some product groups utilize acronyms or refer to the product by a previous name) by creating different labels, one that a writer sees when they are tagging content and one that users see when they access documents on the site. Their strategy is to start with technical product documentation and then roll out the ontology to other areas across the enterprise in the upcoming months.
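The dual-label idea described above can be sketched in a few lines. This is a minimal illustration, not the member's actual tool: the `Concept` class, its field names, and the product names are all hypothetical, and a real ontology management tool would typically store such labels in its own model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One product concept with separate labels for writers and site users."""
    concept_id: str
    authoring_label: str  # what a writer sees when tagging content
    display_label: str    # what a user sees on the documentation site
    alt_labels: set[str] = field(default_factory=set)  # acronyms, previous names


def build_index(concepts):
    """Map every known label (case-insensitive) to its concept."""
    index = {}
    for concept in concepts:
        labels = {concept.authoring_label, concept.display_label, *concept.alt_labels}
        for label in labels:
            index[label.casefold()] = concept
    return index


def resolve(index, term):
    """Return the concept a term refers to, or None if the term is unknown."""
    return index.get(term.strip().casefold())


# Hypothetical product: all names are invented for illustration only.
concepts = [
    Concept(
        concept_id="prod-001",
        authoring_label="Example Data Gateway (EDG)",
        display_label="Example Data Gateway",
        alt_labels={"EDG", "Legacy Gateway"},
    )
]
index = build_index(concepts)
print(resolve(index, "edg").display_label)          # Example Data Gateway
print(resolve(index, "Legacy Gateway").concept_id)  # prod-001
```

Because an acronym and a previous name both resolve to the same concept ID, content tagged by different product groups lands under one product on the site.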

Benefits and Purposes

According to one member, deploying a product taxonomy as metadata in the authoring environment has enabled one-click publishing, resulting in a major gain in process efficiency. Previously, writers had to copy, upload, and download files to and from multiple locations for publications to appear successfully on their delivery platform. Now writers simply choose a single value from an enumerated list to publish the content to the documentation portal.

Another member told us about how their taxonomy allows writers to find existing content in the content management system for reuse. Also, the same search and retrieval mechanism allows the organization to easily identify duplicate content within the repository.

One key benefit mentioned is that tagging content by business-critical metadata such as industry, subject, product, product family, and content types facilitates effective search query retrieval and filtering via facets on the delivery platform.
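As a rough sketch of how metadata tags can drive faceted filtering, the example below filters a small set of tagged documents and counts the remaining values of one facet. The documents, field names, and functions are hypothetical; a real delivery platform would normally delegate this work to its search engine.

```python
from collections import Counter

# Hypothetical documents tagged with metadata fields like those named above.
DOCS = [
    {"title": "Install Guide", "product": "Router X", "industry": "Telecom", "content_type": "task"},
    {"title": "API Reference", "product": "Router X", "industry": "Telecom", "content_type": "reference"},
    {"title": "Overview", "product": "Sensor Y", "industry": "Energy", "content_type": "concept"},
    {"title": "Setup Steps", "product": "Sensor Y", "industry": "Energy", "content_type": "task"},
]


def filter_docs(docs, **selected):
    """Keep documents whose metadata matches every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]


def facet_counts(docs, facet):
    """Count how many documents fall under each value of one facet."""
    return Counter(d[facet] for d in docs if facet in d)


tasks = filter_docs(DOCS, content_type="task")
print([d["title"] for d in tasks])           # ['Install Guide', 'Setup Steps']
print(dict(facet_counts(tasks, "product")))  # {'Router X': 1, 'Sensor Y': 1}
```

The counts are what a portal would show next to each facet value after the user narrows the results.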

An interesting future benefit described by a member is the notion of training an artificial intelligence (AI) engine to auto-classify content according to a rigorously defined set of taxonomies in order for a chat or other bot to be able to deliver multiple microcontent responses to a user as proactive assistive content. In this scenario, the entire content corpus becomes in essence a part of the interactive bot experience, where the bot delivers not just a single piece of microcontent in response to a query, but a series of pieces of information gathered from disparate parts of the entire support or product content. Think of how a user might need to execute a sequence of steps within a task and in doing so might use different products, or the information they need might reside in multiple publications.

At least one member uses the taxonomy to track and measure the overall success of the content on their site. Rather than using a single metric, such as a Customer Satisfaction score, click-through rate, or search query value, his organization breaks down what it measures for different kinds of content by creating what they call “recipes.” Each recipe is a list of 3-6 “ingredients,” or metrics, applied to content sliced by industry, product, or content type to determine how well that pre-sales or post-sales content is meeting customer needs.
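The member did not describe how the recipes are calculated, so the following is only one possible reading: each slice of content gets its own list of metrics, and only those metrics are averaged over the documents in that slice. The recipe contents, metric names, and numbers are all invented for illustration.

```python
from statistics import mean

# Hypothetical recipes: each content type gets its own list of metrics ("ingredients").
RECIPES = {
    "task": ["csat", "click_through_rate", "search_success"],
    "reference": ["click_through_rate", "search_success", "time_on_page"],
}

# Hypothetical per-document measurements.
MEASUREMENTS = [
    {"content_type": "task", "csat": 4.0, "click_through_rate": 0.30,
     "search_success": 0.80, "time_on_page": 120},
    {"content_type": "task", "csat": 3.0, "click_through_rate": 0.10,
     "search_success": 0.60, "time_on_page": 90},
    {"content_type": "reference", "csat": 5.0, "click_through_rate": 0.50,
     "search_success": 0.90, "time_on_page": 60},
]


def apply_recipe(measurements, slice_key, slice_value, recipes):
    """Average only the recipe's ingredients over the documents in one slice."""
    rows = [m for m in measurements if m[slice_key] == slice_value]
    if not rows:
        return {}
    return {name: mean(r[name] for r in rows) for name in recipes[slice_value]}


report = apply_recipe(MEASUREMENTS, "content_type", "task", RECIPES)
print(report)
```

Keeping the recipe separate from the measurements means a different slice, such as an industry or a product, can reuse the same data with a different list of ingredients.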


Challenges

One member articulated her main challenge as the fragility of her site’s ontology tool. Taxonomy management tool vendors tend to be smaller companies without global 24/7 support, so the opportunity to engage with the vendor is limited to a short window each day, which results in downtime. The member company translates the taxonomy into twenty languages, and some of the issues have involved the vendor’s language packs and glitches in back-end auto-classification.

Two members brought us up to speed on the challenges they experience with alignment across business units or other content domains, especially when it comes to terms and product labels. One member described it in this way: “The larger and more distributed your organization is, the more silos you have to cross-pollinate and align with.” Eventually, the solution will require pulling the alignments together and then pushing them into each respective system and tool.

Agreeing on the definition of a product is a challenge experienced by another member. In her part of the organization, hardware products use software to access features and operate functionality, but some other parts of the company view software products as separate categories from hardware. She invested a lot of time communicating with other functional areas and was able to get agreement on her definition, in which hardware products can contain software products in the new product taxonomy. This product-within-a-product classification drove the facets on the documentation portal, which now make it quick and easy to find the appropriate software documentation.

Implementation and Tooling

Interestingly, each of the three members who contributed to the conversation has taxonomy and metadata management tools that sit between their content management system and their delivery platform. Two companies have homegrown tools for management tasks such as DITA subject scheme generation, metadata changes, and the extraction of metadata for processing upon publication. These tools provide sophisticated management, and they bypass challenges associated with changing metadata fields in a CMS. One company has automated the metadata tagging process, so writers never have to apply values to their content.
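To give a sense of what DITA subject scheme generation involves, here is a minimal sketch that turns a small product hierarchy into a subject scheme map using Python's standard library. It is not either member's tool: the taxonomy data and function names are hypothetical, and the output omits the DOCTYPE declaration and other details a production map would need.

```python
import xml.etree.ElementTree as ET

# Hypothetical product hierarchy: key -> (label, child keys).
TAXONOMY = {
    "products": ("Products", ["router_x", "sensor_y"]),
    "router_x": ("Router X", []),
    "sensor_y": ("Sensor Y", []),
}


def add_subject(parent, key, taxonomy):
    """Append a <subjectdef> for key, then recurse into its children."""
    label, children = taxonomy[key]
    node = ET.SubElement(parent, "subjectdef", keys=key)
    meta = ET.SubElement(node, "topicmeta")
    ET.SubElement(meta, "navtitle").text = label
    for child in children:
        add_subject(node, child, taxonomy)
    return node


def build_subject_scheme(taxonomy, root_key, attribute):
    """Build a subject scheme that binds one attribute to the root subject."""
    scheme = ET.Element("subjectScheme")
    add_subject(scheme, root_key, taxonomy)
    # Bind the attribute so its allowed values are the subjects under root_key.
    enum = ET.SubElement(scheme, "enumerationdef")
    ET.SubElement(enum, "attributedef", name=attribute)
    ET.SubElement(enum, "subjectdef", keyref=root_key)
    return scheme


scheme = build_subject_scheme(TAXONOMY, "products", "product")
ET.indent(scheme)  # pretty-print; requires Python 3.9 or later
print(ET.tostring(scheme, encoding="unicode"))
```

Generating the map from a single source of taxonomy data is one way to keep the enumerated values writers see in step with the taxonomy, without editing metadata fields in the CMS by hand.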

During the design phase of their respective taxonomy development projects, these members used a visual representation of the taxonomy in the form of a mind map, knowledge graph, or web visualization. Each said that having a non-XML version of the hierarchy was especially effective in getting the buy-in and input they needed from their stakeholders.

We had an interesting and engaging roundtable discussion. Designing and implementing a taxonomy is a project with a great return on investment, but as we learned from our three members who have gone through the process, it takes time, commitment, engagement at the executive and stakeholder level, and lots and lots of communication. If you want to jump into the taxonomy waters, one member suggested that getting alignment on your organization’s content types is a great place to start. Comtech Services and other CIDM members are here to help.