Andrea Stevens, Comtech Services, Inc.

The information age is upon us. We have five times as many words in the English language today as existed in the time of William Shakespeare. In the year 1900, human knowledge doubled every 100 years; today it doubles every 13 months. Even our economy has undergone a shift: we have moved away from an economy based on traditional industry to an economy based on digital information. It would be relatively easy to fill this article with statistics about how much technology has changed the way we create and present information; however, as information architects, we also need to concern ourselves with the way we manage that information.

Think of a game of 52-card-pickup, particularly when you’re playing it with someone who has no idea what the game is. You have the cards ready to go, and the other “player” is waiting for a rundown of the rules, and bam! You let the cards fly. It’s especially satisfying when the cards launch into the other person’s face, and she’s sitting there looking shocked up until she starts to look angry and you remember you’ve been playing this game with your mother. Then it becomes less 52-card-pickup and more of a race for your life as you sit through a lecture about why valid card games do not involve flinging projectiles.

Information is like that: we, the information developers, have the information customers want. Customers wait expectantly for that information to be presented so that they can navigate it at their leisure, guided by their own experience with the topics in question. Instead, they often find themselves facing the aftermath of a game of 52-card-pickup that they weren’t even present to play.

A common and effective solution has been metadata. By tagging information, we let customers search by keyword and find content. But what if users don’t know exactly what they’re looking for, or don’t know which terms to use as search parameters? How can we get them through all the information we produce and to the exact item they need?

It comes down to organization. In my primary function as a high school teacher, I play a game with my debate students to teach them how to take organized notes and recognize the importance of developing a shorthand system. It is, essentially, an extended exercise of “pick-a-card.” I shuffle the cards, remove one and keep it hidden, and then proceed to read out the cards one at a time at varying speeds. The goal is for students to identify, through the cards they write down, which card has been removed. The first time through, they don’t know what to write or how to organize themselves. The second time through, however, most of them have developed a system that ultimately defeats the point of my exercise, but that perfectly demonstrates the need for information organization: they create a table. They have the suits and the individual card values and as they hear each card read, they check off the card on their table. Only one card can fill the value and the suit exactly, just as only one piece of information can fulfill a user’s exact needs.
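The students' check-off table can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the classroom exercise itself; the function name `find_missing_cards` and the way the deck is simulated are assumptions made for the example.

```python
import random

SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]
VALUES = ["Ace", "2", "3", "4", "5", "6", "7", "8", "9", "10",
          "Jack", "Queen", "King"]


def find_missing_cards(cards_read):
    """Check off each card on a suit-by-value table; return the unchecked ones."""
    # One cell per (value, suit) pair, all unchecked to start.
    table = {(value, suit): False for suit in SUITS for value in VALUES}
    for card in cards_read:
        table[card] = True  # check the card off as it is read aloud
    return [card for card, seen in table.items() if not seen]


# Simulate the exercise: shuffle, hide one card, read out the rest.
deck = [(value, suit) for suit in SUITS for value in VALUES]
random.shuffle(deck)
hidden_card = deck.pop()

missing = find_missing_cards(deck)
print(missing)  # a one-element list holding the hidden card
```

Because every card has exactly one cell in the table, the order and speed at which the cards are read no longer matter.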

That one piece of information – that specific card in the deck – has metadata that describes it: Bicycle, Hoyle, red, black, suit, value, and so on. We all know about using metadata to tag information to help people search. What remains is to push that metadata into a physical organization form as shown in Figure 1.


Figure 1: Metadata in physical organization form

At the most basic level, we can, and probably do, organize our metadata using a controlled vocabulary: a predefined list of terms selected by the information designers. The problem is that people outside the organization don’t necessarily know what terms to use. We have to allow for synonyms, and while a controlled vocabulary does allow for synonym use, it allows only one synonym per term.
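As a rough sketch of that limitation, a controlled vocabulary can be modeled as a lookup from each preferred term to its single permitted synonym. The terms below and the function name `to_preferred_term` are hypothetical, chosen only to illustrate the point.

```python
# Hypothetical controlled vocabulary: preferred term -> its one allowed synonym.
CONTROLLED_VOCABULARY = {
    "squirt gun": "water pistol",
    "patio furniture": "outdoor furniture",
}


def to_preferred_term(search_term):
    """Return the preferred term for a search term, or None if it is not listed."""
    term = search_term.strip().lower()
    for preferred, synonym in CONTROLLED_VOCABULARY.items():
        if term in (preferred, synonym):
            return preferred
    return None  # the searcher used a word the designers did not anticipate


print(to_preferred_term("Water Pistol"))   # squirt gun
print(to_preferred_term("water blaster"))  # None
```

The second lookup fails even though a person would recognize the term, which is the gap that richer structures try to close.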

Going up the complexity scale presents an advantage in organizing information. Taxonomy, thesaurus, and ontology are all very similar organizational structures, and, depending on the needs of your organization, one may be better than another; however, since these structures follow similar principles, I’m going to focus on the simplest of the three: taxonomy. Taxonomy refers to the hierarchical organization of information and of the words that people search for. We have preferred terms that can be dictated by our industry, our geography, and even our corporate identity. Despite these prescribed factors, how you categorize and present a taxonomy remains open to interpretation and depends on several key decisions.

One of the first decisions you must make about taxonomy is what form you wish to use. There are six common taxonomical forms you might choose to use both in your organization and in the interface the person sees. The most common taxonomical form is a hierarchy, which is a highly structured tree that provides many broad categories, more focused sub-categories, and numerous final “destinations” of the individual topic a person is looking for (see Figure 2).


Figure 2: Sample hierarchy

A hierarchy can be an effective tool, especially for a user interface. In fact, we interact with these types of hierarchies on a regular basis. Think of an online shopping catalog. Right now, in the midst of summer, a store might choose to display “summer” as a category of items that includes broad sub-categories: pool toys, outdoor toys, patio furniture, and patio accessories. As a shopper, you might wonder what the difference is between pool toys and outdoor toys. Aren’t many pools outdoors? Is there overlap? Is there not?

With a strict hierarchy, there will not be any overlap. Consider the example hierarchy tree shown above. A heart cannot be a diamond; the Ace of Hearts is always the only Ace of Hearts, unless you’re cheating. In the case of the online shopping catalog, if you’re the organizer of the information, you have to decide if pool toys are different from outdoor toys and not a sub-category of outdoor toys, and you need to consider the assumptions a shopper would make. Where would a shopper look for a squirt gun? Yes, a squirt gun makes a fun pool toy, but is it pool-specific? Probably not. Therefore, you might put it under outdoor toys instead.
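One way to picture a strict hierarchy is as a table in which every item records exactly one parent. The sketch below uses the shopping-catalog categories from the example; the names `PARENT`, `place`, and `path_to` are assumptions made for illustration.

```python
# Each category or item maps to its single parent category.
PARENT = {
    "pool toys": "summer",
    "outdoor toys": "summer",
    "patio furniture": "summer",
    "patio accessories": "summer",
}


def place(item, parent):
    """File an item under one parent; a strict hierarchy refuses a second one."""
    if item in PARENT:
        raise ValueError(f"{item!r} already belongs to {PARENT[item]!r}")
    PARENT[item] = parent


def path_to(item):
    """Walk from the item up to the top category and return the click path."""
    path = [item]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return " > ".join(reversed(path))


place("squirt gun", "outdoor toys")
print(path_to("squirt gun"))  # summer > outdoor toys > squirt gun
```

Trying to call `place("squirt gun", "pool toys")` afterward raises an error, which is the "no overlap" rule expressed in code.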

Something like a squirt gun, however, might not be best categorized as just an outdoor toy or just a pool toy. It might be both. In this case, a poly-hierarchy might be useful. A poly-hierarchy allows items to overlap, recognizing that different people, and maybe even different competitors, might organize items differently than you do. For example, if I need to buy tuna for my cats, I might choose to include it under the category of fish or seafood, in turn putting that under the broader category of meat; however, I might also choose to put it under the category of canned meat, which belongs to the canned food category. This one item can be reached by two routes, and a poly-hierarchy lets a customer find information using multiple approaches. A poly-hierarchy can decrease customer frustration, but it might lead to greater confusion within your own organization. You have to be on top of your tagging procedures and tag information multiple times. If even one tag is missed, customers may miss critical information about your service or product because they don’t know the other categories it might fall under.
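In code, the only structural change from a strict hierarchy is that an item may list more than one parent. The following minimal sketch uses the tuna example; the names `PARENTS` and `all_paths` are hypothetical.

```python
# Each item maps to a list of parents; more than one parent is allowed.
PARENTS = {
    "tuna": ["seafood", "canned meat"],
    "seafood": ["meat"],
    "canned meat": ["canned food"],
}


def all_paths(item):
    """Return every route from a top-level category down to the item."""
    parents = PARENTS.get(item, [])
    if not parents:
        return [[item]]  # a top-level category is its own path
    return [path + [item] for parent in parents for path in all_paths(parent)]


for route in all_paths("tuna"):
    print(" > ".join(route))
# meat > seafood > tuna
# canned food > canned meat > tuna
```

If the "canned meat" tag were left off the tuna entry, the second route would silently disappear, which is the tagging risk described above.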

Because hierarchical forms are not good at showing gaps in inventory, a hierarchy might not be the best choice for producers of information. Instead, you might consider using a different taxonomical form, such as a matrix.

A matrix works to compare a few specific descriptors of documents. You indicate where the topics you have fit on the matrix, and you learn where you have gaps in specific pairs and small groups of characteristics. Matrix building allows you to decide if you’ve overlooked a specific set of customer needs. Such analysis and organization of a taxonomy can lead to decisions about future strategies for your company, especially concerning competitors in your industry. For instance, if you are in the automotive industry, you must be aware of what vehicles you offer compared to other makers (see Figure 3).


Figure 3: Matrix sample
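The gap analysis a matrix supports can be sketched as follows. The two descriptors (body style and powertrain) and the list of offerings are invented for illustration and are not taken from the figure; `find_gaps` is a hypothetical helper name.

```python
from itertools import product

# Hypothetical descriptors forming the two axes of the matrix.
BODY_STYLES = ["sedan", "SUV", "truck"]
POWERTRAINS = ["gasoline", "hybrid", "electric"]

# Hypothetical cells that the company's current offerings already fill.
OFFERINGS = {
    ("sedan", "gasoline"),
    ("sedan", "hybrid"),
    ("SUV", "gasoline"),
    ("SUV", "electric"),
    ("truck", "gasoline"),
}


def find_gaps(rows, columns, filled):
    """Return every (row, column) cell of the matrix that nothing fills."""
    return [cell for cell in product(rows, columns) if cell not in filled]


gaps = find_gaps(BODY_STYLES, POWERTRAINS, OFFERINGS)
for body_style, powertrain in gaps:
    print(f"No {powertrain} {body_style}")
```

The same function works for documentation: swap the axes for, say, product and audience, and the empty cells show which topics have not been written.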

The matrix might be really useful for you as a company, but when it comes to looking at what the customer interacts with, it may not be best for examining a large quantity of characteristics at once. If your products, services, or documentation fits many categories and has been tagged with many attributes, organizing by facets is probably the most effective taxonomical form. Facets allow customers and employees to search for the information or sort the information by those attributes. While establishing facets may be more work for you initially, in that you must tag information many times with completely unique and mutually exclusive attributes, it produces a customizable feel for customers who have different jobs, roles, and organizational priorities.

Facets work in conjunction with hierarchies to change the customer’s interaction with them. Each facet serves as a parent grouping for items that share that attribute, rather than limiting each item to only one parent. Let’s go back to the squirt gun example. I can click through the hierarchical list, or I can use facets to bypass clicking through the list and instead use the metadata tags to my advantage. After searching for “squirt gun” and getting a list of results, I can choose to filter and sort these results to find exactly what I’m looking for. If my purpose is to give party favors, then I will filter my search by the price facet. On the other hand, if I’m hoping for an epic water battle, I will filter by popularity or guest rating (see Figure 4).


Figure 4: Sample facet
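Faceted filtering and sorting of the search results might look like the sketch below. The product records, prices, and ratings are made up for the example, and `filter_results` and `sort_results` are hypothetical helper names rather than any particular catalog's API.

```python
# Hypothetical search results for "squirt gun", each tagged with facet values.
PRODUCTS = [
    {"name": "Mini squirt gun 12-pack", "price": 9.99, "rating": 3.9},
    {"name": "Pump-action soaker", "price": 24.99, "rating": 4.8},
    {"name": "Classic squirt gun", "price": 4.49, "rating": 4.2},
]


def filter_results(items, max_price=None, min_rating=None):
    """Keep only the items that satisfy every facet limit that was supplied."""
    results = []
    for item in items:
        if max_price is not None and item["price"] > max_price:
            continue
        if min_rating is not None and item["rating"] < min_rating:
            continue
        results.append(item)
    return results


def sort_results(items, facet, descending=False):
    """Order the items by the value of one facet."""
    return sorted(items, key=lambda item: item[facet], reverse=descending)


# Party favors: cheap items first.
party_favors = sort_results(filter_results(PRODUCTS, max_price=10.00), "price")
# Epic water battle: best-rated items first.
water_battle = sort_results(
    filter_results(PRODUCTS, min_rating=4.0), "rating", descending=True
)

print([item["name"] for item in party_favors])
print([item["name"] for item in water_battle])
```

The same three records answer both shoppers, because the facets are attributes of the items rather than positions in a tree.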

Hierarchies, poly-hierarchies, matrixes, and facets are all valid taxonomical forms, and all have pros and cons, depending on your organization, your customers, your deliverables, your industry, and even your subject matter. These five considerations are the five branches of taxonomy development, and each one influences which taxonomical form might be used for that particular branch.

Take the case of customers, for instance. Customers are likely going to make up a fairly short and specific group. The variations in customers and customer roles might best be expressed as a list, which, as a taxonomical form, is not really appropriate for use with large quantities of information. However, for describing your customer roles, a list could be perfect since it is easily scannable and focuses on giving the gist or highlights of the information at hand. On the other hand, given the breadth of particular industries, a poly-hierarchy system may provide the best access to terminology used across the globe within that industry, such as finance or medicine or telecommunications. Of course, the way your company fits within the industry will change the terminology your taxonomy expresses, but you will still need to identify those terms, guiding the customers to your preferred terms in the information.

Whatever your industry and purpose, taxonomy can drive usability and increase customer satisfaction. The different branches of taxonomy development may dictate the exact words, terms, and categories your company might use for organizing information, but all this tells you something you already know: information is complex, and metadata matters now more than ever. After all, no one likes being the person who has to clean up after a game of 52-card-pickup. Not even with squirt guns.

Whether your organization uses DITA or a form that adheres to other standards, such as OWL and RDF, Comtech is pleased to offer a comprehensive introductory course in taxonomy development. This course covers the differences between taxonomies, ontologies, and thesauri; basics of taxonomy design; the five branches of taxonomy; testing your taxonomy; and even some useful resources to consider as starting points when making a build-or-buy decision.