Bill Hackos, PhD
Comtech Services, Inc.

In the course of our consulting practice, we have worked with numerous taxonomy schemes designed to facilitate search and navigation to information on web sites. For several taxonomies, we’ve had the opportunity to talk to and watch users. I thought it might be useful to pass on some of the things we’ve learned.

The concept of taxonomy comes to us from Carl Linnaeus, the 18th-century Swedish scientist who developed the scheme for organizing plants and animals. A typical taxonomy consists of a small hierarchical list of categories, each of which is itself a container for a small list of categories. Each of these, in turn, is a container for a small list of categories, and so on. A non-scientific example might be the category “animal,” consisting of fish, amphibians, reptiles, mammals, and birds. The category “birds” may contain robin, blue jay, cardinal, and so on.
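As a rough illustration of this nesting, the animal example can be sketched as a nested dictionary in Python. The structure and the helper name `choices_at` are assumptions made for this sketch, not a prescribed format; an empty dictionary simply marks a category with no subcategories listed here.

```python
# Hypothetical representation: each category name maps to a dict of its subcategories.
taxonomy = {
    "animal": {
        "fish": {},
        "amphibians": {},
        "reptiles": {},
        "mammals": {},
        "birds": {"robin": {}, "blue jay": {}, "cardinal": {}},
    }
}


def choices_at(tree, path):
    """Return the category names a user chooses among after following `path`."""
    node = tree
    for name in path:
        node = node[name]  # raises KeyError if the path leaves the taxonomy
    return list(node)


print(choices_at(taxonomy, ["animal"]))
print(choices_at(taxonomy, ["animal", "birds"]))
```

Each call returns the short list a user would see at that point in the hierarchy, which is the unit the suggestions that follow are concerned with.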

Here are some of the suggestions from users for creating usable taxonomies.
Taxonomies are much more efficient tools for finding information than full-text searches. Even a poorly conceived taxonomy is better than a full-text search alone, and users find that they quickly learn to use even a poor taxonomy with a little practice. Taxonomies and full-text searches also work well together.
The most efficient taxonomies are “square.” That is, the number of first-level categories is the same as the average number of second-level categories per first-level category. Likewise, the average number of second-level categories per first-level category is the same as the average number of third-level categories per second-level category.
For example, if there are five first-level categories, there should be an average of five second-level categories per first-level category or a total of 25 second-level categories. Likewise, there should be a total of 125 third-level categories. That way a user never has to choose among more than five categories.
If you have 64 metadata values, you should strive for a 4 × 4 × 4 taxonomy. For about 125 metadata values, you should strive for a 5 × 5 × 5 taxonomy. And for about 216 metadata values, you should strive for a 6 × 6 × 6 taxonomy. Experience has shown that it is difficult to achieve a perfectly square taxonomy. In practice, taxonomies tend to be top-heavy, with too many first-level categories.
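The arithmetic behind these targets can be sketched in a few lines of Python. This is a minimal sketch assuming a three-level taxonomy and a uniform number of choices at every level; the function names `square_branching` and `level_sizes` are invented for illustration.

```python
def square_branching(total_values, levels=3):
    """Smallest number of choices per level, b, such that b**levels >= total_values."""
    b = 1
    while b ** levels < total_values:
        b += 1
    return b


def level_sizes(b, levels=3):
    """Total number of categories at each level of a perfectly square taxonomy."""
    return [b ** k for k in range(1, levels + 1)]


for n in (64, 125, 216):
    b = square_branching(n)
    print(n, "values:", b, "choices per level, level totals", level_sizes(b))
```

For a count that is not a perfect cube, the function rounds up. For example, 100 values would still call for five choices per level, with some third-level lists shorter than five.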
Users can easily handle moderately long lists as long as the category names are distinct. Even lists as long as 20 items are easy for users to negotiate if they understand the terminology.
Taxonomies work best in disciplines in which users agree on terminology. They work well for scientists, but less well for a varied audience such as citizens using a city government site, where members of a diverse community may not know legal, financial, or departmental terminology.
The keyword list should be complete at each level. We have found that users like to know they have reached the right category even if no data for this category is available in the site database. If they don’t get this information, they are tempted to keep trying other paths.
Terminology and grammar should be parallel in category lists. For example, you would never put the words “Bird,” “Eagle,” and “Golden Eagle” in the same list, because each is a subcategory of the one before it. This seems obvious, but we see it all the time, and it confuses users.

The following are some helpful references on taxonomies:

Unlocking Knowledge Assets: Knowledge Management Solutions from Microsoft
by Susan Conway and Char Sligar

Taxonomies: Frameworks for Corporate Knowledge
by Jan Wyllie

Taxonomy and Content Classification
A Delphi Group White Paper

Information Intelligence: Intelligent Classification and the Enterprise Taxonomy Practice
A Delphi Group White Paper

Developing and Creatively Leveraging Hierarchical Metadata and Taxonomy
by Christian Ricci