Taxonomy vs. Controlled Vocabulary

With the recent announcement of “DAM and the Tao of Taxonomy,” which is the next webinar I’ll be hosting, I’ve been getting questions about my take on the differences between taxonomies and controlled vocabularies. I’ll offer below the most basic distinction I can muster, but for much more detailed information from a true authority on this subject, please join me for “DAM and the Tao of Taxonomy” on August 29, 2012. My guest will be David Riecks. David has won a US Library of Congress “Pioneer of Digital Preservation” award, and he’s also a judge for Createasphere’s annual DAMMY Awards.

I’ve been impressed by David’s knowledge and DAM common sense since the first time I spoke with him years ago. There’s no way you’ll come away from this webinar without learning something. We love David.

Okay, let me put Taxonomy vs. Controlled Vocabulary in simple terms: When I think of taxonomy, I think of structure, organization and hierarchical “place.” In other words, if my digital asset was a physical object, my “taxonomy” would describe where I might store that object. Example:

  • Home
    • Kitchen
      • Pantry
      • Junk drawer (we all have one, right?)
      • Refrigerator
    • Bedroom
      • Closet
      • Under the bed
      • Secret location that’s none of your business and you’ll never find it
  • Work
    • My office
      • File cabinet
      • Bookshelf
    • Conference room
      • Storage cabinet
      • Center of the big shiny table

The taxonomy of my life’s structure would help me decide where the object belongs. For example, there is no entry in the hierarchy for “Airplane,” and that’s because DAM Survival Guide needs to sell a few more copies before that will be added.

Keep in mind that in the digital world, an “object” can fit in many different locations, so we assign it everywhere it fits. The point of our chosen taxonomy is that there is a pre-defined structure that we use consistently across our organizations or industries. After all, one person’s “secret location that’s none of your business and you’ll never find it,” might be another person’s “place I should have hidden better from the kids.”
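
For readers who think in code, the idea can be sketched roughly like this. It is only an illustration: the nested-dictionary layout, the sample asset name and the names `TAXONOMY`, `is_valid_path` and `assign` are made up for this example and are not how any particular DAM stores its taxonomy.

```python
# Hypothetical taxonomy, abbreviated from the household example above.
# Each key is a place; its value holds the places nested inside it.
TAXONOMY = {
    "Home": {
        "Kitchen": {"Pantry": {}, "Junk drawer": {}, "Refrigerator": {}},
        "Bedroom": {"Closet": {}, "Under the bed": {}},
    },
    "Work": {
        "My office": {"File cabinet": {}, "Bookshelf": {}},
        "Conference room": {"Storage cabinet": {}},
    },
}


def is_valid_path(path, tree=TAXONOMY):
    """Return True if every step of the path exists in the taxonomy."""
    node = tree
    for step in path:
        if step not in node:
            return False
        node = node[step]
    return True


def assign(asset, paths, placements):
    """Record every taxonomy location an asset belongs to.

    Locations that are not part of the pre-defined structure are rejected.
    """
    for path in paths:
        if not is_valid_path(path):
            raise ValueError(f"Not in the taxonomy: {' > '.join(path)}")
    placements[asset] = [tuple(path) for path in paths]


# One asset, assigned everywhere it fits.
placements = {}
assign(
    "spare-keys.jpg",
    [("Home", "Kitchen", "Junk drawer"), ("Work", "My office", "File cabinet")],
    placements,
)
```

Trying to assign something to `("Airplane",)` raises an error here, because that place is not in the structure yet.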

By this time you might have correctly guessed that my “controlled vocabulary” is the list of keywords I use to describe the objects of my life. (Taxonomy is where they go; keywords are what they are.) A “controlled vocabulary” is just some technological way of limiting your keyword choices to only those previously approved for use. This way, your metadata editors and users don’t have to guess whether you were thinking of an “airplane,” “aircraft,” “jet” or “big flying thing.”
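
A controlled vocabulary can be sketched just as simply. Again, this is a minimal illustration under assumptions: the approved list, the lowercase normalization and the function name `check_keywords` are invented for the example, and a real DAM would typically enforce this through its own metadata configuration.

```python
# Hypothetical list of approved keywords (the controlled vocabulary).
APPROVED_KEYWORDS = {"airplane", "refrigerator", "bookshelf"}


def check_keywords(candidates, approved=APPROVED_KEYWORDS):
    """Split candidate keywords into approved and rejected lists.

    Keywords are trimmed and lowercased before they are compared,
    which is an assumption made for this sketch.
    """
    accepted, rejected = [], []
    for keyword in candidates:
        cleaned = keyword.strip().lower()
        if cleaned in approved:
            accepted.append(cleaned)
        else:
            rejected.append(cleaned)
    return accepted, rejected


# "Airplane" is on the approved list; "big flying thing" is not.
accepted, rejected = check_keywords(["Airplane", "big flying thing"])
```

The metadata editor only ever gets to save what ends up in `accepted`, so nobody has to guess which term was intended.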

When used together, taxonomies and controlled vocabularies enable you to design a killer, well-organized DAM that will be so much easier to use and maintain.

Keep in mind, this is how I see the differences. Others use taxonomy structure no differently than they do keywords.

Join David and me for the “DAM and the Tao of Taxonomy” webinar and find out why my explanation above only scratches the surface of possibilities for these methodologies and technologies.


  1. David, what if you have a fridge in your office as well? I would group your example like

    Storage Type
    – Pantry
    – Fridge
    – Closet
    – Bookshelf
    – Junk drawer
    – Work
    – Home
    – Sport
    – Kitchen
    – Bedroom
    – Office
    – Conference room

    With such a structure I don’t have to know that you have a junk drawer in your kitchen. I just look up junk drawer. If your junk is organized, I would find it with the combination of junk drawer and kitchen. But maybe also with junk drawer and office.

    If you get an airplane, you just add it to place. And I’m sure, the airplane has a cool beer in a fridge too. So the beer will be added to airplane and fridge under storage type. Maybe you will need another item to find the beer?!

    – Beer
    – Coke
    – Soda

    Such a structure is very powerful and flexible and can be adapted to the customer’s needs in no time.

    1. Helloooo Jacques! Great points, and I like the way you think. This offers a great example of why it’s so important for an organization to really think these things through–there are so many ways to organize things, and people see things differently. Now, where I would jump at an organizational structure like yours is if I had fully functional faceted searching at my disposal. In that case, for example, your structure would permit me to have a virtual location (paradise), such as a refrigerator in the bedroom. But this does add another layer of complexity that could cause user confusion if the labeling isn’t done with careful precision. Example: In your example, “Kitchen” and “Bedroom” are both “Places,” which makes total sense. But wouldn’t I consider my “Home” and “Work” to be places too? For that matter, the “Bookshelf” and “Closet” are also places.

      The key, I think, lies in knowing how your user base thinks. For organizations that, for example, are used to “canned” taxonomies that reflect known subject matter, flexibility like this might do nothing more than confuse everyone. Consider animal appendages:

      Living Things
      – Human
      – Canine

      Stuff that Sticks Out
      – Leg
      – Tail

      Using the Jacques super-killer-flexible structure that imposes no limits, you might have people searching for Humans with Tails. Finding only lawyers and politicians (and a few marketing people), those users would be confused. But if “Tail” was a subset of Canine that didn’t appear under “Human,” you’d immediately know that “Humans with Tails” wasn’t a viable combination of terms.

      But with regard to the beer in the airplane, Vince will tell you, it’s “8 hours from the bottle to the throttle” or else no flying for you. 😉

      Thank you for the fantastic perspective on this!

  2. Totally agree – the whole topic is complicated and has to be thought through. I have had many emotional discussions with clients about this topic. For many users it is hard to let go of THEIR logical structure. But their logical structure works for them, the marketing department, but not for the journalist or the service man on his way to fix the coffee machine. They just think and search differently.

    The key for this “All in one Box” solution is that you name your main tags in a way that doesn’t confuse the user. In fact, those few main tags should guide the user. In this example “Place” was indeed not a good choice.

    All of us are using this kind of “All in one Box” when searching for something on Amazon, iTunes or any other webshop. You might search for Movie, Actress, Comedy and you could end up with a movie with Jennifer Lopez. But you could also search for Music, Latin and Female …… Jennifer Lopez will be there as well 😉

  3. Hi David,
    Really great, simple explanation of the differences between taxonomy and controlled vocabulary. Always great to keep it simple.

  4. I agree that a ‘controlled vocabulary’ is defining the terms used in an organization to refer to an object. It’s basically the tag to be used for something that I could refer to with other terms as well.

    I also agree that taxonomy (preferably) puts terms of a ‘controlled vocabulary’ in a hierarchical relationship, broadening or narrowing the term along the line of hierarchy. The design of these controlled vocabulary terms and their hierarchical relationship makes some solutions fail and others successful and sustainable.

    Another term frequently used is ‘thesaurus’, which one can separate mainly into hierarchical and equivalent thesauri (and, to a lesser extent, associative ones).

    By definition, a ‘hierarchical thesaurus’ isn’t really different from taxonomy. But an ‘equivalent thesaurus’ is different – and its availability makes a solution far more powerful (I think you meant that when talking of the “airplane,” “aircraft,” “jet” or “big flying thing”):

    The recently joined editor who enters ‘aircraft’ will automatically tag the asset with the controlled vocabulary term ‘plane’ too because this is the controlled vocabulary term defined to be used for ‘aircraft’ – besides others.

    And the searching user will find that plane asset too when searching for “big flying thing”.

    1. Great point, Ramon. The concept of a DAM being “smart” when it comes to the application of synonyms is one of the directions I hope we see DAMs head in the near future. This not only opens up possibilities for same-language synonyms, but this could also be a “gateway drug” to automated translation services. Right now, translations can be so horrible because, without proper context, the engine chooses wrong, often laughable terms. But with this semantic context in place, we might see some serious improvements in that area. I have a feeling David Riecks will be speaking to this issue a bit during our webinar later this month.

  5. Excellent point, David and thanks for the discussion topic. It seems to me like there are a lot of benefits to being able to take a flexible approach to Taxonomies and/or Controlled Vocabularies. It may be important to structure the language used to describe digital objects, especially when dealing with really large data structures. Scientific datasets might fit this category because of their size and their audience. It wouldn’t be good to have people adding a plant species somewhere into kingdom Animalia, for obvious reasons. On the other hand, for the search audience, students, for example, might not be able to know where to go, even with a faceted search interface, so it would be useful to have a Google-like search that was smart enough to know common names and other ways to find what someone is searching for. Ideally, it seems there should be multiple ways to serve multiple audiences. The trick would seem to be finding a way to easily guide the user to the search method that best meets their needs.

