Originally posted: Wed, 13 Oct 2004 06:26:10
I teach a usability course nearly every year -- this time it was on accessibility and usability -- and every year I dither about what book to use. Should I use Carol Barnum's book Usability Testing and Research, a book for which I have much admiration, even though I contributed perhaps the most awkward recommendation statement ever? Should I instead use Dumas and Redish's A Practical Guide to Usability Testing, a venerable old tome? Or should I pick up something by Jakob Nielsen?
This year I returned to Jeff Rubin's book, which has been around for a while but still seems to have the most concise and readable discussion out there. It doesn't have the high quality of examples that Barnum has, nor the up-to-date cachet of Nielsen's work. But it does get into the guts of usability methodology, and in a more rigorous way than the others do. And since my students have a usability lab for testing websites, something with a focus on methods seemed like the best choice.
Rigor, of course, is not something I normally associate with usability testing. Contemporary usability tests are descended from orthodox experimental psychology (in fact, Rubin's training was in that area), and it used to be that a "real" usability test was essentially an experiment oriented to a product, with N>=8 for each condition, multiple conditions, etc. (e.g., the Bellcore work on SuperBook). But different encrustations have developed, the most noteworthy being that testing often occurs with relatively small Ns and often no alternate conditions at all. In describing why the small N has caught on, Rubin repeats a classic fallacy first perpetrated by Jakob Nielsen. Let's call it the Easter egg fallacy.
The Easter egg fallacy goes something like this. Any given product has a finite number of problems with it. Chances are that you won't find every single problem with a single test, since users have varying levels of experience and varying backgrounds. But you'll find a good number of them. With N=5 or 6, you'll find upward of 80% of the problems. So it's not necessarily cost-effective to run tests with larger Ns -- at some point you're bringing in people who aren't really telling you anything you don't already know. Better to run multiple iterative tests with small Ns.
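The "80% with five or six users" figure traces back to Nielsen's cumulative discovery model: if each user independently uncovers any given problem with probability p, the fraction of problems found by n users is 1 - (1 - p)^n. A quick sketch in Python, using p = 0.31, the average detection probability Nielsen commonly cites (an assumption for illustration, not a figure from Rubin's book):

```python
def problems_found(n: int, p: float = 0.31) -> float:
    """Expected fraction of usability problems found by n users,
    assuming each user independently finds any given problem
    with probability p (Nielsen's cumulative discovery model)."""
    return 1 - (1 - p) ** n

# How the curve flattens as N grows:
for n in range(1, 7):
    print(f"N={n}: {problems_found(n):.0%}")
# N=5 lands at roughly 84%, hence the "upward of 80%" claim.
```

Note that the model bakes in exactly the assumption I question below: that problems are fixed, countable objects waiting to be discovered, each with a stable detection probability.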
I don't have anything against multiple iterative tests with small Ns, but I do have a problem with the underlying logic. The assumption is that problems are hidden in the product, like Easter eggs are hidden on a dewy green lawn. How many can we find? But problems are not simply hidden in products; problems are problems because of how the product is enacted in a given environment by given actors. When programmers say with only a little irony that "it's not a bug, it's a feature," they're onto something: problems can be shifted out of their frames, applied differently (not necessarily even idiosyncratically), and can become solutions to still other problems. I recall a study sometime back -- by Rosson, I think -- of an email feature that included a bug that let people do something they wanted to do. That bug actually did turn into a "feature" in the next release, with not much more change than a write-up in the documentation. What I am saying is that the Easter egg fallacy is a structuralist reading, and it doesn't work as well as a constructivist one.
Most usability books have a version of the Easter egg fallacy. But to Rubin's credit, he does encourage larger Ns and multiple conditions, and he does take other measures to ensure rigor (as much as you'll get in a usability test, anyway). His writing is characteristically terse and parsimonious, and that extends to his examples of test plans and test reports. For the money, this book gives a quick, readable, and surprisingly complete introduction to usability testing.