Friday, February 8, 2008
Card Sorting – uncovering the mapping between the computer's display of information and the user's conceptual model of that information; each concept is written on a card and users sort the cards into piles.
Expert Review – a formal review conducted by usability specialists according to common, pre-established usability principles.
Lab Testing – while being observed by usability specialists, users attempt to complete scripted tasks, which take advantage of the functionality of the system.
This post is a quick bulleted list of what I know my users want from an enterprise search engine, based on my study of our search logs, usability studies, user profiles, recent research in the field, information retrieval studies, and some conference presentations. If you have additional suggestions, please include them as comments.
- A single search box, persistently placed on all pages, and wide enough that users can see their whole query and catch typos.
- Google - everyone knows Google and wants internal search to be Google. This doesn't mean the Google appliance. It means quick, accurate and comprehensive search. Quick means sub-five-second response time. Accurate means the right document within the first three results. Comprehensive means every possible document - regardless of which firm silo created it, regardless of which technology holds it, and regardless of whether "it" is a Word document, a zip file, an email, a document on their hard drive or a Documentum folder.
- Some kind of advanced search, even though they will not use it.
- An ability to narrow the search to specific areas of content that are contextual to them. Sometimes these are content types, like Policies, People, Sites. Other times it is my country, my service line, my language, my industry - taxonomy, but without having to call it taxonomy.
- For the tool to let them type in "How do I do an internal audit?" and bring back documents on audit methodology. This doesn't have to mean natural language queries. If you ignore the "How do I do an" and the "?", that query is "internal audit". The system has to strip the "?" and the "How do I do" to run correctly (see the sketch after this list). But it does mean that there needs to be a relationship between internal audit and audit methodology.
- People expect the system to find things using an AND - not a PHRASE and not an OR. People looking for Risk Management in Technology Companies in the UK type in Risk Management Technology UK. The tool needs to understand that behavior and correct for it.
- People expect that the system will correct their spelling.
- People expect that documents with some of the words they searched for in the title will be more relevant, and that documents with some of the words in the summary will be more relevant. They expect the system to know that TAS and Transaction Advisory Services are exactly the same thing, even though to the computer they are not. They expect more recent documents to be more relevant - except when they are looking for older documents, and they want the tool to know the difference.
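To make those earlier points about query handling concrete, here is a minimal sketch of that kind of query clean-up in Python. The stop-word list and function names are illustrative only - they are not taken from any particular engine:

```python
import re

# Illustrative stop words. A real engine would use a fuller list,
# ideally tuned against the firm's own query logs.
STOP_WORDS = {"how", "do", "i", "a", "an", "the", "in", "of", "to"}

def normalize_query(raw_query):
    """Reduce a natural-language query to its content terms.

    "How do I do an internal audit?" -> ["internal", "audit"]
    """
    # Strip punctuation such as the trailing "?" users often type.
    cleaned = re.sub(r"[^\w\s]", " ", raw_query.lower())
    return [term for term in cleaned.split() if term not in STOP_WORDS]

def implicit_and(terms):
    """Join the remaining terms with an implicit AND.

    Users typing "Risk Management Technology UK" expect documents
    containing all of the terms - not the exact phrase, and not any-of.
    """
    return " AND ".join(terms)

print(normalize_query("How do I do an internal audit?"))
# ['internal', 'audit']
print(implicit_and(["risk", "management", "technology", "uk"]))
# risk AND management AND technology AND uk
```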
Those are some of the things that I know people want from search. What do you want from search?
Companies have been selling search engines for decades - IBM started with a product called STAIRS in the late 1960s. Given that long history, you would think there would be a simple answer: do x, look at y, and if it is larger than z, you have a good search engine. Evaluating a search engine is actually a complex problem. Search is a very context-sensitive behavior. In a knowledge environment, the documents that are of interest to you are not the same as the documents that are of interest to me. Search can really only be evaluated within a specific context, for a specific user.
There is, however, a standard for search evaluation, based on a subjective, human understanding of relevance. That standard was set by NIST, which holds an annual event - the Text REtrieval Conference (TREC) - where search engines are tested and tweaked. They use a standard, fairly small set of content. Experts evaluate the content and determine the ideal documents within that standard set. Queries supplied by those experts are then run against the search engine. Precision measures how many documents within the first n results are from the ideal set; recall measures how many of the ideal documents the search engine can find at all. This seemed like a reasonable process, so that is how we measured our impact on the search engine.
We established a test environment – a new instance of the search engine, indexing the production content. We asked for volunteers to act as our experts. We asked them to establish an area of expertise for themselves. We then had them identify the “top” 25 documents within that area of expertise, given a query that they suggested. This became our ideal set. We used that ideal set to measure precision and recall at 3, 10 and 25 results.
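The measures themselves are simple to compute once you have an ideal set. Here is a minimal sketch in Python - the document IDs are made up for illustration:

```python
def precision_at_k(results, ideal, k):
    """Fraction of the top-k results that come from the ideal set."""
    return sum(1 for doc_id in results[:k] if doc_id in ideal) / k

def recall_at_k(results, ideal, k):
    """Fraction of the ideal set found within the top-k results."""
    return sum(1 for doc_id in results[:k] if doc_id in ideal) / len(ideal)

# Example: an expert's ideal set versus the engine's ranked
# result list for that expert's query.
ideal = {"doc1", "doc2", "doc3", "doc4", "doc5"}
results = ["doc3", "doc9", "doc1", "doc8", "doc2",
           "doc7", "doc4", "doc6", "doc5", "doc0"]

for k in (3, 10):
    print(f"precision@{k} = {precision_at_k(results, ideal, k):.2f}, "
          f"recall@{k} = {recall_at_k(results, ideal, k):.2f}")
# precision@3 = 0.67, recall@3 = 0.40
# precision@10 = 0.50, recall@10 = 1.00
```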
We also established another way of measuring the impact of our changes: we created a Beta site and asked real users to tell us how we were doing - whether we had improved, declined, or stayed about the same.
From this testing we determined that we could improve relevancy, and that these improvements would be noticeable to the end user.
This process places an emphasis on iterative design. You develop the application in prototype form and test it with end users - have them try to accomplish tasks using the prototype. Once you see the pain points, change the prototype to address them and test again. Through a series of iterations you develop a product that solves real end users' problems in a simple and intuitive way.
One way to describe a usable system is that it is easy to learn, easy to use, easy to explain and hard to forget. Think of a tool, software or hardware, that you enjoy using - one that simply works well. Usability is a way of developing new tools that hit that sweet spot. It all comes back to measures: in order to make effective change, you have to know where you are, know the effect of your changes and have a target goal in mind. We target improvements in task completion, task time and user satisfaction as measured by the System Usability Scale, plus a reduction in wayfinding errors.
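The System Usability Scale, at least, has a fixed scoring rule: ten items answered on a 1-5 scale, folded into a 0-100 score. Here is a quick sketch of that calculation (the example answers are made up):

```python
def sus_score(responses):
    """Score one System Usability Scale questionnaire (0-100).

    `responses` is a list of ten answers on a 1-5 scale, in question
    order. Odd-numbered items are positively worded and contribute
    (answer - 1); even-numbered items are negatively worded and
    contribute (5 - answer). The sum is multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# Example: a fairly positive respondent.
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # 80.0
```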