Enterprise Search and Usability: 2008

Wednesday, November 19, 2008

Great article

This has nothing to do with search or usability, but it is the best description of the problems we have on Wall Street that I have seen yet.

http://www.portfolio.com/news-markets/national-news/portfolio/2008/11/11/The-End-of-Wall-Streets-Boom

Monday, November 17, 2008

I just had an experience with a feedback form that failed. I'd like to give you some background, discuss the failure and its impact on me, and challenge you to look at your own user experience and how you may be doing the same thing to your users.

I have AT&T as both my telephone and high speed broadband provider. They have a new product called Advanced TV that I am interested in. I'm currently a DirecTV / TIVO customer and am enthusiastic about both products. Advanced TV has some features that are interesting to me - they might even be enough to get me to leave my DirecTV / TIVO. So when I received an email that offered a chance to see Advanced TV in action, I was interested. I clicked on the link, only to find that the nearest store was a 100 mile round trip for me. I am interested, but not $20 in gas interested.

I determined to tell AT&T about this failure in marketing on their part. It seems like a simple enough thing to target their email - they know my address, they know the location of the store - how hard could it be to cross reference the two and eliminate me from the mailing? So, I went to the AT&T site to send them a quick feedback note. I had to do this as the email they sent did not have a reply address.
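To make the cross reference concrete, here is a minimal sketch of how a mailing list could be filtered by distance to the nearest store. It is only an illustration, not how AT&T's systems work: the function names, the customer and store records, the coordinates and the 25 mile cutoff are all assumptions made up for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def customers_near_a_store(customers, stores, max_miles=25.0):
    """Keep only customers who live within max_miles of at least one store."""
    return [
        c for c in customers
        if any(
            haversine_miles(c["lat"], c["lon"], s["lat"], s["lon"]) <= max_miles
            for s in stores
        )
    ]


# Hypothetical data: one demo store and two interested customers.
stores = [{"name": "demo store", "lat": 41.4993, "lon": -81.6944}]
customers = [
    {"name": "near", "lat": 41.48, "lon": -81.80},
    {"name": "far", "lat": 39.9612, "lon": -82.9988},
]
mailing_list = customers_near_a_store(customers, stores)
```

With a filter like this, only the "near" customer would receive the invitation, and the "far" customer would be spared the 100 mile round trip.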

First, can you spot the contact us link on this page?

It is in light grey text, on a white background, at the very bottom of the page. It is hard to spot. This may be intentional, as they have several very clear help sections on the page. This may be an attempt to drive traffic to the help pages, rather than the generic contact us area.

Once I found the contact us / feedback form, I opened the page to find what looks to be a nice easy form to fill in, set up as a wizard, starting with your subject line.

Entering a subject does not start a wizard. It instead sends you to a list of pre-selected subjects. This is where I started to become frustrated. None of these pre-selected subjects was related to my email. I tried several different terms and subject lines, and I opened all of the suggested subjects. Not one related to marketing, nor was there an "I have a different subject" option for the email.

In the end, I randomly selected a topic and gave them my feedback. It likely will not reach anyone interested in my feedback, and the marketing department will be wondering why their promotion is not working. They sent it to people who said they were interested in Advanced TV. Why is the conversion rate so low?

And the result of all this work on their part? I now regret having AT&T for my phone company and my Internet provider. I am concerned that I will not be able to find help when I have trouble with my service. I am no longer sure the features of Advanced TV are worth the trouble of changing television providers. Do I want to trade DirecTV's award-winning customer service for this?

Overall, an attempt to sell me on something I was interested in has left me with a less favorable impression of the brand.

Why did this happen? Because each interaction focused not on my needs as a customer but on their needs as a company. The company wants to get the word out about Advanced TV. Who cares if some people who get the email are unable to buy the product? The company wants my email routed to a specific department. Who cares if that adds time before we satisfy that customer?

We, as interaction designers and usability professionals, need to remember that the value the user derives from the experience needs to be paramount. When we forget that, we drive people away from our sites and away from our brand.

Monday, November 10, 2008

NEOUPA Event

FOR IMMEDIATE RELEASE
Contact:
Cathleen Zapata
President, NEOUPA
NEOUPA
P.O. Box 24503
Cleveland, Ohio 44124-9998
Phone: 440-320-1084
president@neoupa.org
www.neoupa.org
NEOUPA Celebrates World Usability Day in Cleveland, Ohio
Cleveland, OH -- November 1, 2008 -- NEOUPA, the local chapter of the global Usability Professionals’ Association, is celebrating World Usability Day Thursday, November 13 from 6-8:30pm at KeyBank’s Tiedeman offices by discussing how professionals in the community are infusing and advocating usability in the companies they work for and the Web work they do. The panel of presenters is from a variety of local organizations, including Ernst & Young, Cleveland Institute of Art, KeyBank, Progressive Insurance, American Greetings, and more. Additional details and registration can be found at www.neoupa.org.
World Usability Day was founded in 2005 as an initiative of the Usability Professionals' Association to ensure that services and products important to human life are easier to access and simpler to use. Each year, on the second Thursday of November, over 225 events are organized in over 40 countries around the world to raise awareness for the general public, and to train professionals in the tools and issues central to good usability research, development and practice.
"Web users today are task-oriented. They don’t have time to waste and they’re on a mission to get done what they’re trying to do," says Cathleen Zapata, President of NEOUPA, "Usability is the fundamental foundation of creating an outstanding customer experience that meets both customer needs and business goals."
The event is FREE but registration is required at www.neoupa.org. Food, networking, knowledge-sharing and the chance to win over $300 in prize giveaways are all part of the event.
This year’s platinum sponsor is Brulant, Inc./Rosetta, with additional sponsorship by SMI: Eye & Gaze and Progressive Insurance. The event is being held at KeyBank, 4910 Tiedeman Road, Brooklyn, Ohio. Registration is required at www.neoupa.org.
About NEOUPA
NEOUPA is the official Northeast Ohio chapter of the international Usability Professionals' Association. NEOUPA was started to educate, motivate and promote usability throughout Northeast Ohio for individuals interested in, involved in or responsible for Websites, applications, software or any other type of user interface where usability is a key to success.
For information: http://www.neoupa.org or
Contact: president@neoupa.org
Phone: 440-320-1084

Wednesday, October 29, 2008

ASIS&T 2008 - Connie Yowell Plenary Session

In the second plenary session, Connie Yowell discussed the MacArthur Foundation's $50 million digital media and learning initiative. This was a five year look at how digital technologies are changing the way young people learn, play, socialize and participate in civic life.

Typically, children list video game or PC activities as their most popular activities. She demonstrated the great number of variables children need to manipulate to play Pokémon. She showed a video of kids playing Pokémon together, and the activities they do around Pokémon - blogs, web searching, comics, and fan fiction.

Some statistics:
97% of kids play video games
60% of teens use a computer
72% use instant messaging
50% have created media content
33% have shared content via the internet

Ethnographic study
700 participants
25 researchers
5000 hours of observation
www.futuresoflearning.org

Types of participation
1. Friendship driven participation
2. Interest driven participation
These interactions are the same as "real life" interactions. It is not a place for adults. It is not a place for strangers. Young people do not interact with strangers in social networking.

Interest driven networks are highly social. The worries about these interactions being socially isolating are incorrect. Social interactions tend to be very individual based. They tend to be very participatory and productive. Young people use these networks to create content. The networks are peer based. This is where adults and strangers interact with young people, because of a shared interest. These networks have converging media, so TV works with PC works with books, not competing. For the young people it is about the content, not the technology. So they are following their interests across different media. The learning is networked.

Learning is happening outside of school. School becomes another node in their learning, not the primary node. There is a rich array of learning opportunities, with low barriers to participation and production.

Some of the pitfalls are around developing expertise in areas that are harmful, and there is strong commercial influence. There is a concern that each kid will have a different set of skills, which is worse than the digital divide, as the set of skills needed is growing. Experiences are also fragmented: not every kid has the same set of experiences, creating different behaviors. Each experience stands on its own, and there is not a strong interrelation between one experience and the next.

She covered four core skills in detail, out of eleven identified for the future, over and above traditional literacy:

Performance: the ability to adopt alternative identities for the purpose of improvisation and discovery. People will need to be able to adopt roles to explore and understand environments instead of the traditional positioning.

Appropriation: the ability to meaningfully sample and remix media content. Otherwise known as plagiarism or piracy. Think instead of the way Shakespeare reused or remixed classic tales in new ways with his plays. Digital media makes this easier.

Collective intelligence: ability to pool knowledge and compare notes with others toward a common goal. Everyone knows something, but no one knows everything. This is about the ability to tap the community at the right time, and to get answers from everyone.

Transmedia Navigation: the ability to follow the flow of stories and information across multiple modalities and different forms of media.

How do people pick up these skills, especially as schools ban the use of these skills?

Places to stay up to date on this work and the findings:
www.spotlight.macfound.org

www.digitallearning.org

www.holymeatballs.org

www.newmedialiteracy.org

www.idiit.edu/thinkeringspaces

ASIS&T 2008 - The effect of page context on magazine image categorization.

Stina Westman of the Helsinki University of Technology presented an image categorization study she completed. The work examined contextual factors in categorization, specifically the effect of page context on image categorization. Prior research has shown that people evaluate images with high level semantic descriptors, and that people describe image content on interpretational, as opposed to perceptual, factors. The goal of the study was to determine whether context has an effect on categorization and, if so, how strong that effect is. They worked with professional magazine image archivists and split them into two groups, one with context, one without. The participants began with a free sorting exercise followed by reassignment to multiple categories.

The number of categories, the time taken to sort the photos and the number of times an image was placed into a category were all not significantly different. When you look at the types of categories that were created, there was a significant difference. Added context resulted in more categories based on theme and story, versus categories based on function or on the objects in the photo. When context was present, people were grouped by fictional or real, and nonliving items were grouped by symbolic, object or scenes. Without context, people were grouped by posed photos versus action photos (context) and nonliving items were grouped by interiors, objects, or scene. Without context, images were more often set into multifaceted categorization, with more hierarchy to the structure.

Text was seen to anchor the image, explaining the image and why it was published, or the text was seen as elaborating or extending the image. This means we can influence how archivists categorize images, and so can determine how we want an image to be categorized. This implies that text data mining can be applied to image categorization through automated software.

ASIS&T 2008 - Revisiting search task difficulty: behavioral and individual difference measures

Jacek Gwizdka of Rutgers University presented a series of studies he completed on task difficulty and search. The dependent variable for the level of difficulty was relevance, so the more documents found, the higher the task difficulty. The strongest predictor was the number of pages visited, so the more documents people opened, the harder they found search to be. This seems to follow my intuition, that if we can return the best documents higher in the results and give the user a clear good document, they will find search easier, and better.

He found that there was a correlation between objective and subjective task difficulty. He found that subjectively difficult tasks were related to more search actions, greater than objective difficulty. So if people think the task will be hard, they do more work, regardless of the objective assessment of difficulty. He also found that better search task outcomes are associated with lower levels of objective difficulty. In terms of the efficiency of the systems, people are slower on more complex systems. They find more complex systems to have a higher level of subjective difficulty.

ASIS&T 2008 - Values and Information

This panel discussion was on the value of information.
Discussion covered the difference of value of real objects versus digital objects, such as pictures, books and music. Generally the digital objects were seen as easier to have, but not as valuable.

Another study ran a value survey against three different groups, one in corporate, one in academia and one in government. This survey showed significant differences in the values of curiosity, loyalty, and obedience. The studies show a difference in adherence to values by different organizations. Differences in organizations lead to different values, which in turn lead to different priorities and potentially different designs.

Tuesday, October 28, 2008

ASIS&T 2008 - Google on-line marketing challenge: A multidisciplinary global teaching and learning initiative using sponsored search

Panel with Bernard Jansen of Pennsylvania State University, Mark Rosso of North Carolina Central University, Dan Russell of Google, Brian Detlor of McMaster University.

Jim Jansen jjansen@acm.org
Mark Rosso mrosso@nccu.edu
Dan Russell drussell@google.com
Brian Detlor detlor@mcmaster.edu

The Power of search and the web - Jim Jansen's slides, Mark Rosso presenting.

Search drives on-line activity. Search drives over 5 billion monthly queries in the US. People are spending less time with their family and the TV, and getting less sleep.

Looking at the search marketplace, Google has over 60% of the market, with no other company having even half as much market share. On-line marketing is a large business, and keyword advertising is the fastest growing advertising business. Google earned $16 billion from advertising, mostly keyword advertising. The sponsored links are the main drivers of revenue and are the business model of all search engines. The key differentiator is that it is targeted, pull aligned and has the lowest acquisition cost of any advertising.

The Google on-line marketing challenge was a worldwide project. The challenge consisted of 1) Register the class 2) Recruit the client 3) Students develop a pre-campaign proposal 4) Students run a 3 week AdWords campaign 5) Students develop and submit a post campaign summary 6) Google judges teams on an algorithm to narrow to 150 teams 7) Academics judge teams on written reports 8) Top ten teams fly to Google headquarters.

A campaign consists of one or more ad groups. Ad groups should be based on a theme. Each ad group has a set of ads and keywords that generate the ad. Each keyword has a bid: how much you are willing to pay for that keyword. For each keyword you supply a matching criterion - exact, partial, broad, negative.
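The campaign structure described here can be sketched as a small data model. This is only an illustration of the hierarchy from the talk (campaign, ad groups, ads, keywords with a bid and a match type); it is not the AdWords API, and the class names, the example law firm ad group and the bid amounts are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class MatchType(Enum):
    # The four matching criteria named in the session notes.
    EXACT = "exact"
    PARTIAL = "partial"
    BROAD = "broad"
    NEGATIVE = "negative"


@dataclass
class Keyword:
    text: str
    bid: float  # the most you are willing to pay for a click on this keyword
    match_type: MatchType


@dataclass
class AdGroup:
    theme: str  # each ad group should be built around one theme
    ads: List[str] = field(default_factory=list)
    keywords: List[Keyword] = field(default_factory=list)

    def triggering_keywords(self) -> List[Keyword]:
        """Keywords that can trigger an ad (negative keywords only exclude)."""
        return [k for k in self.keywords if k.match_type is not MatchType.NEGATIVE]


@dataclass
class Campaign:
    name: str
    ad_groups: List[AdGroup] = field(default_factory=list)


# Hypothetical example: one themed ad group for a small law firm.
wills = AdGroup(
    theme="wills and estates",
    ads=["Need a will? Talk to a local attorney."],
    keywords=[
        Keyword("estate planning attorney", 1.50, MatchType.EXACT),
        Keyword("write a will", 0.75, MatchType.BROAD),
        Keyword("free", 0.0, MatchType.NEGATIVE),
    ],
)
campaign = Campaign(name="law firm challenge campaign", ad_groups=[wills])
```

Keeping the negative keywords in the same list as the others mirrors the way the notes describe them, as one more matching criterion, while the helper method separates the keywords that bring people in from the ones that keep people out.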

Whether an ad displays, its rank and its cost per click depend upon: keywords in query, ad title and text must be relevant to query, content of landing page must be relevant to query, past click through rate, bid on keyword, other configurable factors.

The Google on-line marketing challenge - experiences from the course. - Mark Rosso presenting.

Public school, evening MBA program; the challenge was completed as part of a Management Information Systems course. The class consisted of 10 students, so two teams entered the challenge. Clients were a small law firm and an African American cultural center. The class entered the challenge as an experiential learning opportunity and to answer common criticisms of MBA IS courses: too theoretical and not adequately conveying "know how". Students took from this class real world experience of how on-line advertising will work, or not work. The on-line advertising did not really work for the cultural center, as they were not selling anything on-line and did not have a way to measure the value from a click. It did bring home the need to coordinate their projects with the information systems function. Both teams conflicted with client scheduled web site outages. It also brought home how the design of the web site impacted advertising costs. The cultural center had a single page design that kept them from targeting a page for the ad. The law firm did not have this same challenge.

The Google on-line marketing Challenge: an outside commentator's opinion. - Brian Detlor presenting.

Three main wins - Students love the challenge, and they get job offers. Teachers love it: students are engaged and there is real world evaluation of student work. Participating businesses love it: they get free work and a chance to screen potential employees.

Brian's school does other experiential learning courses, but not this specific one. He sees some challenges, but thinks it is worth while.

Google on-line marketing challenge: a view from inside - Daniel Russell (the search quality, User Experience Research, Ads Quality & Search Education person)

www.gomcha.com is a social networking site for the challenge, a place for the students to continue to work together. What Google wanted to do was teach students the ins and outs of marketing in the internet age, giving them real world data on what is good and what is bad. They should also understand how web readers look at and understand web pages. Plus they begin to understand the consumer versus reader psychology. Google wanted to inject these ideas, along with how large the search marketing market is. As an example, specific keywords are better than generic keywords. Negative keywords are important to learn as well - when not to bring people in. Ad quality is about the same as Search quality. Having good sponsored links will cause users to see them as valuable. Search query length in the US is typically 2 words. Most queries are multiple words.

Web Site Optimizer allows you to compare multiple versions of pages. This allows you to understand and optimize your conversion rate. You can do this with only text changes, or larger site changes. It is a nice tool for learning how your site works; our EY.com team should leverage this.
The impact of the study was that the students left with a better understanding of the web ecostructure.

Advertising & Awareness with Sponsored Search - donturn@ischool.utexas.edu.

An alternate study. It examined the idea of advertising for awareness of the UT iSchool. The ads were for graduate studies, not driven by click through revenues. They developed an ad campaign to evaluate the Google Ads application. The goal was to build on models of web information seeking as part of the larger web experience.

They planned global and local search campaigns using statement and question ad copy for comparison. About 208,000 impressions, 200,000 of them global. 160 click through rate (CTR). Their average position was 4th. The best click through rate was on the local campaign. The statement ad copy outperformed the question ad copy. The campaigns gave them a large dataset from sponsored search services and tools. Starting to understand more complex search tasks is of growing importance.

The presenter gave an example of how words matter. Changing the word on the link from "Sign up" to "Start using" caused a 5x improvement in conversion. This has interesting connotations for our work on site design and metrics. We need to track button clicks, and be able to change things fairly quickly, so we can determine which links work and which do not.
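As a rough sketch of what tracking button clicks could look like, the snippet below counts views and clicks per link label and compares conversion rates. The class name, the method names and the traffic numbers are made up for illustration; they are not the figures from the talk, and a real site would record these events in its analytics tool rather than in memory.

```python
from collections import Counter


class LinkTracker:
    """Counts how often each link label is shown and how often it is clicked."""

    def __init__(self):
        self.views = Counter()
        self.clicks = Counter()

    def record_view(self, label, count=1):
        self.views[label] += count

    def record_click(self, label, count=1):
        self.clicks[label] += count

    def conversion_rate(self, label):
        """Clicks divided by views; 0.0 when the label was never shown."""
        views = self.views[label]
        return self.clicks[label] / views if views else 0.0

    def lift(self, baseline, variant):
        """How many times better the variant converts than the baseline."""
        base = self.conversion_rate(baseline)
        return self.conversion_rate(variant) / base if base else float("inf")


# Hypothetical traffic: each label is shown to 1,000 visitors.
tracker = LinkTracker()
tracker.record_view("Sign up", 1000)
tracker.record_click("Sign up", 10)
tracker.record_view("Start using", 1000)
tracker.record_click("Start using", 50)
improvement = tracker.lift("Sign up", "Start using")
```

With these invented numbers the second label converts about five times as well as the first, which is the kind of comparison we would want to be able to run quickly whenever link wording changes.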

ASIS&T 2008 - Better to Organize Personal Information by Folders or by Tags? : The devil is in the details.

The presentation was by Andrea Civan of the University of Washington.

Examining the two different models of organizing information: hierarchical folders and tagging with labels. Does placing items in folders versus tagging them make any difference in the ability to find things? The faculty advisor on this project was William Jones. The study had people use two real world systems for a period of time, then had them relate their experiences back to the project. It used Hotmail and Gmail, as they are two different tools achieving the same function via two different mechanisms.
The study began with an initial interview to explore participants' use of folders and tags in the past. Two topics were selected, each a 25 item collection. Each day the participants received 5 articles in their email, in each product. They then spent 5-10 minutes organizing the information. They self reported on the experience via email, and the project team also captured data. After the project the team gathered recall details, had participants re-find 5 articles, and had them sketch their collection. Each participant was exposed to both environments, serially.

Results - there were a number of similarities: in terms of retrieval performance, recall, time to retrieve, number of places looked and the organizational schemes, both systems worked about equally well. In both conditions the participants were unable to express the complexity of their internal map of the information, as expressed in the sketch. In general, there were multiple categorization activities going on in each environment. Some participants had a workflow orientation, others had a hierarchical orientation.
Tagging required less cognitive effort. Participants found the tagging to be easy, whereas they felt a strong need to be choosy with the folder names. Tagging required more physical effort. Tags needed to be applied over and over again, whereas the folder structure only needed to be placed once. Folders made it easy to hide information, easy to move items out of the in-box. Tagging was seen as more cluttered, as everything stayed in the in-box. Folders were better for systematic search: each time, participants could look through each folder and be confident that they had correctly cleared a folder. Multiple tags were applied to an item, making it harder to do a systematic search, but easier for serendipitous finding.

The conclusion is that there is no clear winner. Each structure offers tradeoffs. The implications: don't leave the good stuff behind. Filter for untagged items, support hierarchy. Support collections of tags that are content and format oriented.
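The implications can be sketched in a few lines: an item that can carry both a single folder path and any number of tags, plus a filter for untagged items. This is a hypothetical model for illustration only; the names and the sample items are assumptions, and it is not how Hotmail or Gmail store mail.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Item:
    title: str
    folder: Optional[str] = None  # one place in a hierarchy, e.g. "Health/Nutrition"
    tags: Set[str] = field(default_factory=set)  # any number of labels


def untagged(items: List[Item]) -> List[Item]:
    """Items with no tags at all, so nothing tagged-by-omission gets lost."""
    return [i for i in items if not i.tags]


def in_folder(items: List[Item], path: str) -> List[Item]:
    """Items filed in the folder or any of its subfolders (systematic search)."""
    return [
        i for i in items
        if i.folder is not None and (i.folder == path or i.folder.startswith(path + "/"))
    ]


def with_tag(items: List[Item], tag: str) -> List[Item]:
    """Items carrying the tag, wherever they are filed (serendipitous finding)."""
    return [i for i in items if tag in i.tags]


# Hypothetical collection mixing content tags ("diet") and format tags ("pdf").
inbox = [
    Item("Vitamin D study", folder="Health/Nutrition", tags={"diet", "pdf"}),
    Item("Marathon plan", folder="Health/Exercise", tags={"running"}),
    Item("Unsorted newsletter"),
]
```

Here the hierarchy and the tags coexist, and the untagged filter gives the user a way to find whatever has not been organized yet.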

ASIS&T 2008 - Social Tagging in China and the USA: A comparative study

Chen Xu and Heting Chu of Long Island University

Tagging usage in the USA: 28% of on-line Americans have tagged on the internet.

The study used two web sites to compare tagging, both the action of tagging and the interface used to tag, in the US and China. del.icio.us and 365Key were the two sites studied. 365Key is the first, and most popular, tagging site in China. The study was done in January 2008. It gathered site visits and extracted the tags from the News category. The researchers specifically narrowed the study to tags used more than 10 times.
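The "more than 10 times" cutoff is easy to picture with a small sketch. The function below counts how often each tag appears across a set of bookmarks and keeps only the frequent ones. It is an assumption about how such a filter might be written, not the researchers' actual method, and the sample bookmarks are invented.

```python
from collections import Counter


def frequent_tags(bookmarks, more_than=10):
    """Return {tag: count} for tags used more than `more_than` times.

    `bookmarks` is a list of tag lists, one list per bookmarked record.
    """
    counts = Counter(tag for tags in bookmarks for tag in tags)
    return {tag: n for tag, n in counts.items() if n > more_than}


# Invented sample: "news" appears 12 times, "china" exactly 10, "olympics" twice.
sample = (
    [["news", "china"]] * 10
    + [["news", "olympics"]] * 2
)
kept = frequent_tags(sample)
```

In this invented sample only "news" survives the cutoff, since "china" is used exactly 10 times rather than more than 10.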

The sites are similar in functionality, although 365Key adds two fields: Comments and Evaluation. Evaluation gives the ability to "star" a URL. 365Key also adds two metrics: viewing frequency and bookmarking frequency. In terms of tag management, Delicious gives users more freedom to create tags, ability to bundle tags and to rename or delete. Tag bundle allows you to create a second level in your tagging, creating a hierarchy. 365Key offers predefined categories, meaning their tags tend to cluster around the predefined categories.

Users in Del.icio.us create 5 tags per record, while 365Key users create about 3 tags per record. This means that Del.icio.us will have more unique tags, with less consensus. When we look at the tag frequency distribution of the two sites, they are very similar. When we look at the parts of speech of the tags, they are mostly nouns in both sites. 365Key showed more foreign terms for tags, specifically English. This is seen as an indication of English as the "world language". Del.icio.us has more term variants, and more "net words".

When looking at tags with the same meaning, we see that there are a number of terms that appear in both sites, but the ranking of the terms is significantly different. This is seen as showing the different issues of importance amongst the different user populations. When looking at the top tags in each site, we see cultural differences.

The main difference between the two sites is in the functionality. 365Key is more restrained, with defined categories. Del.icio.us offers more freedom to the users. This difference in functionality is reflected in the tags. Interestingly, there were more tags applied in 365Key than in Del.icio.us. This is attributed to the time period when they gathered the data, which was not an active time in the US. There is strong clustering around the predefined terms in 365Key.

Monday, October 27, 2008

ASIS&T 2008 - Proflection in Cyberspace

Gary Marchionini
UNC School of Information and Library Science

We have a number of identities in cyberspace, some under our control, some not. None of these are actually me; they are reflections of me. My reflections are intentional and ambient. Intentional are reviews, friending, replies and so on. The ambient are vendor profiles, citations or credit ratings. There are also projections of me - what I put forth, my intentional and ambient projections. The intentional are web pages, Facebook profile or email. The ambient are click streams, records of purchase, sensor streams.

Together, the projection and the reflection make up my "proflection".

We should think through how to control these items, how to protect ourselves. What kind of "digital condom" do we need to protect and control our proflection on the web? These tools might be monitoring tools, might be warnings - might just be awareness tools.

ASIS&T 2008 - 3 Digital archiving myths

Cathy Marshall
Microsoft Research

Three digital archiving myths - keep everything, kids will know what to do, it is all a matter of repositories and access control.

Storage is cheap. We should keep everything.
Digital items are very small, so more material can be stored. Storage is cheap, but human attention is not and it is not legally, emotionally or intellectually viable to keep everything. It is easier to keep than to cull, but it is also easier to lose items.
Flickr has 3 billion photos, Facebook 5 billion photos - this is overwhelming.
Much of what we keep today does not have much value - benign neglect has its virtues in how it automatically culls our stuff.

Today's kids are all-digital, they will know what to do.
Kids are fearless.
Kids are still likely to rely on a family to handle archiving.
They are better at creating, but they are no better at keeping stuff around.

Digital stuff is not just distributed over multiple stores, it is also distributed over time and over people - and system design has to allow for this.

She ran out of time before completing her exploration of all 3 myths.

ASIS&T 2008 - Old People Facebook Disasters

Fred Stutzman, SILS UNC - Chapel Hill

Fred has been studying privacy for years, and has been a bit concerned about the news coverage. The narrative has been about the loss of privacy on the web. He has been doing studies on privacy and usage of the web. When we look at people who have been using Facebook for more than 3 years, they are more likely to make their profiles friends only, and to have set more privacy controls.

As older people join Facebook, they are having to learn these privacy norms that earlier users have already learned. "In the future everyone will be anonymous for 15 minutes."

Step away from making broad based generalized assumptions, and towards more of a nuanced understanding of privacy.

ASIS&T 2008 - Tagging as a Communication Device: Every tag cloud has a silver lining

Heather D. Pfeiffer is the moderator of the panel.

The panel will present on, and discuss:

Outcome and processing of tagging
Relationship of tagging to social networking
Connection to language and its constructs
Tag use in Ontology building

The panel consists of:
Heather Pfeiffer - New Mexico State University
Emma Tonkin - University of Bath
David Millen - IBM
Mark Lindner - University of Illinois
Margaret Kipp - Long Island University

Tagging as Metadata
Heather Pfeiffer
Knowledge in language - explaining how we can communicate knowledge through Syntax, Semantics, Pragmatics or context.
Syntax and semantics are the two parts of language, whereas pragmatics is the use of the language.
Conceptualization of Ontology.
Concepts on the surface are labels or terms.
Underneath have a meaning.
Meaning grows out of the context of how the concepts are used.
Relationships are built between concepts.
Building a hierarchical structure called an ontology.
Tags are concepts which need context to have meaning. Do we have a change in our language?
This presentation really did not talk to tagging as metadata, but instead talked to the use of language. Not as much as I would have liked.

Ten minutes of language development
Emma Tonkin

Started by talking to the development of language, looking at the etymology of country names. Village = Canada, Little Venice = Venezuela.
The three step ontology development plan - identify concepts, label the concepts, identify relations, document then use them.
This assumes perfect accuracy in the identification and labeling processes. Realistically, collaboratively identifying and discussing something is hard, even something as simple as location.

Labelling locations, some positions are important temporarily or permanently. Place versus space, position versus location.
Simulation design - concepts are physical positions in space, labels are the names for the positions, Agents negotiate labels for shared concepts, sharing.

Example of a pharmacy versus drugstores. Talking towards the differences in languages, the possible ways of naming things, coming to consensus.

Discussed mainly how the language will be different over time, over place and as we change our understanding of the world.

Patterns of Collaborative tagging in a large corporate environment
David Millen

Social bookmarking behind the firewall is different, in that the links are behind a firewall, which can be linked up with enterprise search. This leads to social discovery and search. We believe there will be a difference in use depending on the goal - community browsing, personal search, explicit search.

There are multiple paths to information. He showed the amount of click through on links, search and so on for tags. Search has a high percentage of click through, as does my tags. Community tags have a lower percentage of tag click through as compared to views.

Focus of this study: three groups - bookmarking behavior over time, different sharing (public versus private), focus on internal versus external. Looking at participation rates showed a continued increase in the number of new users, with ~50% of unique posters over time. Unique posters are the number of people adding new terms to the tags versus people using existing tags. So about half the people actually add new items and new tags, the other half just use existing ones.

More people tag for internal resources, than external resources. They classified tags by topic, content and owner. Each user group used group based tags on both the internet and the intranet. Low percentage of private bookmarks. The private bookmarks had a very low number of tags, while the public bookmarks had a higher, closer to public internet level of tags.

Roles in Social Tagging: Publishers, Evangelists, Leaders.

The emerging role as the evangelist - this is a person who is trying to raise visibility of a topic through the use of tags. A self serving use, to draw attention to their own personal material.

Publishers are using tags to drive traffic to a particular set of resources.

Small team leaders are using tags to share resources within a team, so that the team can find resources. This is an additional, not the only, method for doing this and it is done by convention, not through system features.

Tag similarity in use (within enterprise) - terms that we expect to cluster together do not when actually looked at. When they asked users why they used a different term for the tag, people fell into different groups - some agreed they should have used the alternative, some strongly felt that the terms were different, others were using them to drive search results, or they were thinking in terms of future use for finding.

Tagging games are used to induce tag creation within the organization. Wordle.net has some interesting tag clouds.

Integrating tagging: tagging as integration
Mark Lindner

Is talking about the difference of linguistics in tagging.
Quick overview of integrationism.
Community as macrosocial.
Towards integration.
Tagging as integration.
Integrationism.
Theory of linguistics and communication.
Opposed to segregation accounts.
Spoke to tagging as an integration process that gives us the ability to bring together language. I did not understand how this presentation fit in with the overall panel.

Communication in Tagging: Collaborative Classification
Margaret Kipp

Tagging as a collaborative classification.
Tagging is often examined as a form of collaborative classification.
Multiple studies have examined the consensus shown by frequency graphs.
Studies have also looked at tagging as a form of user classification.
Or as personal information management.
The majority of tags are subject related. The other tag categories are affective, time and task tags, project tags, conference tags.
Most frequently used non subject related tags are : fun, toread.
Tagging as a discussion of about-ness.
The frequency graphs of tags show convergence, but detailed examination shows that everyone is using very different tags, with a different sense of agreement.
Tagging is also used as a form of review or criticism. You see tags that are mini one word reviews of the site, like boring, stupid, interesting. This is an interesting use of tags, but is not a good candidate for information retrieval. Mixed with other terms and other uses, they may be more fruitful.

ASIS&T 2008 - understanding and supporting multi-session web tasks

Bonnie MacKay of Dalhousie University is presenting on a study she completed as part of her PhD program.

Multi-session tasks are goal based, require more than one session to complete, and may be expected or unexpected. An example of a multi-session task was the scheduling of the trip to ASIS&T.

The objectives of the research were to identify the characteristics of multi-session tasks, to understand how users perform multi-session tasks, to determine tools or processes that might be a better way of completing these tasks, and then to develop three prototype tools to improve them. The study was a diary study and a field study.

In 2006 Melanie Keller gave a similar talk on this same topic. I am hoping that this will expand upon her thoughts and findings.

The study resulted in three main suggestions for browser tools for multi-session tasks.
1) Tool should keep a list of current multi-session tasks.
2) A reminder to support multi-tasking during web sessions.
3) Support multi-session tasks between sessions.

They created three prototype tools within Firefox.
First tool keeps track of your task, putting a reminder of the task you are working on in the tool bar.
Second tool automatically manages pages between sessions, highlighting the ones that are related to the task at hand in bright yellow.
The third tool added a to-do list with a reminder function to the browser. This third version included an archive function ( stores the data and the forms for future use), a tool to manage pages between sessions, a landmark feature to take the user to where they left off in the form.

The next step was to run a study with the prototypes, having participants use these prototypes. The study showed that the prototypes helped users stay on task during multi-session tasks. Interestingly, users were willing to trade off ease of use for features. The first prototype was the easiest to use, but prototype three was highest rated, due to the additional features of the third prototype.

ASIS&T 2008 - Concept Theory and the role of conceptual coherence in assessments of similarity

Louise Spiteri of Dalhousie University.

The presentation is on how we determine similarity and difference. She began with a dog and cat conversation. From the perspective of an alien, the clear differences we can see between a cat and a dog are not as clear without the same frame of reference.

The notion of similarity or likeness underlies most approaches used in the design of bibliographic classification systems. In its simplest sense, classification is the arranging of things according to likeness and unlikeness.

The presentation now goes into the different classification systems and theories, discussing the differences between the theories, and the drawbacks of each.

She has been researching better ways of applying attributes to Folksonomy by using facets that are salient to the end users. These facets would be related to the original tags, not a predetermined set of tags. This was the most interesting and applicable point in the presentation, and it came as an offhand comment without any further detail.

Sunday, October 26, 2008

ASIS&T 2008 - Plenary Session

Speakers are Genevieve Bell, Howard Rheingold, Andrew Keen.

Genevieve Bell is an anthropologist who works for Intel, on studies on how people use technology.

The many futures of the internet. The internet is more than just the technology. Democracy, transparency, openness and accessibility of all information are cultural values, but they are not everyone's values - nor are they any one area's or location's values. They are unique to the internet.
We are at a pivot point of technology, as it moves off the PC and into the mobile area as well as consoles, and so on. These changes are impacting what can be done, and what we want to do with the internet. Applications are becoming more transactional, not immersive.

Interesting story on internet use in Africa. She spoke to a woman with no power, no PC and no mobile phone who used the internet every day. She had her son come in every morning, tell her about her incoming email and take her outgoing email. He then went and sent the email to her recipients. So even though she never used the internet herself, she uses it daily.
Bulk of the TV viewing on PC is done by women between 25 - 45, not the typical vision of a technology first adopter.

There are a number of examples she gives that demonstrate the new internet. In particular, there are more Chinese people on the net now than American. This will never change, given birth rates. This means English is no longer the dominant language, which raises questions for translations as well as culture. There are a lot of issues that exist in Chinese and Hindi versus English, particularly as Chinese and Hindi are more poetic languages where words are often used for multiple meanings depending on the context, the simile, or the metaphor. Think Cockney Rhyming slang.
Infrastructure has an extraordinary impact on how people use and experience the internet. When people have the same speed up and down, they tend to submit more; when the speed down is faster than the speed up, they tend to consume more. She gave the example of iPlayer in the UK, which single-handedly degraded the internet for the entire UK due to demand for IP TV from the BBC.

Talked to several different cultural clashes over application of technology, who wants to be in the conversation, what the conversation means.

Government, technology, society and religion all intersect in the application of internet and technology.

The things that worry us about technology have changed. We are now more worried about authorship, ownership, reputation, authenticity, access, digital literacy.

First challenge is there may never have been a single internet, but looking forward there are multiple different internets in the future - differentiated by access method, country, desires.

Second challenge is that there are people trying to get unplugged from technology. People consciously choosing to be disconnected.

Howard Rheingold begins the discussion response to Genevieve's presentation. We need to teach our children how to access content, and tell what content is valuable. Knowledge transfer is not about taking notes and spitting the notes back out on the test, but is instead about learning what you need to know to accomplish a task and survive in the world. Note taking doesn't automatically create that knowledge transfer.

Andrew Keen gives a response to Genevieve's presentation. He wonders how Intel wins on the basis of the information she presented. The internet is a philosophical movement. A book by Fred Turner called From Counterculture to Cyberculture lays out the philosophical movement of the internet and its ideology.

The internet is not the real world. The real world is increasingly ugly. Battles are brewing on the internet, as an example the Chinese versus English internet. The real world is worse: people are losing their lives, losing their freedoms and losing their livelihoods in the real world, and the internet is not helping.

Digital Fascism - similar to the industrial revolution, there are people who are disenfranchised by the information revolution. What will happen with these people?

Saturday, October 25, 2008

ICKM 2008 - A taxonomy model for a strategic co-branding position

Kuan-Chi Chang - graduate student Tamkang University
Discussed a taxonomy model he created to classify different co-branding situations.
Had objective indicators and subjective indicators for evaluating the taxonomic term to be applied to the co-branding M&A.

The discussion around how to leverage the taxonomy information to determine in advance where culture and business factors are likely to cause a failure of the merger was more interesting, in my opinion, than the presentation.

ICKM 2008 - Search Fallacies

Jay Ven Eman - CEO of Access Innovations.
www.accessinn.com

Search - doesn't work!
People are spending more time looking for things, and less time analyzing the results of that search.

Mismatch between search software, audience and contents. Lacks the context of the environment.

The Holy Grail of computer science is understanding human language.

Taxonomic strategy can save search.

Taxonomy is like a map, giving the latitude and longitude, or like the Rosetta Stone.
Search is like a treasure map, fun - but not always accurate.

Taxonomies in action
www.mediasleuth.com
www.ask.com

Information strategy
User needs
Business drivers
Information flows
Metadata strategy
Taxonomy
Indexing
Structural elements
Promotion, advertising, training
Maintenance, upkeep

Information strategy must be done first!
Then shop for the search software.

Select search software with the features and functions that will drive your business.

ICKM 2008 - Integrating Folksonomies into cultural heritage digital collections: the challenges and opportunities of Web 2.0

The speaker was Daniel Gelaw Alemneh of the University of Texas.

Daniel spent time defining Folksonomy and the advantages and disadvantages of Folksonomy. I've asked Daniel for his slides, and will incorporate his points if / when he sends them to me.

The question Daniel wanted to explore was "How do we work to bring together Folksonomy and taxonomy?" The concern of the classic taxonomist is that the weaknesses of Folksonomy will infect our taxonomies.

There are examples of libraries that have embraced Folksonomy, such as the PennTags system at the University of Pennsylvania. This is a library tagging system. Another example is the MBooks Collection builder, the University of Michigan's interface to allow students to create their own collections of books.

Commercial examples are Bibliocommons, a social discovery system for libraries, and CiteULike, a social bookmarking site for academic references.

If we have these examples, can we be inspired by their work to better integrate folksonomies in our taxonomies?

We should also look at external tools that exploit the structure of the Folksonomy - FolkRank: a ranking algorithm for folksonomies.

ICKM 2008 - Open Source tools for Knowledge Management

Study was completed by Anne Gregory and Dinesh Rathi. Dinesh Rathi presented the results.

They started specifically looking at small non-profit organizations. The challenges of knowledge management are the same in these organizations, but they have smaller budgets and smaller staff. The study worked with a single small NPO. They began with a knowledge audit to identify the existing knowledge and processes. The audit was done with an open-ended interview to gather the information, and it identified significant gaps.

The organization lacked the organization and storage of digital and paper based documents, past activity details, any tools to organize dataset or sources. They lacked details on their volunteers, their donors, or a way to get details on the particular events. No past knowledge of process for permits.

Two components to KM solutions :
1) Physical Organization and storage system
2) Digital organization and storage system.

The team offered open source products, because there is no budget and no physical space.

Four key pieces of identifying information are required to be associated with each document.

Topic
Event, or type of event used for
Format
Title
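
As a minimal sketch, the four identifying fields could be captured as a small record type that reports which fields are still blank. The class name, the field names and the sample values are my own illustration, not something shown in the talk.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DocumentRecord:
    """The four identifying fields from the talk; names are hypothetical."""
    topic: str
    event: str        # event, or type of event, the document was used for
    doc_format: str   # e.g. paper, PDF, spreadsheet
    title: str

    def missing_fields(self):
        """Return the names of any identifying fields left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Made-up example: the event was not filled in.
record = DocumentRecord(topic="Permits", event="", doc_format="PDF",
                        title="Street fair permit checklist")
missing = record.missing_fields()
print(missing)
```

A check like this could run before a document is listed in the wiki, so that every entry carries all four fields.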

Proposed a technology and tools based on:

Wiki
Provide an excellent means of organizing and sharing information.
Openness and flexibility of a Wiki helps with adding and editing the new content.
Supports version control.
Becomes the electronic list of all physical documents.
Centralized location for links to resources, FAQ, event planning details, and statistics.

Collaborative Writing and Data Manipulation tools
Google Docs and Google Spreadsheet.
Centralized single storage point.
Easy, secure access to documents.
Anywhere and anytime access.

On-line Bookmarking
On-line is a vital source of information.
Currently relevant website links reside on individual volunteers' computers.
Moving to del.icio.us allows for sharing on-line tools and sources of information.
Bookmarks could be saved with consistent tagging.

On-line Multimedia Solution
Group takes lots of photos and videos, but they are lost all the time.
Post photos on Flickr and videos on YouTube.
Add links and metadata along with descriptions stored in Wiki, so you can find them later.

On-line Forums, Blogs & Calendar
Continue to use the current on-line forums.
A free blog with RSS capability for updates.
On-line calendar via Google or Yahoo for schedules.

Implementation

The team recommended using these tools for the new content, and not to concentrate on reentering data.

The archive should be updated and moved to the new tools, when there is time and availability.

Core members will set the tone, and establish the new knowledge sharing culture.

Create good "how to" documentation for training volunteers for effective use of the tools.

Looking back at the event, this was about using free web tools, not open source tools. There are some concerns about the plethora of interfaces and tools that are recommended. The team will present the results of their recommendations at the next ASIS&T session.

ICKM 2008 - Knowledge Management The psychology of hearing and the sociology of listening

Brenda Lloyd-Jones - Professor of HR and organizational leadership

Goal
To identify contemporary stressors and their adverse impact on community members' overall sense of well being.

Use concepts of KM to explain the need for communities to be responsible for their growth and learning.

Suggest sharing professional knowledge on listening in communities - teach people to listen better.

Psychology of hearing - sometimes we hear without listening. People are heard, without being understood.
Sociology of listening - refers to studying the meaning of words and nonverbal behavior of others. Active behavior of understanding what you hear.
Community - set of interactions and human behaviors that have meaning and expectations.

Empowerment Theory: Increasing community empowerment
Sharing professional knowledge on listening in communities.
Teaching listening techniques as a method to create a community of support.

Society teaches and reinforces speaking, not listening.

Specific techniques taught were: Active listening, Empathic listening, Relationship listening.

ICKM 2008 - Towards a consensus on KM capabilities concepts: evidence from a Delphi study

Jean-Pierre Booto - Professor of IT, owner IBI Canada.

He defined three pillars of discussion within knowledge management: KM process, KM infrastructure and KM skills.

The goal of his study was to define these three pillars and to determine which terms we use in KM best fit under each pillar. The entire scope would be considered as KM capabilities.

Study found that there is consensus on KM process and KM infrastructure, but no consensus on KM skills.

He gave a detailed overview of the Delphi process and his criteria for selecting the experts.

More interestingly, he described a Capacity Maturity Model he is developing for Knowledge Management. It is currently only available in French, but he promised to forward it to me for translation.

ICKM 2008 - Experience in knowledge management measurement: Ecopetrol Story

The speaker is Sonia Castro, a 21 year employee of Ecopetrol, with a master's degree in Chemical Engineering.

Company moved from a State run monopoly into an independent company. As part of this change they moved to a single KM platform.
4 questions:
1) What is the technology?
2) What do we want to incorporate in the company - what are our principles?
3) Who does what?
4) How do we communicate?

Created a new single corporate program to achieve the answers to these questions, with a new set of tools.

Started with the premise that knowledge sustainability is not a fact, but a process.

5 steps to the process of sustaining knowledge.
1) Agree that there is a better practice - something needs to be changed.
2) Update and develop the better practice, including documentation
3) Update people's competencies
4) Implementation
5) Sustain - systemic training.

This process helps identify measurable instances. These were used to develop a model, with landmarks and measurement of those landmarks. Strong measurement process, with verification and dashboard for following the process.

They have now implemented story telling as an additional method of gathering data.

Conclusion
How to measure the impact on business of KM is still a concern.
Corporate policy has made it less difficult to implement an efficiency process measurement.
Having a simple and daily model allows people to share a common language which facilitates the communication between the business and the KM group, or the strategy and the tactics.

ICKM 2008 - Knowledge Management at Continental - an integration story of former Siemens VDO

Speaker is Christoph Hechler, a former Siemens VDO knowledge management professional now with Continental in a Project Management and Knowledge Management position.

Need for KM at Continental
Available knowledge for everyone: in time, in place, in quality.
The decentralized organization needs a central knowledge management system.

Started in this role because he saw, in his former position, the problems of not having the right information at the right time. An example he gave is that he was in the middle of a study and discovered that another group had done the same study, on the same microcontroller, a month previously. This could have been solved with a better KM system.

ExAS is the name of the program - Excellence Achieved by specialists.

Three main systems:

Corporate Yellow Pages. This is a people page, that is entered by the user.

Communities of practice - controlled by the organization, limited number of communities based on knowledge review of request, similar to our CHS process. This is deliberate because he has seen uncontrolled communities of practice leading to too many communities on a particular subject matter.

Wiki - very new product for them. They are having a cultural problem: as a German company, they are uncomfortable with the open nature of wikis.
None of the content is anonymous - when you upload information your name is attached, thereby helping ensure a good level of professionalism.

Specifics of Corporate Yellow Pages
ExAS contacts - search engine delivers relevant profiles, contact data, specialists profile, experts profile. About 6,000 of 150,000 have completed this so far. Information presented is similar to our planned P2P pages.

Specifics of a community
A community is a group of people with a common interest in a defined subject, who are working on business relevant topics, exchanging information and knowledge using one common language, providing a set of useful documents, beyond organizational boundaries of time zones, with a motivation to develop best practices and increase core competencies between members.
Every community is closed, you have to apply for membership.
The company uses English as their lingua franca. Even though they are a German company - English is their company language. Functions included within a community include: discussion, announcements, search on community (includes full document search) and a subscription model.
Interestingly, they have a consistent left hand navigation for every community.
~ 100 communities at this time.

Corporate WIKI

Sharepoint platform.
8000 pages so far.
Links to norms, standards, guidelines, and other documents.
Glossaries.
Terminology.

Next steps
Develop a single overarching environment.

Lessons Learned
Lessons learned is a tool that gathers the experience from each project within the company. Instituted a structured process to avoid two issues, 1) unstructured data and 2) not generally applicable lessons or too simple lessons.

The process begins when the lesson is entered by the user; it is then forwarded to an expert in that area, and only then added to the lessons learned.
The process is similar to our Knowledge review process.

I wonder if we talked to our "knowledge" as lessons learned if it would have more resonance and impact on our users?

ICKM 2008 - 2nd day keynotes

First speaker had an interesting way of viewing knowledge management.

He views knowledge management as the curation of resources, the design of spaces for information retrieval, the examination of the value of information in the discovery process.

This different perspective leads to an interesting set of responsibilities:

The protection of citizens' rights to access and share information.
The analysis of how appropriate information access benefits a society.
The support of dynamic ecologies for learning, wherever they are.
The examination of policies and procedures governing access and use.
The examination of the value of information in the discovery process.
The curation of data collections for quality and detail over time.
The design of data human interfaces that enable exploration.
The organization and presentation of data for exploration and use.

The second speaker is Tom Froehlich from Kent State University.
He reviewed the Information Architecture Knowledge Management program at Kent State University. He discussed the program, its challenges and its growth.

He discussed some of the interrelations between tacit and implicit and explicit knowledge. This impacts directly to knowledge "transfer" in a university course. They are trying to change this by aiming the courses to be 1/4 theory and 3/4 practice.

The material he covers is not as applicable to us, although it is an interesting perspective on the education Sarah Bond and Jason Richardson have gained.

Thursday, October 23, 2008

ICKM 2008 - End of Day One

That wraps up the first day of the event. Overall, I found it very stimulating. I have a number more session posts, but I am headed out for dinner and don't want to sit here copying and pasting. I'll do that later.

I found the PM and KM sessions right at the end to be a bit simplistic in their view point. They stressed the lack of collaboration in a typical project. I find that the projects we run in the KWeb are very collaborative, without any of the overwhelming bureaucracy discussed.

I did think that the project recap / post mortem conducted with video and audio taped discussions was a good idea. I wonder if we should add something like that to our process.

The metadata / tagging discussions were a bit of a let down, to be honest. The titles sounded very exciting, but the actual work they were doing was not as interesting. The data on China and tagging will be useful. The taxonomy management reports described were interesting, and could be a good GMMS point release at some time - although they may already be in there.

I have to say that I really enjoyed the keynote speakers. They both gave very different takes on knowledge management.

Looking forward to tomorrow. There look to be some good sessions on communication, collaboration and knowledge sharing, knowledge management strategies and around managing the challenges of complexity in knowledge management.

One of the very nice things we received from the conference is a book, hard bound and clearly of nice quality, with the papers the talks are based on.

It is interesting to compare this to the IA Summit. At the IA Summit everyone has a PC out, there are power strips everywhere and wireless is free. Here there are no provisions for people with laptops. The IA Summit blogs the conference officially, all of the decks are on slideshare right after the speaker is finished and the entire conference is podcast as well.

Here, I need to beg for the slides, the handouts are higher quality and there are few blogs on the conference. Very interesting how this highlights the differences between the two conferences.

ICKM 2008 - Quality issues in Metadata Management

Maintaining Quality Metadata: toward effective metadata management.
University of North Texas library Digital Initiatives program.
Started a number of collaborative projects with other state libraries and museums.
e.g. CyberCemetery
Congressional Research service archive
World War Poster Archive
Electronic Theses and Dissertations
http://www.library.unt.edu for others.

Built a metadata management system to establish a standard for all of these different tools. Started with a qualified Dublin Core based descriptive metadata.

Saw two aspects of digital library data quality:

The quality of the data in the objects themselves
The quality of the metadata associated with the objects.

Focus on metadata quality.

Poor metadata quality:

Ambiguities
Poor recall
Poor precision, Inconsistency of search results

Most common errors

Incorrect data: due to letter transposition, letter omission, letter insertion, letter substitution or mis-strokes.
Missing data: elements and values not present at all.
Local requirements
Objects heterogeneity
Granularity
Functionality

Other issues they faced were around the collaborative requirements.

Very diverse user group
Interoperability
Digital rights issues
Training issues - necessary expertise to create and manage metadata

Metadata quality is most affected by their knowledge of the source, and their knowledge of the methodology.

All of these issues led to metadata quality assurance mechanisms and tools.

Two main tools:
pre-ingest - metadata creation tools, validate the mandatory elements, establish templates, use controlled vocabularies (UNTLBS)

These templates are applied from the outside onto the content. Uses a web template creator, an outside set of rules and the ability to link each field out to guidelines for how to apply them.

Have different templates for each collection, and a set of 5 attributes that are applied to all entries regardless of the collection.
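
To make the creation-time check concrete, here is a minimal sketch of validating a record against mandatory elements and a controlled vocabulary. The element names, the vocabulary and the function are hypothetical placeholders of mine; the talk did not show the actual templates or rules.

```python
# Hypothetical template: which elements are mandatory and which are
# restricted to a controlled vocabulary. Not the real template from the talk.
MANDATORY = ("title", "creator", "date")
CONTROLLED = {"format": {"text", "image", "video"}}


def validate_record(record, mandatory=MANDATORY, controlled=CONTROLLED):
    """Return a list of human-readable problems found in one metadata record."""
    problems = []
    for name in mandatory:
        value = record.get(name)
        if value is None or not str(value).strip():
            problems.append(f"missing mandatory element: {name}")
    for name, allowed in controlled.items():
        value = record.get(name)
        if value is not None and value not in allowed:
            problems.append(
                f"{name!r} value {value!r} is not in the controlled vocabulary"
            )
    return problems


# Made-up record with a blank creator and an uncontrolled format term.
sample = {"title": "Poster", "creator": "", "date": "1943", "format": "poster"}
problems = validate_record(sample)
for problem in problems:
    print(problem)
```

Keeping the rules in plain data outside the function mirrors the idea of templates applied from the outside, so each collection could supply its own set.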

post-ingest - metadata evaluation tools. These tools examine all items in a collection and identify which metadata fields are not being completed. This allows them to target training and so on. These tools give them a look at how the metadata is being applied. Depending on the quality of the metadata they can establish different services on each metadata field. Areas tracked are null values, records added per time period, a clickable map of Texas (clicking on an area brings back all items submitted in that area), and a tag cloud based on terms.
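
A rough sketch of the null value tracking, assuming each item is a plain dictionary of field values. The function name, field names and sample records are made up for illustration and are not the library's actual tooling.

```python
from collections import Counter


def field_completion(records, field_names):
    """Return the fraction of records in which each field is filled in.

    A field counts as empty if it is absent, None, or only whitespace.
    """
    empty = Counter()
    for record in records:
        for name in field_names:
            value = record.get(name)
            if value is None or not str(value).strip():
                empty[name] += 1
    total = len(records)
    return {
        name: (total - empty[name]) / total if total else 0.0
        for name in field_names
    }


# Made-up collection of four items.
sample_records = [
    {"title": "A", "subject": ""},
    {"title": "B", "subject": "maps"},
    {"title": "C"},
    {"title": "", "subject": "war"},
]
rates = field_completion(sample_records, ["title", "subject"])
for name, rate in rates.items():
    print(f"{name}: {rate:.0%} complete")
```

Fields with a low completion rate are the ones where training would be targeted first.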

To implement this, be sure to look at the level of quality required, the nature of your gap and how to close it, compromise when needed.
Total size of collection is ~90,000 items.

ICKM 2008 - Is KM Dead?

The presenter defines KM as the process of creating the same set of tools and practices that allow for good research for the knowledge worker. This is an interesting perspective on KM, as it moves the focus off submission and on to the process of using knowledge.

One way he defends that KM is not dead is citations of the term within articles and publications. For Business, this does not look like a fad. Fads (BPR, TQM) show a hump like curve on this chart. Business still looks like it is climbing, with some peaks and valleys, but not a hump. Education shows it as a fad, so it is not catching on in the education arena. Why? Because they are already doing this today.

ICKM 2008 - Accountability, Professionalism and Performance in KM

The second Keynote speaker at this conference was Patrick Lambe, president of iKMS in Singapore.

He addressed the need of the knowledge management as a discipline to mature as a practice. To demonstrate what he means, he spent a good amount of time defining the terms and how they work in other disciplines, and how they do not work within knowledge management.

Starting with performance, he defined it in two ways, first around operational performance, second around strategic performance. Operational performance deals with consistency, coordination, compliance and cost management. Strategic performance is around intelligence, innovation and capacity building. In each case, he challenged knowledge managers to tie their work back to their company's operational and strategic performance. In real ways, not soft targets.

He explored professionalism through three questions: Is there a coherent practice community? Is there a professional ethic? Is there a reference standard? Comparing knowledge management to medicine, he placed our practice in the middle ages, when doctors never looked at a human body. They had no coherent practice, no reference standard. He then talked to a survey iKMS completed of knowledge managers, showing that on average people spend less than 2 years in the role, are offered no education and seldom come into the role with education in knowledge management. These underscore the lack of true professionalism in knowledge management as a discipline.

When talking to accountability, he stated that it is inconceivable that a knowledge manager would be held accountable for their failures. The excuses that KM is too complex to measure, or that we are infrastructure, not front line, are just that: excuses.

We need to be accountable for our work.
We need to build a history of experience, with repeatable actions and situations, and incorporate feedback loops for learning.

We suffer from something he called the "teleportation syndrome". People come in, then leave. They become knowledge managers without experience, without training and are asked to go ahead and get on with it. By the time they learn the job and start to see results from their first changes, they move on to a new environment or a new role.

Building performance requires:

scrutiny of failures
sensitivity to the detail
observability of practice
observability of outcomes
a focus on the outcome
and rewarding based on performance.

Not performance on measures like throughput, number of items submitted, and other KM process based measures. Instead, performance based on KM's ability to impact the business.

Building professionalism requires:

continuity
progression
succession
authority
integrity
sanctions
visibility
developing ongoing relationships with other KM professionals

To mature knowledge management we need to:

Foster career stability and progression
Accelerate the study of the practice via the community
Provide incentives for peer learning and peer review
Provide disincentives for theory driven research
Recognize the neighboring practices and disciplines
Agree and set standards for knowledge management

ICKM 2008 - EMR systems and KM

EMR - electronic medical records.
Problem - low adoption rate of the tools.
Why? The assumption is that doctors will not use the tools.
The study shows that the tools are designed without input from the users of the systems. This leads to a system that does not match the doctors' needs and normal work habits. Additionally, there is a lack of standardization and interoperability.
The recommendation is that future EMR tools be developed using collaborative design.
Collaborative design includes:
Collaboration
Knowledge Sharing and interaction
Sense-making by the community.
This leads to tools that are accepted by the community.
The process is based on open, in-context design, leveraging a small core group and using synchronous communication and asynchronous interaction.
This process is predicted to enable a lower cost, standard product that is trusted by the users because they were intimately involved with the design.
User engagement includes a sense of mutuality, pro-active knowledge for physicians, reducing medical errors.

ICKM 2008 - Knowledge Management in Health Care

This is a case study on using KM in Health Care, specifically diabetes care.
It started when a physician was looking for solutions for improving health care. An interdisciplinary team was formed with Rutgers University to develop a theory based practice. They developed a social and technical model of KM, looking to impact both the process management and the social relationships.
Phase 1 : modeling and taxonomy
Phase 2 : Case Studies and Hypothesis Formulation
Phase 3 : Pilot & Full study
Theoretical base - Theory of reasoned action - what factors predict a physician's intention to perform a behavior. Theory of planned behavior - how do physicians perceive their ability to perform a certain behavior.
People's expectations, and the expectations of their social network, influence their behavior. People's environment influences their behavior.
Research sample: 4 health care practices located in NJ. They were interviewed, observed and surveyed. These observations were used to develop hypotheses.
The 4 were broken out into high performing and low performing groups. One key differentiation was the introduction of the researcher into the organization. The high performing practices introduced the researchers through a series of scheduled meetings and in an organized way. The low performing groups did not have an organized, in person introduction. Instead, the researcher was introduced in an impersonal note included in the pay packet.
Other differentiating factors were: use of manuals and procedures, meetings, face to face communication. High performing practices use manuals and procedures, have less formal meetings and more ad hoc meetings, and use face to face communication more frequently.
The study found that knowledge sharing is being done in practices, but it is not being done well even in the high performing offices. So even within the high performing practices, there was no consistency of practice and procedures. The level of care varied greatly from one doctor to the next.
Interestingly, coders were not told which practices were high or low performing, but were able to predict them based on KM practices.
The work they are doing indicates that doctors are trained to work as individuals, not as a group. The plan is to intervene through practice changes. Actual changes will be measured via medical records, customer surveys and KM throughput measures.

http://www.scils.rutgers.edu/~clairemc
http://knowledgeinstitute.rutgers.edu

ICKM 2008

I am attending the International Conference on Knowledge Management in Columbus, Ohio. This conference covers many different areas in KM, extending from the theoretical to the practical. I'll be posting on the sessions as I attend them. The keynote speakers really showed this dichotomy. The first speaker was a CEO of a consulting company, talking about a new KM system his firm has created. The second speaker was a KM theoretician, who spoke about the need for accountability, performance, and professionalism within KM.

Tuesday, July 22, 2008

not as radical as I thought. . .

http://en.wikipedia.org/wiki/Generation_V

It looks like my boss might have been surfing the web . . .

Generation V

My boss has a new buzzword he is using: "Generation V". He sees that there is a new class of people, a generation not bounded by age or birth - but instead bounded by the experience of using computers. This is an interesting idea. There are people who are active users of the Internet, blogging, social networking tools, and so on. There are other people who are not, and likely never will be. This set of differences is larger than the similarities that might exist across an age group.

Any other opinions on this concept?

Monday, April 28, 2008

Improving internal search

I think that it is critical that our people have tools to find information inside and outside our organization. The more high quality information we can place in the hands of the practitioner, the better they can do their job. I believe that intranet search is an area where we can improve on this capability. The first step towards improving is to measure where we are today. If we do not know where we are, we cannot know if we have improved. We have taken that baseline. We have started to "tweak" our environment to improve on that baseline. As we prove out improvements in the lab, we are moving those out to production as Beta implementations. As improvements are proved out in Beta they will be moved into production search.

How did we take that baseline? The standard for measuring search is based on a subjective, human understanding of relevance. The standard was set by the NIST. The NIST holds an annual event where they test and tweak search. They use a standard set of content, fairly small. They have experts evaluate the content and determine the ideal documents within that standard set. They then use complex queries supplied by those experts to bring back documents from the search engine. They measure how many documents within a number of results are from the ideal set - precision. They measure how many ideal documents can be found by the search engine - recall. This seemed like a reasonable process, so that is how we measured our impact on the search engine.

We established a test environment – a new instance of the search engine, indexing the production content. We asked for volunteers to act as our experts. We asked them to establish an area of expertise for themselves. We then had them identify the "top" 25 documents within that area of expertise, given a query that they suggested. This became our ideal set. We used that ideal set to measure precision and recall at 3, 10 and 25 results.
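
To make the two measures concrete, here is a minimal sketch in Python of precision and recall at a cutoff. The function name, the document IDs and the sample ranking are made up for illustration. This is not the code we ran, just one way the calculation could look.

```python
def precision_recall_at_k(ranked_results, ideal_set, k):
    """Return (precision, recall) for the first k results of a ranked list.

    ranked_results: document IDs in the order the engine returned them.
    ideal_set: the documents an expert picked as the right answers.
    """
    if k <= 0 or not ideal_set:
        raise ValueError("k must be positive and ideal_set must not be empty")
    top_k = ranked_results[:k]
    hits = sum(1 for doc in top_k if doc in ideal_set)
    precision = hits / k              # share of the first k results that are ideal
    recall = hits / len(ideal_set)    # share of the ideal set found in the first k
    return precision, recall


# Hypothetical data: an ideal set of 25 documents and one ranked result list.
ideal = {"doc%02d" % i for i in range(1, 26)}
ranked = ["doc01", "other1", "doc02", "doc03", "other2",
          "doc04", "other3", "other4", "doc05", "doc06"]

for k in (3, 10):
    p, r = precision_recall_at_k(ranked, ideal, k)
    print("k=%d precision=%.2f recall=%.2f" % (k, p, r))
```

Note that with an ideal set of 25 documents, recall at 3 can never be higher than 3/25, so recall numbers are only comparable at the same cutoff.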

Usually search engine vendors apply more than one approach to search and combine them together to create a relevance ranking. Generally, the various approaches to calculating relevancy - Vector, Probability, Language Model, Inference Networks, Boolean Indexing, Latent Semantic Indexing, Neural Networks, Genetic Algorithms and Fuzzy Set Retrieval - are all capable of retrieving a good set of documents with a query. On a similar set of content, with a similar query, they will each retrieve the same documents in about the same order. The difference comes in with the work done to "tweak" the engine to particular content collections.

One thing that can improve search relevance is a set of known good documents for a particular query, presented either as "Best Bets" or used behind the scenes as a "golden sample" to refine relevancy. This presupposes that your queries follow a Zipf distribution and you can target most of your users by tuning a relatively small set of queries. Unfortunately our current search logs do not follow a Zipf distribution. Our top 100 queries account for less than 10% of our total queries. To reach an 80% penetration rate we would need to generate 20,000 separate "golden samples". This is a tuning method that is dependent on the Zipf distribution to make it effective. When your logs are flat, like ours are, it does not scale.
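
If you want to check your own logs the same way, the following is a rough Python sketch. It assumes the log has already been reduced to a plain list of query strings. The function names and the tiny sample logs are hypothetical, not taken from our system.

```python
from collections import Counter


def top_n_share(queries, n):
    """Fraction of all searches accounted for by the n most frequent queries."""
    counts = Counter(queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in counts.most_common(n)) / total


def queries_needed(queries, target):
    """Number of distinct queries, most frequent first, needed to cover
    the target fraction (for example 0.8) of all searches."""
    counts = Counter(queries)
    total = sum(counts.values())
    covered = 0
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return rank
    return len(counts)


# Hypothetical logs: one dominated by a few queries, one completely flat.
skewed_log = ["audit"] * 5 + ["tax"] * 3 + ["risk", "hr"]
flat_log = ["query%d" % i for i in range(10)]

print(top_n_share(skewed_log, 1), queries_needed(skewed_log, 0.8))
print(top_n_share(flat_log, 1), queries_needed(flat_log, 0.8))
```

The flatter the log, the closer the second number gets to the total count of distinct queries, which is the scaling problem described above.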

We are also investigating other search technologies and evaluating them on our current content to see how well they retrieve documents with the same queries. What we are seeing is that these other search engines are not significantly better against our content than our existing implementation, according to our measures of precision and recall. What other search engines offer is significantly improved tools for tweaking the search engine based on our content collection.

I hope all of the above has been of interest to you. I have found it interesting to compare vendor claims to reality as we have been working through these projects. The difference is quite large. Nothing is ever simple, or easy. I'd rather have a "silver bullet", but recent evidence shows that one does not exist. The only silver bullet is to invest hard work, pay attention to the details and persevere. I think improving search is worth hard work, attention to detail and perseverance.

Friday, March 14, 2008

Response time effect on our users with Search

Looking across the industry, we see some consistent information about system response time.

Generally speaking, when a page takes more than 8 seconds to launch there is a significant impact on the user's perception of quality.

Nielsen says in this article:

The basic advice regarding response times has been about the same for thirty years [Miller 1968; Card et al. 1991]:
0.1 second is about the limit for having the user feel that the system is reacting instantaneously, meaning that no special feedback is necessary except to display the result.
1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. Normally, no special feedback is necessary during delays of more than 0.1 but less than 1.0 second, but the user does lose the feeling of operating directly on the data.
10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.
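
One way to read those three limits is as cutoffs for what feedback a search page should show. The Python sketch below is only an illustration. The function and label names are hypothetical, and the choice of a simple busy indicator between 1 and 10 seconds is an assumption, since the quote does not spell out what to show in that range.

```python
def feedback_for_delay(seconds):
    """Map an expected response time to a feedback level (labels are hypothetical)."""
    if seconds < 0:
        raise ValueError("delay cannot be negative")
    if seconds <= 0.1:
        return "none"                    # feels instantaneous, just show the result
    if seconds <= 1.0:
        return "none_noticed"            # delay is noticed, flow of thought survives
    if seconds <= 10.0:
        return "busy_indicator"          # assumption: show that work is in progress
    return "progress_with_estimate"      # tell the user when to expect completion


for delay in (0.05, 0.5, 8, 12):
    print(delay, feedback_for_delay(delay))
```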

So - what is your ideal search response time?

Friday, February 8, 2008

Usability Techniques

We have a number of core techniques we employ in our usability program. This post is a brief overview of the techniques. In future posts, I'll dwell on each technique in more detail and discuss how we employ each on a project. The three core techniques are Card Sorting, Expert Review and Lab Testing.

Card Sorting – uncovering the mapping between the computer display of information and the user’s conceptual model of the information; each concept is written on a card and the users sort the cards into piles.

Expert Review – a formal review conducted by usability specialists according to common, pre-established usability principles.

Lab Testing – while being observed by usability specialists, users attempt to complete scripted tasks, which take advantage of the functionality of the system.

What do users want from a search engine?

This post is going to be just a quick bulleted list of what I know my users want from an enterprise search engine, based on my study of our search logs, usability studies, profiles, recent research in the field, information retrieval studies and some conference presentations. If you have additional suggestions, please include them as comments.

A single search box, persistently placed on all pages. Wide enough to avoid typos.
Google - everyone knows Google and wants internal search to be Google. This doesn't mean the Google appliance. This means quick, accurate and comprehensive search. Quick means sub 5 second response time. Accurate means the right document in the first 3 documents. Comprehensive means every possible document - regardless of which firm silo created it, regardless of which technology holds it, regardless of whether "it" is a Word document, a zip file, an email, a document on their hard drive or a Documentum folder.
Some kind of advanced search, even though they will not use it.
An ability to narrow the search to specific areas of content that is contextual to them. Sometimes this is content types, like Policies, People, Sites. Other times this is my country, my service line, my language, my industry - taxonomy, but without having to call it taxonomy.
For the tool to allow them to type in "How do I do an internal audit?" and bring back documents on audit methodology. This doesn't have to mean natural language queries. If you ignore the "How do I do an" and the "?" that query is "internal audit". The system has to ignore ? and "How do I do" to run correctly. But it does mean that there needs to be a relationship between internal audit and audit methodology.
People expect the system to find things using an AND - not a PHRASE and not an OR. People looking for Risk Management in Technology Companies in the UK type in Risk Management Technology UK. The tool needs to understand that behavior and correct for it.
People expect that the system will correct their spelling.
People expect that documents with some of the words they searched for in the title will be more relevant. That documents with some of the words in the summary will be more relevant. That the system will know that TAS and Transaction Advisory Services are the same exact thing, even though they are not to the computer. People expect that more recent documents will be relevant, except when they are looking for older documents - and they want the tool to know the difference.
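
Two of the expectations in this list, ignoring the question wording and treating an acronym and its long form as the same thing, can be sketched in a few lines of Python. Everything here is hypothetical: the lead-in phrases, the one-entry acronym table and the function names are assumptions for illustration, not how any particular search engine does it.

```python
import re

# Hypothetical lead-in phrases, longest first so the longest match wins.
QUESTION_LEAD_INS = [
    "how do i do an",
    "how do i do a",
    "how do i do",
    "how do i",
]

# Hypothetical acronym table; a real one would come from the firm's taxonomy.
ACRONYMS = {"tas": "transaction advisory services"}


def strip_question(raw_query):
    """Lowercase the query, drop trailing punctuation and a leading question phrase."""
    text = " ".join(raw_query.lower().split())
    text = re.sub(r"[?!.]+$", "", text).strip()
    for lead_in in QUESTION_LEAD_INS:
        if text.startswith(lead_in + " "):
            return text[len(lead_in):].strip()
    return text


def expand_acronyms(query):
    """Return the query plus a variant with known acronyms spelled out."""
    query = " ".join(query.lower().split())
    expanded = " ".join(ACRONYMS.get(word, word) for word in query.split())
    return [query] if expanded == query else [query, expanded]


print(strip_question("How do I do an internal audit?"))
print(expand_acronyms("TAS pricing"))
```

This only gets the query down to "internal audit". The relationship between internal audit and audit methodology still has to come from somewhere else, such as a taxonomy or a set of related terms.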

Those are some of the things that I know people want from search. What do you want from search?

How do you evaluate an enterprise search engine?

I am not going to claim this is the only way to evaluate an enterprise search engine. But, early on in my search program we started thinking about how we could measure our progress with search. How could we tell if a search engine tweak caused the results to improve, or get worse, or stay the same? Seems like a simple question, no? How do you evaluate a search engine?

Companies have been selling search engines for decades. IBM started with a product called STAIRS in 1960. Given that long history, you would think there would be a simple answer - do x, look at y and if it is larger than z, you have a good search engine. Evaluating a search engine is actually a complex question. Search is a very context sensitive behavior. In a knowledge environment, the documents that are of interest to you are not the same as the documents that are of interest to me. Search really can only be evaluated within a specific context for a specific user.

There is, however, a standard for search evaluation. The standard is based on a subjective, human understanding of relevance. The standard was set by the NIST. The NIST holds an annual event where they test and tweak search. They use a standard set of content, fairly small. They have experts evaluate the content and determine the ideal documents within that standard set. They then use queries supplied by those experts to bring back documents from the search engine. They measure how many documents within a number of results are from the ideal set - precision. They measure how many ideal documents can be found by the search engine - recall. This seemed like a reasonable process, so that is how we measured our impact on the search engine.

We established a test environment – a new instance of the search engine, indexing the production content. We asked for volunteers to act as our experts. We asked them to establish an area of expertise for themselves. We then had them identify the “top” 25 documents within that area of expertise, given a query that they suggested. This became our ideal set. We used that ideal set to measure precision and recall at 3, 10 and 25 results.
We also established another way of measuring the impact of our changes. We asked for real users to tell us how we are doing – have we improved, declined or stayed about the same - by creating a Beta site.

From this testing we determined that we could improve relevancy, and that these improvements would be noticeable to the end user.

What is usability?

Usability is a process for measuring and improving the user's experience with a web site or application. Within the KWeb team, Usability begins with a focus on users’ needs, tasks, and goals. Usability requires that you change your mind set, and spend more time on initial research and requirements. Instead of starting with the design of the system, you start by identifying your target audience and observing them as they use an application to accomplish their tasks. You use this research to identify what the users’ needs are, what they want to accomplish, how their environment affects their behavior and what their priorities are.
This process creates an emphasis on iterative design. You need to develop applications in a prototype form and test the application with the end users – have them try to accomplish tasks using your prototype. Once you see the pain points, change the prototype to address those pain points and test again. Through a series of iterations you develop a product that solves real end users’ problems in a simple and intuitive way.
One way to describe a usable system is that it is easy to learn, easy to use, easy to explain and hard to forget. Think of a tool, software or hardware, that you enjoy using – which works well. Usability is a way of developing new tools that hit that sweet spot. It all comes back to measures – in order to make effective change you have to know where you are, know the effect of your changes and have a target goal in mind. We target improvements in task completion, task time, user satisfaction as measured by the System Usability Scale and a reduction in wayfinding errors.
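
For readers who have not used the System Usability Scale, here is a small Python sketch of how a single questionnaire is usually scored. It follows the commonly published SUS rule (ten items answered 1 to 5, odd items positively worded, even items negatively worded). The function name and the sample answers are made up for illustration, and this is not a description of our own tooling.

```python
def sus_score(responses):
    """Score one SUS questionnaire: ten answers, each from 1 to 5.

    Odd-numbered items contribute (answer - 1), even-numbered items
    contribute (5 - answer); the sum is scaled to a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    total = 0
    for item, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError("each answer must be between 1 and 5")
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Hypothetical answers from three participants.
print(sus_score([3] * 10))                         # neutral on every item
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))   # best possible answers
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

The result is a score, not a percentage, so it is most useful for comparing one round of testing against the next on the same system.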

The great search experiment

Back in 2007 I tried an experiment with some coworkers to see how well other search engines worked. We wanted to know, from experience, how well faceted search, guided navigation, search presentation layers and so on worked in day to day life. If you are interested, you can see the entire experience here: http://30daysgooglefree.blogspot.com/. In the end, we determined that if the search engine gives you a good document on your first page of results, you won't need these other search tools. If it does not, then you will not see a great improvement.

I was a bit disappointed, really.