Wednesday, September 24, 2003

Vivisimo, Google, and Yahoo… Oh my!

A stack of papers is clustered around me, scratching my monitor’s chin. To gain some insight into what they actually say, I’ve decided to break them on the rack of personal experience.

The other day I was talking to my brother on the phone. As part of the conversation he asked me: “What search engine do you use?” I use several different engines depending on the issue I’m researching. I realized, however, that the unasked question was: “What search engine should one use?” Perhaps my recent readings can cast some light on this question.

The question of search engine use is an interesting one. In few other information seeking activities can we find such a clear example of “information-as-thing” (Buckland, 1991) that adheres to the Shannon-Weaver model of information transfer. Cognitive models of information use should certainly apply. Frohmann’s (1992) analysis of our field’s conception of information as a commodity, however, lurks in the back of my mind. In the example of my brother searching the web, where are the institutions and power structures?

Although my brother isn’t incarcerated, I’m reminded of Chatman’s work on the information practices of prisoners (Chatman, 1999). In her study, she discerned four factors that govern information practices: small world, social norms, worldview, and social types. Do these same characteristics exist in larger communities “in the open,” such as Wenger’s notion of Communities of Practice (Wenger, 1998)? In some ways, my brother is beholden to the particular views, language, and stereotypes of his profession and environment. As Dervin indicates, our information practices come out of our social practices (Dervin, 1992; Dervin & Nilan, 1986), yet these issues are patently ignored by most search engines.

Google presents a world where anyone can search for anything. Most people, however, won’t search for just anything but rather for what they already know, and will explicitly ignore those things that disrupt their “Way of Life” (Savolainen, 1995). Perhaps tools like Teoma or Vivisimo, which utilize clustering technology, help people to better understand the information space that is presented by search engines. The cluster names themselves, however, are mere products of our semantically unstable language, so the user is likely to encounter a “spin-out stop” (Dervin, 1992) when the choices presented don’t match their personal mental model of the information space.

Although the web may be a difficult tool to use and few information professionals actually use it (Bates, 1999), it is still very popular. Perhaps one reason for this popularity is the density of forage and prey (see Sandstrom, 1994, for discussion). Optimal foraging models, however, reveal some real limitations to search engines as information seeking tools. One aspect of foraging models is the marginal trade-off between search time and handling time. Although the web provides the user with very quick search times that reveal abundant information, the user is forced to spend an inordinate amount of time processing this information to find valuable or useful documents. Another aspect of the models involves between-patch travel time and within-patch foraging time. Although the web appears to be one very big patch, it is actually a collection of disparate patches broken out across various domains, sites, and document types. Furthermore, these patches have not been domesticated by controlled vocabulary, which would make searching within particular patches easier (see discussion of controlled vocabulary in Star, 1998).
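The patch trade-off above can be made concrete with a small sketch in the spirit of the marginal value theorem from foraging theory: a forager should stay in a patch for the length of time that maximizes overall gain per unit of time, travel included. Everything specific here is my own illustrative assumption rather than anything taken from Sandstrom: the diminishing-returns gain curve, its parameters (max_gain, tau), and the function names.

```python
import math


def gain(t, max_gain=10.0, tau=5.0):
    """Useful documents found after t minutes in one patch.

    Assumed shape: diminishing returns, levelling off at max_gain.
    """
    return max_gain * (1.0 - math.exp(-t / tau))


def overall_rate(t, travel_time):
    """Documents per minute, counting the time spent getting to the patch."""
    return gain(t) / (travel_time + t)


def best_stay(travel_time, step=0.01, horizon=120.0):
    """Grid-search the within-patch time that maximizes the overall rate."""
    best_t = step
    best_rate = overall_rate(step, travel_time)
    for i in range(1, int(horizon / step) + 1):
        t = i * step
        r = overall_rate(t, travel_time)
        if r > best_rate:
            best_t, best_rate = t, r
    return best_t


if __name__ == "__main__":
    # Compare a cheap hop between patches with an expensive one.
    for travel in (1.0, 10.0):
        print(f"travel {travel:>4} min -> stay about {best_stay(travel):.2f} min")
```

Under these assumptions, the more expensive it is to move between patches, the longer it pays to keep foraging in the current one. On the web, hopping to a new site or a new engine costs almost nothing, which may be part of why so little time gets spent digging within any single patch.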

Although I’ve gained some understanding of the difficulties presented by the web as an information source, I’m still not any closer to answering my brother’s question. Falling back on the cognitive paradigm, I can analyze particular aspects of search engines in relation to the existing models of information seeking (Wilson, 1999). Using Ellis’s behavioural framework, which Wilson (1999) sets beside Kuhlthau’s stage process model, we see that all search engines are very good at starting searches and extracting information but are quite ineffective at chaining searches together or allowing users to truly differentiate between documents. Wilson’s 1996 model is based largely on psychological or internal user factors to which search engines are completely indifferent. Other models, e.g., Saracevic’s and Spink’s, apparently mimic search engine functionality closely. It should be noted, however, that these are inherently IR models and subject to all of the shortcomings of the cognitive approach (Frohmann, 1992). In none of these models can I find the guidance I need to decide between Google, AllTheWeb, Teoma, Altavista, or Vivisimo.

In my analysis of search engines, I have completely ignored one aspect: passive information seeking and the process of monitoring information (Chatman, 1999; Savolainen, 1995; Williamson, 1998). In this regard, search engines truly suck. I have often decided to surf the web as a means of procrastination only to find that I couldn’t find anything I wanted to read. As a passive medium, Google offered only some bright bubble letters to look at rather than a source of potentially useful information. Indeed, I typically rely on public radio rather than the web for news and general information. Certain exceptions do exist. I have relied on Yahoo! for a number of years to provide my personal email. Although their mail client is terribly designed, I find regular visits to the customized myYahoo! portal home page a wonderful source of news that I might otherwise miss. In this regard it seems unfortunate that web portals are out of vogue.

After all of this, I still don’t have any solid recommendations for my brother. Perhaps in the Darwinian web environment we will see new technologies to support information seeking emerge from the primordial muck. Hopefully one of them will finally validate one of our information seeking models.


