Wednesday, November 12, 2003

Viewing the Boundaries of Oral Culture

Television is a major information source. Until my wife banned it from our household, my television was on all the time. When considering the mass media as an information source, it becomes important to consider a number of different issues: who uses mass media? How is it used? How do we make sense of it?

When I think of mass media I imagine a continuous line stretching through the ages. At the start of this line our Palaeolithic ancestors used the same form of mass media as more contemporary bards and seanachies: oral culture. Over the last few centuries we’ve seen an increase in the number of mass media types as newspapers, magazines, radio, and television have inundated our homes. The line ends today with the form of mass media that I most depend on—the Internet.

To understand the role of mass media it is perhaps useful to start at the end of this line and explore how people use the Internet. Hektor (2002) provides a taxonomy of factors affecting how people use the Internet as an information source: the social and physical environment, the setting of the information, the information activities of the individual, and the outcomes of those activities. While Hektor’s model is compelling and his arguments are rhetorically attractive, it remains unclear how he derived his concepts. Unfortunately, his paper may merely be so much yak-yak rather than valid observation. Regardless, his concepts may be applicable to other mass media types along our mass media line.

Television provides an interesting example of an information source largely because many of Hektor’s factors are controlled when viewing television. The information is generally produced for consumption in the living room rather than in other settings, and the information activities of the consumer are fairly static (i.e., viewing). It should be noted, however, that we often introduce Hektor’s factors into our television viewing. Watching a pay-per-view professional wrestling event, for example, is considerably more entertaining when we construct a social environment by inviting friends over for beers. Similarly, television is sometimes used outside of the living room. Equity trading floors often have several channels tuned to a variety of different news feeds (Knorr Cetina & Bruegger, 2001).

Television—and news programs in particular—does more than merely provide information that may or may not inform us. I recently went to a family luncheon with my parents, my brother and sister-in-law, and my sister-in-law’s parents. The only thing that this collection of people seemed to have in common was Coronation Street. During the luncheon, the characters and themes of this soap opera became a form of oral culture (Fiske, 1987, 1989) that enabled the participants to overcome the communication boundaries that they were experiencing. After overcoming these boundaries, the participants were able to use each other as informal (Weedman, 1992) information sources. Two forms of mass media—television and oral culture—were required to negotiate this transition.

The nature of boundaries between communities is an interesting topic. Although this topic has been briefly addressed in the LIS literature (e.g., Weedman, 1992), the sociology literature has addressed it at length. Wenger, for example, introduces the concept of communities of practice (Wenger, 1998), Weick explores the interactions of loosely and strongly coupled systems (Weick, 1995), and Bowker and Star explore how individuals and organizations negotiate across these boundaries (Bowker & Star, 1999). An issue that hasn’t been addressed, however, is how these boundaries occur. Are they developed through socialization? Is linguistics the only important factor? What would Marx say?

Perhaps our automatic assumption is that boundaries occur due to power discourses and patriarchal hegemonic socio-economic forces. Chatman (1985), however, provides some relief from this world-view. In her study of the working poor, she expected that most of the women she met would use television as their primary information source. Instead, she discovered that most used print sources. Although Chatman’s arguments cast some doubt on strict hegemonic interpretations of mass media consumption, they also validate some of Hektor’s factors. Chatman admits that her sample may be atypical due to its relatively high education level, although the information activities and outcomes of her sample would likely be the same as for a sample with less education. The social and physical environment of the information, however, is considerably different since Chatman’s sample consisted of individuals who had experienced post-secondary socialization and had access to a university.

I can think of a similar example from my own experience. While doing development work in a small and very poor Nicaraguan town, I observed that television and poverty seemed to go hand-in-hand. In retrospect, I realize that there were no books in the town. The English books that I had were met with considerable reverence even though the Nicaraguans couldn’t read them. Perhaps the Nicas’ consumption of television had more to do with their physical setting—a town with no access to books—and their social setting—extreme poverty—than with their desires or preferences.

In closing I want to consider a comment made by Fiske: “The art of popular culture is the ‘art of making do’. The people’s subordination means that they cannot produce the resources of popular culture, but they do make their culture from those resources.” I wonder if people are starting to make their own popular culture. What is the role of new technologies like wikis or blogs? I especially wonder about the popularity of text messaging among youth. Surely text messaging isn’t mass media, but an entire culture is growing around messaging that includes its own semantics, syntax, and socialized codes of conduct. Are the ‘subordinated people’ finally creating their own culture? Is it available and applicable across boundaries?

I just don’t know.


Bowker, G. C., & Star, S. L. (1999). Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press.
Chatman, E. A. (1985). Information, Mass Media Use and the Working Poor. Library & Information Science Research, 7, 97-113.
Fiske, J. (1987). Television Culture. London: Methuen.
Fiske, J. (1989). Reading the Popular. Boston: Unwin Hyman.
Hektor, A. (2002). Information Activities on the Internet in Everyday Life. Paper presented at Information Seeking in Context, Universidade Lusiada, Lisbon, Portugal.
Knorr Cetina, K., & Bruegger, U. (2001). Transparency regimes and management by content in global organizations: The case of institutional currency trading. Journal of Knowledge Management, 5(2), 180-194.
Weedman, J. (1992). Informal and Formal Channels in Boundary-Spanning Communication. Journal of the American Society for Information Science, 43(3), 257-267.
Weick, K. E. (1995). Sensemaking in Organizations. Thousand Oaks, CA: Sage.
Wenger, E. (1998). Communities of Practice: Learning, Meaning, and Identity. Cambridge: Cambridge University Press.

Monday, November 10, 2003

Questions of Research
For several weeks I’ve been reaping texts on research design methodologies and separating the chaff. I now have mountains of intellectual grist in my brain and I feel like I had better actually do something productive before it begins to rot. I’ve decided to write down some research ideas.

The Impact of Zipf

I recently listened to a lecture about bibliometrics and learned about the wonders of Zipf’s law and its scholastic brethren: Lotka’s law (authors) and Bradford’s law (journals). I wonder, with the rise of specialty journals and increased publishing pressure, if Zipf’s law has begun to come off the rails. Do Lotka’s and Bradford’s laws still apply? How do they relate to ISI’s impact factors? If Zipf’s law does still hold, could it be used to discern impending fragmentation of a particular discipline?
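If I ever got around to testing this, the core measurement is simple enough to sketch: fit a line to the log-log rank-frequency plot and see how far the slope drifts from -1. A few lines of Python, with made-up frequencies standing in for real counts:

```python
import math

def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) vs. log(rank).

    For an ideal Zipf distribution the slope is close to -1;
    a large drift away from -1 would hint that the law is
    'coming off the rails' for the corpus in question.
    """
    freqs = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# An ideal Zipfian sample: frequency proportional to 1/rank.
ideal = [1000 / r for r in range(1, 101)]
print(round(zipf_slope(ideal), 2))  # -1.0
```

The same slope could be computed per-discipline over time; a discipline whose slope wanders might be the fragmentation signal I’m after.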

Socio-Economic Book Clusters

I’ve been using a library for as long as I can remember. I depend on my public library for three particular genres: science fiction, self-help financial, and pop-guru management (I wouldn’t actually buy these things!). Is there any way that someone could predict my book-consuming habits? During my training as a librarian I became aware of the “fiction problem” and learned that book consumption habits can’t be predicted… or can they? It would be interesting to cluster books by similarity based on patron borrowing patterns. In addition, if the patrons used branch libraries, the clusters could be referenced against the socio-economic conditions of the surrounding neighbourhood. Social profiling may be odious but it would make readers’ advisory much simpler!
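The clustering could start from something as crude as the overlap between the sets of patrons who borrowed each book. A toy sketch (the circulation records below are invented):

```python
def jaccard(a, b):
    """Overlap between two sets of borrowers: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical circulation records: book title -> patrons who borrowed it.
loans = {
    "Dune": {"p1", "p2", "p3"},
    "Foundation": {"p1", "p2", "p4"},
    "Rich Dad": {"p5", "p6"},
}
print(jaccard(loans["Dune"], loans["Foundation"]))  # 0.5
print(jaccard(loans["Dune"], loans["Rich Dad"]))    # 0.0
```

Books with high pairwise overlap would fall into the same cluster, and each branch’s clusters could then be set against neighbourhood census data.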

Text Obsolescence

Some texts never seem to expire. Others—the Koran or the Book of Mormon, for example—seem to be getting ever more popular. How long do texts last? When is a text obsolete? Citation analysis could yield some very interesting patterns for particular disciplines or fields of study. Implementing a form of content analysis could further enrich the data source. What type of text lasts longer based on citation half-lives: monograph or journal article? What type of monograph lasts longer: a description for laymen, an erudite and witty explanation, or a detailed tome?
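As a first pass, a text’s half-life could simply be the median age of the citations it receives (the years below are made up):

```python
def citing_half_life(citation_years, publication_year):
    """Median age (in years) of the citations a text has received:
    the point by which half of its observed citations had occurred.
    For an even count this picks the upper median."""
    ages = sorted(year - publication_year for year in citation_years)
    return ages[len(ages) // 2]

# A hypothetical monograph published in 1989, cited in four later papers.
print(citing_half_life([1990, 1992, 1995, 2001], 1989))  # 6
```

Comparing this number across document types (monograph vs. article) or monograph styles would be the interesting part.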

Information Theory and Technical Indicators

I am a veteran of the dot-bomb era. During those years I tracked my stocks religiously and believed that my stock options would make me rich. I was even involved in day trading and the use of “technical indicators” and other kabbalistic devices to guide my stock-picking decisions. Did these techniques really work? I’m not sure, but I’d like to know. One possible analysis involves the use of Information Theory to determine signal/noise ratios. For my particular experiment, I could use a short-term indicator like the MACD and compile noise calculations for a 10-year period that covers the dot-bomb era and the recent wars. The calculations would be based on the probability that the indicator correctly signalled a purchase over a one-month period as compared to a moving baseline covering a six-month period. I hypothesize that speculative booms increase the channel noise. Industries will behave differently and certain indicators will be better than others. If I’m investing in high tech, how can I know how noisy the price signal is? Which indicators should I use: Bollinger bands, stochastics, MACD, etc.?

Although I think this issue is pretty cool, it also has applications in the LIS world. A similar process could be used to determine the efficiency of n-grams when querying genetic databases. From a media studies perspective, a similar process could be used to discuss Hayek’s assertions regarding price structures and the communication (or lack of communication) of complex information.
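A very rough sketch of the measurement, using a plain moving-average crossover as a stand-in for the MACD (the real indicator, periods, and six-month baseline would need more care than this):

```python
import math

def sma(prices, n):
    """Simple n-day moving average; entry j covers prices[j : j + n]."""
    return [sum(prices[i - n + 1 : i + 1]) / n for i in range(n - 1, len(prices))]

def hit_rate(prices, fast=5, slow=10, horizon=21):
    """Fraction of fast/slow crossover 'buy' signals followed by a gain
    over the next `horizon` trading days."""
    hits = trials = 0
    f, s = sma(prices, fast), sma(prices, slow)
    offset = slow - fast  # aligns the two moving-average series
    for i in range(1, len(s) - horizon):
        crossed_up = f[i + offset] > s[i] and f[i + offset - 1] <= s[i - 1]
        if crossed_up:
            trials += 1
            day = i + slow - 1  # index of the signal in the raw price series
            hits += prices[day + horizon] > prices[day]
    return hits / trials if trials else 0.0

def channel_noise(p):
    """Binary entropy of the hit probability, in bits: 0 bits is a
    perfectly clean channel, 1 bit (p = 0.5) is pure noise."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Synthetic cyclical prices standing in for a real ticker.
prices = [math.sin(i / 5) + 10 for i in range(300)]
p = hit_rate(prices)
print(round(channel_noise(p), 2))
```

Running the same calculation over boom and bust sub-periods would test the hypothesis that speculative booms raise the channel noise.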

On to Content Analysis…

It was interesting to read about content analysis. I feel that content analysis is an important asset for the researcher’s toolchest. I am, however, concerned about the subjective nature of the coding decisions made by the researcher and the difficulties of constructing an “exhaustive and exclusive” classification scheme… blah, blah, blah.

For further discussion of content analysis limitations—and the limitations of research in general—please consult the archives of my postings. I’m getting the itch to move and actually start doing some research rather than criticizing and commenting on the validity of the various techniques.

Steps for Google…

The search engine wars are raging. Who will come out on top: Teoma, AllTheWeb, Vivisimo, AltaVista, or—the perennial favorite—Google? They are all pretty good. I can generally find what I’m looking for… it’s hard not to find what you’re looking for when you get 2,600,000 hits!

Recall is obviously not a problem with search engines but precision has become a huge issue. Despite their limitations, the search engines have already adopted many of the standard IR techniques like lexical analysis to treat punctuation, the elimination of stopwords, and text compression to improve performance. The search engines, however, are missing some of the major aspects of preprocessing identified by Baeza-Yates. They don’t, for example, seem to perform any stemming to determine word roots nor do they utilize a controlled vocabulary to improve precision. Although certain directory services such as the Google Directory, DMOZ, or Yahoo! provide certain aspects of controlled vocabularies, their categories are neither exhaustive nor exclusive like formally engineered thesauri and controlled vocabularies.
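For comparison, here is roughly what stopword removal plus a (very) crude stemmer looks like. This is nothing like a real Porter stemmer, just the shape of the idea:

```python
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is"}

def crude_stem(word):
    """A toy suffix-stripper; a real stemmer has many more rules."""
    for suffix in ("ing", "edly", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, strip punctuation, drop stopwords, stem the rest."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    return [crude_stem(t) for t in tokens if t and t not in STOPWORDS]

print(preprocess("Searching the indexes of engines"))  # ['search', 'index', 'engin']
```

Even this toy version shows why stemming helps recall: “searching” and “searches” collapse to the same index term.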

Where Google excels is in determining similarity between documents and queries. The PageRank method combines elements of the vector space model with bibliometric techniques. The similarity weight of various documents is adjusted by a “credibility” rating determined through co-citation-like hyperlinking patterns.
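The core of PageRank itself is straightforward to sketch as a power iteration over the link graph (the tiny “web” below is obviously hypothetical):

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> outbound links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outs in links.items():
            if not outs:  # dangling page: spread its rank evenly
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                for out in outs:
                    new[out] += damping * rank[page] / len(outs)
        rank = new
    return rank

# A tiny hypothetical web in which everyone links to 'hub'.
web = {"a": ["hub"], "b": ["hub"], "hub": ["a"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # hub
```

The bibliometric flavour is easy to see here: a page’s rank is essentially a weighted citation count, recursively applied.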

Although Google works well, it certainly could be improved. Google indexes documents automatically, and key concepts are often omitted from its descriptions. When using the OCLC interface for SocAbs, for example, the user can select from various document types such as journal articles or books. Other than allowing the user to filter result sets by format (e.g., PowerPoint, PDF, etc.), this functionality is missing from Google. In addition, Google provides no provision for determining the real validity or credibility of a document as determined through a formal peer review process.

Perhaps the one area where Google—and all of the other major search engines—could be improved is through the use of relevance feedback. Why am I not allowed to select a set of documents and then ask the engine to provide similar documents? As it is, Google only allows me to find similar documents based on a single document and not a set of documents.

Similarity searching based on a set of documents may also introduce another facility to Google. Since no formal controlled vocabulary is used, related terms are inherently invisible to the searcher. Google provides no means to expand a search with synonyms. Perhaps through analysis of the most important—or information-heavy—words in documents, the searcher could determine previously unknown keywords.
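A crude version of this term-suggestion idea: pool the words from a user-selected set of relevant documents and surface the frequent ones the query doesn’t already contain. This is a distant cousin of Rocchio-style feedback, and the documents below are invented:

```python
from collections import Counter

STOP = {"in", "and", "the", "of", "a"}

def expansion_terms(relevant_docs, query, k=3):
    """Suggest new query terms: the most frequent words across a set of
    relevant documents, minus stopwords and terms already in the query."""
    counts = Counter()
    for doc in relevant_docs:
        counts.update(w.lower() for w in doc.split() if w.lower() not in STOP)
    for term in query:
        counts.pop(term, None)
    return [term for term, _ in counts.most_common(k)]

docs = ["tempo rubato in romantic piano music",
        "rubato and phrasing in piano performance"]
print(expansion_terms(docs, query=["piano"], k=2))  # ['rubato', 'tempo']
```

A searcher who picked those two documents would be told about “rubato”, a keyword they may never have thought to type.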

RESEARCH THOUGHT: Salton and McGill introduce the SMART retrieval system. In their discussion, they provide a description of clustering methods and illustrate the ability to determine centroids for clusters. It would be interesting to compare the clusters resulting from automatic indexing to formal LCSH headings. One could use the OPAC as a sampling frame. Only book entries from the last four years (because they contain detailed abstracts, blurbs, and TOCs) would be considered. The sample could be drawn from particular LCC codes such as engineering or even Z (library and information science).
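The centroid comparison could be prototyped in a few lines: average the term-frequency vectors of the books under each heading, then see which centroid a new blurb falls closest to by cosine similarity. The blurbs and headings here are invented:

```python
import math
from collections import Counter

def tf_vector(text):
    """Raw term-frequency vector of a blurb."""
    return Counter(text.lower().split())

def centroid(vectors):
    """Average term-frequency vector of a cluster (Salton & McGill style)."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: count / len(vectors) for term, count in total.items()}

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical blurbs grouped under two assumed LCSH-like headings.
engineering = centroid([tf_vector("bridge steel load stress"),
                        tf_vector("concrete load bearing design")])
lis = centroid([tf_vector("library catalog indexing subject"),
                tf_vector("information retrieval indexing")])

new_book = tf_vector("subject indexing in the library catalog")
print("lis" if cosine(new_book, lis) > cosine(new_book, engineering)
      else "engineering")  # lis
```

Disagreements between the automatic assignment and the book’s actual LCSH heading would be exactly the interesting cases for the study.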