I often read journal articles and wonder if I’ll ever be able to produce a work of equivalent brilliance. This week I had no such problem. Indeed, I have to ask myself what the editors were thinking!
Singleton (1988) discusses threats to experimental validity. Two different articles provide some great insight into exactly what this means. Chui and Dillon give us Who’s Zooming Whom? (Chui & Dillon, 1997) while Grabe et al. offer Packaging Television News (Grabe, Zhou, Lang, & Bolls, 2000).
Maybe I should start with some general observations on experimental design. In Grabe et al. we find hypotheses that seem impossible to operationalize, unaccounted-for order effects, a boggling method statement, low statistical power, and the use of self-reported measures that are likely to reflect socio-economic biases and expected norms. Who isn’t going to claim that tabloids are trashy and less enjoyable?!? The authors used self-report measures when empirical testing devices are available. “Enjoyability”, for example, could have been operationalized using something like Paul Ekman’s Micro Expression Training Tool, or METT (see Zetter, 2003). The authors’ greatest gaffe, however, is the provision of meaningless details that shroud the research in a semblance of authority. Details such as the use of an AVID video editing station or “Beckman Ag/AgCl standard electrodes” seem out of place in an article that fails to explain how key concepts such as “Informativeness”, “Believability”, or “Enjoyability” are operationalized. I’m reminded of Kevin Siembieda’s comments in the instruction manual for the Robotech Role-Playing Game:
“In translating Japanese text about the Macross T.V. series, we unearthed a wealth of specific names and numbers for missiles, like the GH-30 or GA-95. It seems the Japanese love to name everything. Unfortunately, hard data… was limited…” (Siembieda, 1986, p. 37)
Chui and Dillon, however, give us something else. I’m amazed that they bothered to publish at all. They provide a litany of results and then append each observation with “but the results are not statistically significant.” My question is: why bother reporting them? Many of their results are actually far from significant (e.g., p=0.388 or p=0.626… yikes!). When they finally do run an ANOVA on their scanty sample, they find that much of the variance is accounted for by the level of experience users have with the Mac OS. Basically, they empirically validated that people who have developed expertise with a system get better results when using it than people who have not.
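The complaint about non-significant results is easy to make concrete. Here is a minimal sketch of a two-sample permutation test on entirely made-up “recall scores” (these are not either article’s data, and a permutation test is my choice of illustration, not the authors’ method): with a small sample and a tiny difference between group means, the p-value lands nowhere near the conventional 0.05 threshold.

```python
import random
import statistics

random.seed(42)

def permutation_test(a, b, n_perm=10_000):
    """Two-sided permutation test on the difference of group means."""
    observed = statistics.mean(a) - statistics.mean(b)
    pooled = a + b
    extreme = 0
    for _ in range(n_perm):
        random.shuffle(pooled)           # relabel scores at random
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if abs(diff) >= abs(observed):   # as extreme as what we saw?
            extreme += 1
    return extreme / n_perm

# Hypothetical recall scores, five participants per condition
tabloid  = [6.1, 5.8, 6.4, 5.9, 6.2]
standard = [5.9, 6.0, 6.3, 5.7, 6.1]

p = permutation_test(tabloid, standard)  # well above 0.05 for data like this
```

With ten participants and a between-group difference of less than a tenth of a point, almost any relabelling of the scores produces a difference at least that large, so the test has essentially no power to detect anything.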
As OGS and SSHRC deadlines approach I’m thinking of doing some research of my own. After some preliminary investigation I plan to conduct “post hoc” analysis to determine that children really do prefer ice cream to Brussels sprouts.
Chui, M., & Dillon, A. (1997). Who's zooming whom? Attunement to animation in the interface. Journal of the American Society for Information Science, 48(11), 1067-1072.
Grabe, M. E., Zhou, S., Lang, A., & Bolls, P. D. (2000). Packaging television news: The effects of tabloid on information processing and evaluative responses. Journal of Broadcasting & Electronic Media, 44(4), 581-598.
Siembieda, K. (1986). Robotech: The role-playing game. Detroit, MI: Palladium Books.
Singleton, R. (1988). Approaches to social research. New York: Oxford University Press.
Zetter, K. (2003, September 2). What a Half-Smile Really Means. Retrieved September 28, 2003, from http://www.wired.com/news/culture/0,1284,60232-2,00.html