Friday, December 03, 2004

Newsgroup response: SNA and IR for email corpus

Hi Kelly:

To answer your question: no, I haven't yet built an enterprise-class implementation of my proposal. Smaller models work, but my proposal has a few limitations:

1. People and organizations can be quite picky about email. While employers generally have the right to monitor their employees' email (although there have been a number of recent attempts to legislate informed consent, etc.), few companies actually like to do it, and few organizations like tampering with their email servers, since those servers are the communication lifeblood of the organization.

2. One of the limitations of my proposal lies in the structure of social networks. While combining IR and SNA may allow one to determine networks or communities related to particular key words, this information is of dubious value for two reasons. The first is IR-related: distinct vocabulary is an important feature of homophilous social groups. An outsider, therefore, will not have access to the vocabulary actually used by members of a particular community. The second is SNA-related. With social networks, one is generally concerned with the bridges between distinct communities rather than the links within a community. From an information theory perspective, a group of like-minded individuals is likely to produce relatively homogeneous information, so the value of that information is low. More important is the novel information that comes from other communities through "weak ties". Operationalizing the concept of "weak ties" in a quantitative manner is exceptionally difficult.

3. Interpreting the representation of a social network is also quite difficult. Social networks are generally represented as node-and-link diagrams. These diagrams can be generated through a variety of methods, such as multi-dimensional scaling or spring-embedder algorithms; however, divining what the diagrams actually mean can be quite complicated. The diagrams can be used to test or develop hypotheses, but they likely wouldn't support a process of general sense-making for the average user.

4. Newsgroups--while very convenient for research--aren't nearly as applicable for SNA as email. Email is practically a ubiquitous tool within many organizations, while newsgroups are somewhat novel backwaters of information. Email is an essential component of the ongoing information flow of an organization while newsgroups are... well, not. In a Heideggerian sense, email is ready-to-hand while newsgroups are present-at-hand: useful but requiring conscious thought in use.
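
To make point 2 a little more concrete, here is a minimal sketch of the IR-plus-SNA idea: filter messages by a key word, link the people who exchanged them, and look for the bridges between clusters. Everything in it is an assumption for illustration: the toy corpus, the names, and the helper functions keyword_graph and bridges are all hypothetical, and substring matching stands in for real IR. A structural bridge is also only a crude proxy. It says nothing about tie strength, which is exactly the part of "weak ties" that is hard to operationalize.

```python
from collections import defaultdict

# Hypothetical toy corpus: (sender, recipient, message body).
MESSAGES = [
    ("ann", "bob", "budget forecast for Q3"),
    ("bob", "cat", "re: budget forecast"),
    ("cat", "ann", "budget sign-off"),
    ("cat", "dan", "budget question from engineering"),
    ("dan", "eve", "build server budget"),
    ("eve", "fay", "budget for new hardware"),
    ("fay", "dan", "re: hardware budget"),
    ("ann", "eve", "lunch on Friday?"),
]


def keyword_graph(messages, keyword):
    """Undirected graph of who exchanged mail containing `keyword`."""
    graph = defaultdict(set)
    for sender, recipient, body in messages:
        if keyword.lower() in body.lower():
            graph[sender].add(recipient)
            graph[recipient].add(sender)
    return graph


def bridges(graph):
    """Edges whose removal disconnects the graph (DFS low-link method)."""
    order, low, found = {}, {}, []

    def visit(node, parent):
        order[node] = low[node] = len(order)
        for nxt in graph[node]:
            if nxt == parent:
                continue
            if nxt in order:
                # Back edge: node can reach an earlier-visited node.
                low[node] = min(low[node], order[nxt])
            else:
                visit(nxt, node)
                low[node] = min(low[node], low[nxt])
                # Nothing below nxt reaches node or above: a bridge.
                if low[nxt] > order[node]:
                    found.append(tuple(sorted((node, nxt))))

    for node in list(graph):
        if node not in order:
            visit(node, None)
    return sorted(found)


graph = keyword_graph(MESSAGES, "budget")
print(bridges(graph))  # [('cat', 'dan')]
```

In this toy data the key word "budget" yields two tight triangles joined by a single link between cat and dan. Note that the key word was chosen by an outsider (me), which is the IR limitation in miniature: a community that calls its budget something else would simply not appear.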

From a macro-sociological perspective, we can borrow some concepts from Giddens's theory of structuration and Yates and Orlikowski's notion of genres. In structuration we both shape and are shaped by our social environments through interaction with them. Genres--such as email--play a crucial role in structuration through their limitations and affordances. Email--being ready-to-hand--plays a more important role in the development of social environments (and social networks) than newsgroups. Therefore, in order to understand social environments through SNA, email is the best choice.

It could be argued that newsgroups also have a role in structuration. I suspect that their role is considerably different from that of the mundane email message communicating basic information about ongoing daily concerns. Participation in newsgroups--by no means a ubiquitous activity--probably has more to do with a Maussian gift economy. Hence, involvement is more about personal advantage and self-glorification than about the information exchange of email. Why else would I feel the need to demonstrate my knowledge of obscure sociological theory!?

These are just some thoughts. SNA can be a powerful tool. I suspect that it's a better tool for discovering questions than creating answers--caveat emptor.



