Facetation: 10/01/2006

Friday, October 06, 2006

I’ve recently completed a quick study to explore the nature of technical handbooks. The purpose of the study was to start to unpack the meaning of the word “handbook.” This genre is particularly immune to critical study despite its continued importance for professionals such as architects and engineers.

There are various definitions for handbooks. My main concern is not how librarians or publishers define and describe them, but rather how actual users describe them. One source of information is published reviews of different handbooks. These reviews are typically written by engineers and other technical professionals to be read by their colleagues.

Sample

This study focused on the contents of reviews of technical handbooks. The sampling frame consisted of all book reviews available in the ProQuest ABI/Inform Trade and Industry database. The sample consisted of reviews that contained the keywords “handbook” and “engineer” and that explicitly reviewed a technical handbook intended for engineers or other technical professionals.

A total of 188 reviews were selected for analysis. The shortest review is 47 words. The longest is over 1,300 words. The average review length is 350 words. The reviews represent a wide range of technical handbooks, such as:

Composite Materials Handbook
Concrete Technology
EMI/EMC Computational Modeling Handbook
Handbook of Computer Simulation in Radio Engineering, Communications and Radar
Handbook of Engineering Electromagnetics
Handbook of Filter Media
Handbook of Material Weathering
Handbook of Polypropylene and Polypropylene Composites
Handbook of Powder Science and Technology
Materials Handbook: A Concise Desktop Reference
McGraw-Hill Machining and Metalworking Handbook
Perry’s Chemical Engineers’ Handbook
The Pilot Plant Real Book: A Unique Handbook for the Chemical Process Industry
Standard Handbook of Consulting Engineering Practice

These reviews come from a variety of different trade journals, such as:

Chemical Engineering Progress
Civil Engineering: Magazine of the South African Institution of Civil Engineering
Electrical Apparatus
Electromagnetic News Report
Information Intelligence Online Newsletter
Mechanical Engineering
Microwave Journal
Modern Machine Shop
Pit and Quarry
Plastics Engineering

Analysis

I saved a digital version of each review and removed all header material. I then processed the reviews using TextSTAT, a text analysis program available from the Free University of Berlin (http://www.niederlandistik.fu-berlin.de/textstat/software-en.html). The TextSTAT analysis yielded raw frequency counts for each word that appeared in the sample corpus.
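The counting step can be approximated in a few lines of Python. This is only a minimal sketch of the idea, not a reproduction of what TextSTAT does internally: the function name count_words, the tokenizing pattern, and the decision to preserve capitalization (so that terms like “CDMA” and “McGraw-Hill” stay distinct) are all my own assumptions.

```python
import re
from collections import Counter

# Assumed token pattern: a letter or digit followed by letters, digits,
# apostrophes, slashes are excluded, hyphens are kept (e.g. "scale-up").
TOKEN = re.compile(r"[A-Za-z0-9][A-Za-z0-9'-]*")

def count_words(texts):
    """Return raw frequency counts over an iterable of review texts."""
    counts = Counter()
    for text in texts:
        # Case is preserved here; lowercase the text first if case should be ignored.
        counts.update(TOKEN.findall(text))
    return counts

# Two made-up sentences standing in for the review corpus.
reviews = [
    "This handbook covers piping.",
    "A handbook for engineers; piping and fluid flow.",
]
counts = count_words(reviews)
print(counts.most_common(2))
```

Summing counts.values() gives the corpus size against which the counts are later normalized.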

Raw counts alone say little. A word may occur frequently in the corpus simply because it is common in English, so each word’s corpus frequency must be weighed against its overall prevalence in the language. The value I explore is the base-10 log of the ratio of a word’s corpus frequency to its frequency in written English, with both frequencies expressed per million words:

G = log10(frequency_corpus / frequency_language)

To determine English language word frequencies, I used the frequency lists available on the companion website for Leech, Rayson, and Wilson’s Word Frequencies in Written and Spoken English (http://www.comp.lancs.ac.uk/ucrel/bncfreq/). I calculated G values for the 840 terms that appeared more than twice in the entire corpus. Of these terms, 62 are very uncommon in written English (i.e., less than once in a million words). Leech, Rayson, and Wilson list the frequency for these terms as 0, making a G calculation impossible.
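The G calculation described above can be sketched as follows. I am assuming a base-10 logarithm and that the corpus count is first scaled to a per-million rate using the corpus size of 65,501 words given in the table headers; both assumptions reproduce the tabulated values for “piping” and “he.” The function name g_value is hypothetical, and returning None for a zero reference frequency is just one way to flag the terms for which G cannot be computed.

```python
import math

CORPUS_SIZE = 65_501  # total words in the review corpus (from the table headers)

def g_value(corpus_count, english_per_million, corpus_size=CORPUS_SIZE):
    """Log ratio of corpus frequency to written-English frequency.

    Returns None when the reference frequency is 0, because the ratio
    (and therefore G) is undefined for those terms.
    """
    if english_per_million <= 0:
        return None
    corpus_per_million = corpus_count / corpus_size * 1_000_000
    return math.log10(corpus_per_million / english_per_million)

print(round(g_value(38, 1), 2))     # piping -> 2.76
print(round(g_value(7, 8470), 2))   # he -> -1.9
print(g_value(55, 0))               # Brookfield -> None
```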

Results

My analysis included 840 terms. The G values for these terms ranged from a high of 2.76 to a low of -1.90. The values were approximately normally distributed (mean = 0.64, SD = 0.77) with minimal distortion (kurtosis = -0.16, skewness = 0.22).
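For readers who want to run the same normality check on their own G values, here is a rough sketch. It uses the plain moment-based definitions of skewness and excess kurtosis (so a normal distribution scores 0 on both). That choice is an assumption: statistical packages often apply small-sample corrections, and the figures reported above may have been computed that way. The function name shape_stats is hypothetical.

```python
from statistics import fmean

def shape_stats(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # variance (population form)
    m3 = sum((x - mu) ** 3 for x in values) / n
    m4 = sum((x - mu) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis

# Symmetric toy data: skewness is 0 and the flat shape gives negative kurtosis.
skew, kurt = shape_stats([-2, -1, 0, 1, 2])
print(skew, round(kurt, 2))
```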

I isolated three groups of terms that offer some insight into how technical handbooks are understood by their audiences. The first is the list of terms that appear frequently in the corpus yet have sub-threshold frequencies in written English. The second consists of terms that appear unexpectedly frequently, defined as a G value more than two standard deviations above the mean; I determined the cutoff to be 2.17. The third consists of words that appear unexpectedly infrequently, defined as a G value more than two standard deviations below the mean (-0.89). The difference in the lengths of lists two and three is due to the skew of the distribution.
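The two-standard-deviation rule can be sketched like this. The function names cutoffs and split_terms are hypothetical, and the use of the sample standard deviation is an assumption. Note that plugging in the rounded mean and SD gives cutoffs of -0.90 and 2.18; the slightly different values quoted above (-0.89 and 2.17) presumably come from the unrounded statistics.

```python
from statistics import fmean, stdev

def cutoffs(mu, sd, z=2.0):
    """Return (lower, upper) G cutoffs at z standard deviations from the mean."""
    return mu - z * sd, mu + z * sd

def split_terms(g_values, z=2.0):
    """Split a {term: G} mapping into unexpectedly frequent and infrequent terms.

    Both lists are sorted from highest G to lowest.
    """
    values = list(g_values.values())
    lower, upper = cutoffs(fmean(values), stdev(values), z)
    high = sorted((t for t, g in g_values.items() if g > upper),
                  key=g_values.get, reverse=True)
    low = sorted((t for t, g in g_values.items() if g < lower),
                 key=g_values.get, reverse=True)
    return high, low

lower, upper = cutoffs(0.64, 0.77)
print(round(lower, 2), round(upper, 2))  # -0.9 2.18
```

Calling split_terms on the full mapping of terms to G values would produce the second and third lists in one pass.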

List 1: Terms that appear frequently in the corpus but have sub-threshold frequencies in written English

Term	Corpus Count (per 65,501 words)
Brookfield	55
McGraw-Hill	35
modeling	28
parts	27
hardcover	25
molding	25
CDMA	24
Wiley	23
cryogenic	22
fibers	20
Dekker	20
Grossel	20
media	19
ASM	19
ASTM	17
Boca	16
EMC	15
models	15
molds	15
nondestructive	15
polymerization	15
made	14
RF	14
e-mail	12
flowmeters	12
SPE	12
Artech	11
CA	11
optimization	10
ASTs	10
dryers	10
machining	10
sizing	10
UST	10
ASME	9
designing	9
steels	9
USTs	9
elastomers	8
OnDisc	8
AIChE	7
exchangers	7
extrusion	7
FSU	7
troubleshooting	7
pertaining	6
actuators	6
Bashore	6
Begell	6
CNC	6
coupling	6
criteria	6
Elsevier	6
Gating	6
IS-95	6
Loctite	6
molded	6
practicing	6
scale-up	6
wastewater	6

List 2: Terms where Gz > 2 (G more than two standard deviations above the mean)

Term	Corpus Count (per 65,501 words)	English Frequency (per Million words)	G
handbook	517	6	3.16
piping	38	1	2.76
fluid	35	1	2.73
manufacturing	61	2	2.67
bookshelf	27	1	2.62
composites	27	1	2.62
CRC	25	1	2.58
anonymous	224	12	2.45
appendices	16	1	2.39
covered	45	3	2.36
terms	15	1	2.36
mechanical	297	20	2.36
combustion	44	3	2.35
electromagnetic	44	3	2.35
corrosion	43	3	2.34
academia	14	1	2.33
filler	32	3	2.21
engineering	533	51	2.20
mixing	62	6	2.20
sensors	20	2	2.18
according	10	1	2.18
polypropylene	10	1	2.18
studies	10	1	2.18

List 3: Terms where Gz < -2 (G more than two standard deviations below the mean)

Term	Corpus Count (per 65,501 words)	English Frequency (per Million words)	G
still	6	749	-0.91
have	98	13,655	-0.96
just	9	1,296	-0.97
know	13	1,883	-0.98
very	7	1,230	-1.06
when	8	2,143	-1.24
with	17	6,575	-1.40
his	7	4,334	-1.61
he	7	8,470	-1.90
