Teaching Bits: A Resource for Teachers of Statistics

Journal of Statistics Education v.5, n.3 (1997)

Robert C. delMas
General College
University of Minnesota
333 Appleby Hall
Minneapolis, MN 55455


William P. Peterson
Department of Mathematics and Computer Science
Middlebury College
Middlebury, VT 05753-6145


This column features "bits" of information sampled from a variety of sources that may be of interest to teachers of statistics. Bob abstracts information from the literature on teaching and learning statistics, while Bill summarizes articles from the news and other media that may be used with students to provoke discussions or serve as a basis for classroom activities or student projects. We realize that, given the limits of the literature we have access to and the time we have to review it, we may overlook some potential articles for this column; we therefore encourage you to send us your reviews and suggestions for abstracts.

From the Literature on Teaching and Learning Statistics

Research on the Role of Technology in Teaching and Learning Statistics

eds. Joan B. Garfield and Gail Burrill (1997). Proceedings of the 1996 IASE Roundtable Conference, University of Granada, Spain, July 23-27, 1996. Voorburg, The Netherlands: International Statistical Institute. ISBN 90-73592-12-7.

This volume presents the papers of 36 researchers from 13 different countries and 6 continents who met in Granada, Spain, for one week to present papers, view software demonstrations, and discuss issues regarding the use of technology in teaching statistics. Each paper fell into one of five theme areas:

  1. How Technology is Changing the Teaching of Statistics at the Secondary Level

  2. Developing Exemplary Software

  3. What We are Learning from Empirical Research

  4. How Technology is Changing the Teaching of Statistics at the College Level

  5. Questions to be Addressed on the Role of Technology in Statistics Education

Over the five-day period, four broad issues emerged which are addressed by discussion papers presented at the end of each theme area: the need for information on existing software, the changing role of the classroom teacher, the need for good assessment instruments, and directions for future research.

Papers on Statistical Education

ed. Brian Phillips (1997). Papers presented at ICME-8, Seville, Spain, July 14-21, 1996. Swinburne University of Technology, Australia. ISBN 0-85590-753-3.

This is a collection of 16 papers from ICME-8 that are related specifically to statistics education. The first paper is a plenary lecture by David S. Moore titled "New Pedagogy and New Content: The Case of Statistics," as reported by Brian Phillips. The next 12 papers come from Topic Group 9, Statistics and Probability at the Secondary Level, and the final three papers are from the working group on Linking Mathematics with Other Subjects.

Copies can be obtained for $25 (AUS) or $20 (US) by contacting:

Brian Phillips
School of Mathematical Sciences
Swinburne University of Technology
PO Box 218 Hawthorn 3122
Victoria, Australia
e-mail: bphillips@swin.edu.au

"Mathematics, Statistics, and Teaching"

by George W. Cobb and David S. Moore (1997). The American Mathematical Monthly, 104(9), 801-824.

The authors address several questions regarding the role of mathematics in statistics instruction: How does statistical thinking differ from mathematical thinking? What is the role of mathematics in statistics? If you purge statistics of its mathematical content, what intellectual substance remains? The article provides an overview of statistical thinking, contrasts statistics instruction with mathematics instruction, and emphasizes that statistics should be taught as statistics.

International Statistical Review

Volume 65(2) of the International Statistical Review (1997) contains an article by David S. Moore that addresses pedagogy and content in statistics instruction. The article is followed by discussions by noted statistics instructors and members of the private sector. The discussions are followed by a response from Moore.

"New Pedagogy and New Content: The Case of Statistics"

by David S. Moore (1997). International Statistical Review, 65(2), 123-137.

Author's abstract: Statistical education now takes place in a new social context. It is influenced by a movement to reform the teaching of the mathematical sciences in general. At the same time, the changing nature of our discipline demands revised content for introductory instruction, and technology strongly influences both what we teach and how we teach. The case for substantial change in statistics instruction is built on strong synergies between content, pedagogy, and technology. Statisticians who teach beginners should become more familiar with research on teaching and learning and with changes in educational technology. The spirit of contemporary introductions to statistics should be very different from the traditional emphasis on lectures and on probability and inference.


Response by David S. Moore, pp. 162-165.

"Software for Learning and for Doing Statistics"

by Rolf Biehler (1997). International Statistical Review, 65(2), 167-189.

Author's abstract: The community of statisticians and statistics educators should take responsibility for the evaluation and improvement of software quality from the perspective of education. The paper will develop a perspective, an ideal system of requirements to critically evaluate existing software and to produce future software more adequate both for learning and doing statistics in introductory courses. Different kinds of tools and microworlds are needed. After discussing general requirements for such programs, a prototypical ideal software system will be presented in detail. It will be illustrated how such a system could be used to construct learning environments and to support elementary data analysis with exploratory working style.

The American Statistician: Teacher's Corner

The August 1997 issue of The American Statistician presents three papers from the Joint Statistical Meetings in Chicago, August 1996, that address the advantages, disadvantages, rationale, and methods of teaching introductory statistics from a Bayesian perspective. The three papers are followed by discussion and a reply from each of the three authors. This is very interesting reading for anyone who teaches with a Bayesian approach or may want to do so in the future.

"Teaching Elementary Bayesian Statistics with Real Applications in Science"

by Donald A. Berry (1997). The American Statistician, 51(3), 241-246.

University courses in elementary statistics are usually taught from a frequentist perspective. The paper suggests how such courses can be taught using a Bayesian approach and indicates why beginning students are well served by a Bayesian course.

"Teaching Bayes' Rule: A Data-Oriented Approach"

by Jim Albert (1997). The American Statistician, 51(3), 247-253.

There is a current emphasis on making the introductory statistics class more data-oriented to motivate probability distributions. However, difficulties remain in communicating the basic tenets of traditional statistical procedures such as confidence intervals and hypothesis tests. Two Bayesian approaches are introduced aimed at helping students understand the relationship between models and data. The Bayesian methods are contrasted with simulation methods.

"Bayes for Beginners? Some Reasons to Hesitate"

by David S. Moore (1997). The American Statistician, 51(3), 254-261.

The author asks, "Is it reasonable to teach the ideas and methods of Bayesian inference in a first statistics course for general students?" This paper argues that it is premature to do so for a variety of reasons.


Individual replies by the three authors are on pages 270-274.

Journal of Educational and Behavioral Statistics: Teacher's Corner

"Moving Between Hierarchical Modeling Notations"

by John Ferron (1997). Journal of Educational and Behavioral Statistics, 22(1), 119-123.

Author's abstract: Students studying hierarchical models are often confronted with multiple notational representations. The purpose of this note is to illustrate the relationship between HLM notation and mixed model notation. This is accomplished by explicitly mapping the parameters across notations for a concrete example involving the hierarchical modeling of change.
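As a sketch of the kind of mapping Ferron describes, consider a standard two-level model of change (the notation here is ours; t indexes measurement occasions and i indexes persons):

```latex
\begin{align*}
\text{Level 1 (HLM):}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,t + e_{ti}\\
\text{Level 2 (HLM):}\quad & \pi_{0i} = \beta_{00} + r_{0i},
                     \qquad \pi_{1i} = \beta_{10} + r_{1i}\\
\text{Mixed model:}\quad & Y_{ti}
   = \underbrace{\beta_{00} + \beta_{10}\,t}_{\text{fixed effects}}
   + \underbrace{r_{0i} + r_{1i}\,t + e_{ti}}_{\text{random effects}}
\end{align*}
```

Substituting the level-2 equations into level 1 produces the single mixed-model equation, so the HLM growth parameters map one-to-one onto the fixed and random effects of the mixed-model notation.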

"A Note on Interpretation of the Paired-Samples t Test"

by David W. Zimmerman (1997). Journal of Educational and Behavioral Statistics, 22(3), 349-360.

Author's abstract: Explanations of advantages and disadvantages of paired-samples experimental designs in textbooks in education and psychology frequently overlook the change in the Type I error probability which occurs when an independent-samples t test is performed on correlated observations. This alteration of the significance level can be extreme even if the correlation is small. By comparison, the loss of power of the paired-samples t test on difference scores due to reduction of degrees of freedom, which typically is emphasized, is relatively slight. Although paired-samples designs are appropriate and widely used when there is a natural correspondence or pairing of scores, researchers have not often considered the implications of undetected correlation between supposedly independent samples in the absence of explicit pairing.
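Zimmerman's point is easy to demonstrate by simulation. The sketch below (the function name, sample sizes, and the normal approximation to the t cutoff are our choices) applies a naive independent-samples test to correlated pairs whose true means are equal, so every rejection is a Type I error. Positive correlation makes the naive test conservative; negative correlation inflates its Type I error rate.

```python
import math
import random

def type1_rate(rho, n=50, reps=4000, seed=1):
    """Fraction of simulations in which a naive independent-samples
    z/t test (alpha = .05) rejects, when X and Y are really correlated
    pairs with identical means (so every rejection is a Type I error)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        xs, ys = [], []
        for _ in range(n):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            xs.append(z1)
            ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        mx, my = sum(xs) / n, sum(ys) / n
        vx = sum((x - mx) ** 2 for x in xs) / (n - 1)
        vy = sum((y - my) ** 2 for y in ys) / (n - 1)
        se = math.sqrt(vx / n + vy / n)  # standard error assuming independence
        if abs(mx - my) / se > 1.96:     # normal approximation to the t cutoff
            rejections += 1
    return rejections / reps

r_zero = type1_rate(0.0)    # independent samples: near the nominal .05
r_pos = type1_rate(0.5)     # positive correlation: far below .05 (conservative)
r_neg = type1_rate(-0.5)    # negative correlation: well above .05 (inflated)
print(r_zero, r_pos, r_neg)
```

With rho = 0.5 the true standard deviation of the mean difference shrinks by a factor of sqrt(1 - rho), so the naive statistic is badly overdispersed relative to its assumed null distribution, which is exactly the alteration of the significance level the abstract describes.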

Teaching Statistics

A regular component of the Teaching Bits Department is a list of articles from Teaching Statistics, an international journal based in England. Brief summaries of the articles are included. In addition to these articles, Teaching Statistics features several regular departments that may be of interest, including Computing Corner, Curriculum Matters, Data Bank, Historical Perspective, Practical Activities, Problem Page, Project Parade, Research Report, Book Reviews, and News and Notes.

The Circulation Manager of Teaching Statistics is Peter Holmes, ph@maths.nott.ac.uk, RSS Centre for Statistical Education, University of Nottingham, Nottingham NG7 2RD, England.

Teaching Statistics, Autumn 1997
Volume 19, Number 3

"Earth's Surface Water Percentage" by Roger Johnson

A hands-on activity is used to illustrate standard point and interval estimates of a proportion. An inflatable globe is tossed around the classroom, and the student who catches it notes whether his or her right index finger falls on water or land. The water and land counts are tallied and used to estimate the proportion of the earth that is covered by water. Since the true proportion is more or less known, the accuracy of these estimates can be assessed. A hypothesis test associated with the interval estimate is also given.
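The estimates in the activity amount to a few lines of arithmetic. This sketch uses the standard Wald interval taught in introductory courses; the 23-of-32 class tally is an illustrative assumption, not data from the article.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Point estimate and approximate 95% confidence interval for a
    proportion (the standard Wald interval from intro courses)."""
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, (p_hat - half, p_hat + half)

# Hypothetical class result: 23 "water" catches out of 32 tosses.
p_hat, (lo, hi) = proportion_ci(23, 32)
print(round(p_hat, 3), round(lo, 3), round(hi, 3))
```

The true value (about .71 of the earth's surface) can then be checked against the interval, which is the point of the activity.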

"The Binomial and Poisson Distributions" by Paul Hutchinson

The author provides an example that can be used to help students compare and contrast the binomial and Poisson distributions.

"An Intuitive Test of Structural Change" by Joseph Eisenhauer

The author suggests a simple, intuitive approach to testing for structural changes in time series data. A Student's t-test applied retrospectively to forecast errors can signal a structural break in the time series. The procedure is illustrated using historical data from a recent issue of Teaching Statistics.

"Demonstrating Variance using the Muller-Lyer Illusion" by David J. Krus and James M. Webb

The Muller-Lyer Illusion is used to demonstrate that variability is inherent in both experimental design and observation. The authors illustrate how experience with the illusion can help students contrast variation due to experimental design with residual variation that is beyond experimental control.

"Statistics for the Terrified, Version 3.0" by Neville Hunt

The author reviews Statistics for the Terrified (v. 3.0), a piece of CAL software aimed at students from non-mathematical backgrounds who want to apply statistics in their chosen field. The author, hoping that the software might provide a lifeline for students struggling with statistics, finds little to recommend the software as either primary or supporting material in introductory statistics.

"Anomalous Sports Performance" by Robert J. Quinn

The article discusses a lesson in which the author uses anomalous performances in sports to motivate exploration of the binomial probability distribution.

"On a Patrol Problem" by Mohammed Ageel

The article describes the application of probability theory and computer simulation techniques to a highway patrol problem.

Topics for Discussion from Current Newspapers and Journals

"The Best Places to Live Today"

by Carl Fried with Jeanhee Kim and Amanda Walmac. Money, July 1997, pp. 133-157.

"Why Money Magazine's 'Best Places' Keep Changing"

by Thomas M. Guterbock. Public Opinion Quarterly, 61(2), 339-355.

"The Ratings Game"

by Thomas M. Guterbock. Philadelphia Inquirer, 8 July 1997, A13.

Places Rated Almanac

by David Savageau and Geoffrey Loftus (1997). Riverside, NJ: Macmillan.

This is the 11th year Money has provided a ranking of the country's 300 largest metropolitan areas. Here are 1997's top ten (numbers in parentheses are their rankings from 1996 and 1995):

  1. Nashua, New Hampshire (42, 19)

  2. Rochester, Minnesota (3, 2)

  3. Monmouth/Ocean counties, New Jersey (38, 167)

  4. Punta Gorda, Florida (2, 61)

  5. Portsmouth, New Hampshire (44, 119)

  6. Manchester, New Hampshire (50, 12)

  7. Madison, Wisconsin (1, 16)

  8. San Jose, California (19, 44)

  9. Jacksonville, Florida (20, 3)

  10. Fort Walton Beach, Florida (18, 28)

The annual report receives wide coverage in the press. For example, about 200 newspapers had articles related to this year's study. Not surprisingly, the majority are from New Hampshire papers. Wisconsin papers are also well represented, claiming that the first place rating received by Madison, Wisconsin, last year is still quite valid. One California paper's writer wondered what the weather is like in the winter in Nashua.

In the Public Opinion Quarterly article, Thomas Guterbock critiques the Money 1996 survey from the point of view of a professional survey researcher. He is particularly concerned that the ratings of the cities vary so much from year to year. He notes, for example, that from 1995 to 1996 Norfolk/Virginia Beach jumped from 283 to 117, and Monmouth, NJ, jumped from 167 to 38, while Benton Harbor, MI, dropped from 47 to 249. The correlation between the ranks from 1995 to 1996 was only 0.74.

The Places Rated Almanac also provides a ranking of 351 metropolitan areas in the United States and Canada, updated every four years. The correlation between the ranks for 1989 and 1993 was 0.90, a substantially higher correlation over four years than that observed for Money over one year. The Places Rated Almanac top ten list for 1997, shown below, looks quite a bit different from Money's:

  1. Orange County, California

  2. Seattle-Bellevue-Everett, Washington

  3. Houston, Texas

  4. Washington, DC

  5. Phoenix-Mesa, Arizona

  6. Minneapolis and St. Paul, Minnesota

  7. Atlanta, Georgia

  8. Fort Lauderdale, Florida

  9. San Diego, California

  10. Philadelphia, Pennsylvania

Both Money and Places Rated Almanac base their final ratings on nine demographic traits of a city, technically called "factors." For Money, these are: Economy, Health, Crime, Housing, Education, Weather, Transit, Leisure, Arts. The Places Rated Almanac factors are similar: Cost of Living, Transportation, Jobs, Higher Education, Climate, Crime, The Arts, Health Care and Recreation.

Based on government and private statistical sources, each city is assigned a score between 0 and 100 for each of the nine factors. These scores are attempts to give "objective" ratings for each of the factors. Money assigns Nashua, NH, the following ratings: Economy 92, Health 79, Crime 81, Housing 36, Education 31, Weather 20, Transit 12, Leisure 44, Arts 82. The details of how these scores are combined into a final rating are considered proprietary information by Money.

By contrast, Places Rated Almanac spells out for readers how its ratings were determined. For Nashua, NH, it gave the following scores: Cost of Living 13, Transportation 42, Jobs 33, Education 64, Climate 32, Crime 96, Arts 47, Health Care 46, Recreation 30. The lower scores on economic and health factors resulted in Nashua's being ranked 221 in the final ranking. Places Rated Almanac weights its nine factors equally to arrive at its final ratings. Readers are invited to determine their personal rating by filling out a questionnaire provided in the almanac.
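One simple reading of equal weighting is that the composite score is just the mean of the nine factor scores. The sketch below applies that reading to the Nashua scores quoted above; it is illustrative only, since the almanac's actual final rankings involve comparing all 351 metropolitan areas.

```python
# Nashua, NH factor scores from Places Rated Almanac (as listed above).
scores = {
    "Cost of Living": 13, "Transportation": 42, "Jobs": 33,
    "Education": 64, "Climate": 32, "Crime": 96,
    "Arts": 47, "Health Care": 46, "Recreation": 30,
}

# Equal weighting: the composite is simply the mean of the nine scores.
composite = sum(scores.values()) / len(scores)
print(round(composite, 1))  # prints 44.8
```

Re-running the same average with a reader's own weights in place of the implicit 1/9 for each factor is exactly the personalized rating the questionnaire supports.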

Money tries to determine a weighting that it feels reflects the preferences of its readers. This is accomplished by a reader poll that rates the importance of 41 subfactors of the basic nine factors. For example, low property taxes, inexpensive living, and a low unemployment rate are subfactors of the economy factor. This year the magazine sampled 502 readers, who were asked to rate each of these 41 subfactors on a scale of 1 to 10 (10 = most important). But again, the process is considered proprietary.

Guterbock tried to make some intelligent guesses about the missing proprietary information in order to deduce the source of the year-to-year variability. He first showed that sampling error could not reasonably account for all of the variation. This led him to suspect that certain particularly volatile factors or subfactors were being heavily weighted. Looking more carefully at the subfactors, he found that Money appeared to give the economy factor about twice as much weight as the health factor. Since the economy factor is in general more volatile than the others, Guterbock concluded that it drives the volatility in Money's ratings.

In his Philadelphia Inquirer column, Guterbock observes that Money has invited its readers to go to their Web site (www.money.com) where they can make their own weighting of the nine basic factors to get their own personal ratings. This gave him the opportunity to see what the results of the poll would have been if the factor weights were determined simply as the average of the subfactor ratings. The result is a rating of the cities that is very different from that obtained by Money. For example, Rochester, Minnesota, is first, and Nashua is not even in the first ten. Washington, which was 162 in the original rating, becomes second.

These stories have interesting parallels with the annual college ratings, which are widely watched by students and their parents. These ratings have also come under fire for unreasonable volatility and for creating a "horse race" mentality.

"The Kind of Sweep That is Hard to Come By"

by Murray Chass. The New York Times, 22 July 1997, B9.

Chass reports that the Mets swept a four-game series with the Reds. He views this as a notable achievement because it was only the 12th sweep of a four-game series between two teams out of a total of 79 such series so far in the season. The Mets joined three other teams who had swept two series. Four teams had swept a single series, and the remaining twenty teams had no sweeps.

Chass contrasts this with the apparently easier feat of compiling a four-game winning streak, which had already occurred 85 times in the season. He goes so far as to suggest that the two teams that had not managed such a streak (the Phillies and the Athletics) are perhaps not fit for the major leagues!

Dr. Mitchell Laks provided a statistical response to Chass in a letter to the editor of The New York Times on July 24.

I would like to point out that, statistically speaking, all of the occurrences that Mr. Chass cites in his article are more likely indicative of chance phenomena rather than reflective of any particular prowess of the teams involved. In fact, if we were to replace all of the sports teams and games involved by coins and coin flips, substituting the 50% probability of heads or tails for wins and losses, then the exact phenomena cited by Mr. Chass would be reproduced as the expected outcome.

Thus, since in a series of 4 coin flips the probability of the outcome of "either all heads or all tails" is 1/8, it would be expected that in approximately 10 of the 79 four game series there would be a sweep (close to the 12 observed). Mr. Chass records also that 37 times this year the result of a 4-game series was 3-1 and 30 times the result was 2-2. In fact the corresponding expected numbers for random coin flips would be 40 and 30 times. Additionally, the observed distribution of 4 teams with two sweeps, 4 teams with one sweep and 20 teams with no sweeps is close to that predicted by the Poisson distribution governing such phenomena (2, 8, and 18 teams, respectively).

Moreover, on the date of Mr. Chass' article, the average number of games played by each of the major league teams was approximately 97. For this number of games for each team, it can be proven that the expected number of winning streaks of 4 games or greater is approximately 3. (The exact formula for the expected number of 4 game or more winning streaks over N games is (N - 2)/32; thus, for example, for a full season of 162 games we would expect an average of 5 such winning streaks). Thus for the 28 major league teams, we would expect 28 x 3 = 84 streaks, remarkably close to the 85 observed to date. In fact, using the Poisson distribution again, we find that approximately 2 of the 28 teams are expected to have no streaks of at least 4 games, accounting for the bad luck of the Phillies and Athletics.
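Laks' arithmetic can be reproduced in a few lines. The (N - 2)/32 formula follows because a streak of four or more wins begins at game 1 with probability 1/16, or immediately after a loss at any of the N - 4 later starting positions with probability 1/32 each. For the sweep counts per team, we read his Poisson rate as the 12 observed sweeps spread over 28 teams; that assumption is ours, but it reproduces his predicted (2, 8, 18) split.

```python
import math

def poisson_pmf(lam, k):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# A four-game series between evenly matched teams is four fair coin flips.
series, teams, games = 79, 28, 97
p_sweep = 2 / 16   # 4-0, either team
p_3_1 = 8 / 16     # 3-1, either team
p_2_2 = 6 / 16
print(series * p_sweep, series * p_3_1, series * p_2_2)  # roughly 10, 40, 30

# Spread the 12 observed sweeps over 28 teams and apply the Poisson model.
lam = 12 / teams
expected = [round(teams * poisson_pmf(lam, k)) for k in (0, 1, 2)]
print(expected)  # about 18 teams with no sweeps, 8 with one, 2 with two

# Expected number of winning streaks of 4+ games in N games: (N - 2)/32.
print((games - 2) / 32 * teams)  # about 83 streaks league-wide
```

The league-wide figure of about 83 (Laks rounds the per-team expectation of 2.97 up to 3, giving 84) sits remarkably close to the 85 streaks observed.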

In a famous article on the "hot hand" in basketball, Tversky and Gilovich pointed out that fans have a hard time distinguishing "hot streaks" from runs phenomena to be expected in ordinary coin tossing. Laks' letter gives one more reminder of how well the coin model can fit in such situations.

"The Psychology of Good Judgment"

by Gerd Gigerenzer (1996). Medical Decision Making, 16(3), 273-280.

The fact that a test for the HIV virus can appear to be extremely accurate, and yet a person in a low-risk group who tests positive can have only a 10% chance of having the virus, has been considered paradoxical. It has become a standard example in elementary probability and statistics classes to show the need for understanding conditional probability. Indeed, it points out that people's lives may depend on their understanding such data: it has been claimed, for example, that positive AIDS tests have led to suicides.

Gigerenzer argues that physicians and their patients will better understand the chance of a false positive result if we replace the conventional conditional probability analysis by an equivalent frequency method. The success of this method is illustrated in terms of an experiment that Gigerenzer and his colleague Ulrich Hoffrage carried out. They asked 48 physicians in Munich to answer questions relating to four different medical-diagnosis problems. For the four questions given to each physician, two were given using the probability format and two using the frequency format.

One of the four diagnostic problems was a mammography problem:

To facilitate early detection of breast cancer, women are encouraged from a particular age on to participate at regular intervals in routine screening, even if they have no obvious symptoms. Imagine you conduct in a certain region such a breast cancer screening using mammography. For symptom-free women aged 40 to 50 who participate in screening using mammography, the following information is available for this region.

Probability format:

The probability that one of these women has breast cancer is 1%. If a woman has breast cancer, the probability is 80% that she will have a positive mammography test. If a woman does not have breast cancer, the probability is 9.6% that she will still have a positive mammography test. Imagine a woman (aged 40 to 50, no symptoms) who has a positive mammography test in your breast cancer screening. What is the probability that she actually has breast cancer? _____%

Frequency format:

Ten out of every 1,000 women have breast cancer. Of these 10 women with breast cancer, 8 will have a positive mammography test. Of the remaining 990 women without breast cancer, 95 will still have a positive mammography test. Imagine a sample of women (aged 40 to 50, no symptoms) who have positive mammography tests in your breast cancer screening. How many of these women do actually have breast cancer? _____ out of _____

In a classic study by D. M. Eddy (reprinted in J. Dowie and A. Elstein (eds.) (1988), Professional Judgment: A Reader in Clinical Decision Making, Cambridge University Press, pp. 45-590), essentially this same question, in just the probability format, was given to 100 physicians. Ninety-five of the physicians gave an answer of approximately 75% instead of the correct answer, which, in this example, is 7.8%.

In the present study, Gigerenzer found that, when the information was presented in the probability format, only 10% reasoned with the Bayes computation

P(breast cancer | positive test) = (.01)(.80)/[(.01)(.80) + (.99)(.096)] = .078.

For the group given the frequency format, 46% computed the Bayes probability in the simpler form:

P(breast cancer | positive test) = 8/(8 + 95) = .078.

The article discusses some of the reactions of the physicians to even considering such problems. Here are some quotes:

On such a basis one can't make a diagnosis. Statistical information is one big lie.

I never inform my patients about statistical data. I would tell the patient that mammography is not so exact, and I would in any case perform a biopsy.

Oh, what nonsense. I can't do it. You should test my daughter. She studies medicine.

Statistics is alien to everyday concerns and of little use for judging individual persons.

Some doctors commented that getting the answer in the frequency form was simple. A more detailed analysis of this kind of study can be found in the article "How to Improve Bayesian Reasoning Without Instruction: Frequency Formats" by Gigerenzer and Hoffrage (1995), Psychological Review, 102, 684-704.
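The two formats can be checked directly with a few lines of arithmetic (a sketch; the variable names are ours, and the frequency counts are derived per 1,000 women from the 9.6% false-positive rate used in the Bayes computation above):

```python
# Probability format: base rate 1%, sensitivity 80%, false-positive rate 9.6%.
prior, sens, fpr = 0.01, 0.80, 0.096
posterior = (prior * sens) / (prior * sens + (1 - prior) * fpr)

# Frequency format: the same rates expressed as counts per 1,000 women.
cancer = round(1000 * prior)              # 10 women with breast cancer
true_pos = round(cancer * sens)           # 8 of them test positive
false_pos = round((1000 - cancer) * fpr)  # 95 false positives among the 990
freq = true_pos / (true_pos + false_pos)

print(round(posterior, 3), round(freq, 3))  # both give about .078
```

The frequency version requires nothing beyond comparing two counts, which is precisely why Gigerenzer found it so much more transparent to the physicians.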

"How Many People Were Here Before Columbus?"

by Lewis Lord. US News & World Report, 25 August 1997, pp. 68-70.

It is generally agreed that the populations native to North and South America declined drastically after the arrival of Columbus. But estimates of the actual size of the pre-Columbian populations vary widely. George Catlin, a 19th-century artist who painted nearly 600 portraits and other scenes of Indian life, wrote in his diary that in the time before settlers arrived the large tribes totaled "16 millions in numbers." On the other hand, the US Census Bureau warned in 1894 against believing Indian "legends," and claimed that investigations showed "the aboriginal population at the beginning of the Columbian period could not have exceeded much over 500,000."

Although the question is still unsettled, modern estimates are more in line with Catlin's. One view holds that, due to lack of natural immunity, most of the native population was wiped out by epidemics of smallpox and measles carried from Europe. Anthropologist Henry Dobyns argues that disease resulted in a 95% reduction in the native population. Assuming that the population north of the Rio Grande had bottomed out when the Census Bureau made its 500,000 estimate, Dobyns multiplies by a factor of 20 to arrive at pre-disease estimates in the 10,000,000 range. (By comparison, this is twice as many people as lived in the British Isles at the time.)

Other new estimates are based on a painstaking study of documents such as Spanish reports of baptisms, marriages, and tax collection. New methods of inference include adjusting reports of explorers, who tended to report only the number of warriors. Such figures are now multiplied by factors to account for women, children, and elderly men. Archaeological data have been used to estimate amounts of foods, such as oysters, consumed as a basis for population estimates.

Can we ever know the real numbers? Historian Woodrow Borah is quoted as predicting that, with decades of careful research, scholars may eventually produce an estimate with a margin of error of 30-50%. (It might be interesting to know how he interprets a margin of error!)

"Keeping Score: Big Social Changes Revive the False God of Numbers"

by John M. Broder. The New York Times, 17 August 1997, 4-1.

It was announced last summer that there had been a drop of 1.4 million people in welfare rolls nationwide over the past year. This led President Clinton to state: "I think it's fair to say the debate is over. We know now that welfare reform works."

This article discusses why it is difficult to draw such conclusions on complex political issues from a single number. Broder points out that there are obviously many possible explanations for the drop in welfare rolls, some attributable to government policy and many others wholly unrelated. Welfare expert Wendell Primus remarked: "Those figures do not tell how many former recipients moved from welfare to work, or simply from dependency to despondency."

Bruce Levin, a statistician at Columbia University, remarked: "This is the glory and the curse of the one-number summary. You take a hundred-dimensional problem like welfare reform and reduce it to a number."

Robert Reischauer of the Brookings Institution said, "We live in a society where political evaluations have to fit into a sound bite, so there is a tendency to focus on quantitative measures even when they may not be measuring the most important dimensions." He mentioned the following example. According to a 1996 study, the average one-way commuting time lengthened by 40 seconds between 1986 and 1996, to 22.4 minutes. The widely reported conclusion was that, since more time was being spent on the freeway, the American quality of life was diminishing. This disregards the fact that many commuters voluntarily moved farther from their jobs to bigger homes, greener lawns, and better schools.

Professor Levin remarked that the physical sciences can use controlled experiments, medicine uses longitudinal studies and clinical trials, but "when numbers are crunched in politics, axes are usually grinding, too."

"Homogenized Damages: Judge Suggests Statistical Norms to Determine Whether Pain and Suffering Awards are Excessive"

by Michael Higgins (September 1997). American Bar Association Journal, p. 22.

This article was suggested by Norton Starr, who learned of it from George Cobb. It describes a case in the United States District Court for the Eastern District of New York, decided by Judge Jack B. Weinstein. The case number is 94-CV-1427, and it is easily available from Lexis-Nexis (U.S. District Lexis 14332). The whole decision makes interesting reading as an application of statistics to the law.

In suits beginning in March 1994, Patricia Geressy, Jill M. Jackson, and Jeannette Rotolo sued Digital Equipment Corporation claiming that Digital's computer keyboard caused repetitive stress injuries (RSI). A jury awarded them the following amounts.

Geressy:  Economic loss        $1,855,000
          Pain and suffering   $3,490,000
          Total                $5,345,000

Jackson:  Economic loss          $259,000
          Pain and suffering      $43,000
          Total                  $302,000

Rotolo:   Economic loss          $174,000
          Pain and suffering     $100,000
          Total                  $274,000

Digital appealed these awards, and Judge Weinstein made the following rulings:

  1. The Geressy case should be retried because new evidence had been uncovered since the trial suggesting that her medical problems were not the result of using the keyboard.

  2. The Jackson award was thrown out on statute of limitations considerations.

  3. The award in the Rotolo case was considered reasonable and allowed to stand.

Weinstein used this opportunity to expand on his opinions as to how a judge might go about assessing the reasonableness of a jury award under appeal. Noting that there are usually agreed upon ways to assess economic loss, he declines to consider the economic loss part of the award. On the other hand, there are not usually agreed upon ways to assess pain and suffering awards, about which he writes:

These awards rest on the legal fiction that money damages can compensate for a victim's injury... We accept this fiction, knowing that although money will neither ease the pain nor restore the victim's abilities, this device is as close as the law can come in its effort to right the wrong.

He quotes another court as saying:

The law does not permit a jury to abandon analysis for sympathy for a suffering plaintiff and treat an injury as though it were a winning lottery ticket. Rather, the award must be fair and reasonable, and the injury sustained and the amount awarded rationally related. This remains true even where intangible damages, such as those compensating a plaintiff for pain and suffering, cannot be determined with exactitude.

Weinstein explains why he feels that the present system of determining these awards is not rational. He refers to a study that found no correlation between the amount of the award and the length of time suffered. The author of the study was unable to find any rationale for jury awards for pain and suffering. Another researcher wrote:

Both anecdotal and empirical evidence indicates that the disparity between awards for pain and suffering among apparently similar cases defies rational explanation.

Weinstein proposes that, in considering the reasonableness of a jury verdict, the court should gather a group of similar cases as a comparison group. He comments:

The imprecision inherent in simply making a vague estimate by looking at a comparative group turns the court toward a statistical analysis.
Weinstein illustrates his proposed method by evaluating the amounts the jury awarded for pain and suffering in all three of the cases under consideration, even though other circumstances made this necessary only for the Rotolo case.

He describes how he would find a comparison group of similar cases. He first gives careful consideration to what constitutes similar cases. He feels that the injuries in the various cases can have different causes, but should have similar symptoms. However, for these RSI cases, he would rule out a case where the injury was the result of a traumatic experience such as an airplane crash, even if the injury resulted in similar symptoms.

Weinstein found a group of 64 cases with symptoms similar to at least one of the three RSI cases. Twenty-seven of these cases were similar to the Geressy case, 16 to the Jackson case, and 21 to the Rotolo case.

Weinstein's method uses the mean and standard deviation for each of these three comparison groups. For the jury award to be considered reasonable, it should not be more than two standard deviations away from the mean of the comparison group. Any award exceeding this should be reduced to two standard deviations above the mean. Similarly, a jury award falling more than two standard deviations below the mean of the comparison group should be increased to that threshold.

Weinstein judges the amounts awarded in the three RSI cases. For the 27 cases in the Geressy comparison group, the average award was $747,000, and the standard deviation was $606,873. The amount awarded for pain and suffering by the jury was $3,490,000. Two standard deviations above the mean of the comparison group is about $2,000,000, so Weinstein would reduce the award from $3,490,000 to $2,000,000.

The mean for the comparison group in the Jackson case was $147,925, and the standard deviation was $119,371. The jury awarded $43,000 for pain and suffering. This is about one standard deviation below the mean, and so Weinstein considers the jury award reasonable and would let it stand.

In the Rotolo case, the mean of the comparison group was $404,214, and the standard deviation was $465,489. The jury awarded $100,000 for pain and suffering. This is less than one standard deviation below the mean, so Weinstein would let it stand.
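Weinstein's adjustment rule can be expressed as clamping a jury award to within two standard deviations of the comparison group's mean. A minimal sketch in Python, using the comparison-group figures reported above (note that Weinstein rounds the Geressy ceiling to about $2,000,000, while the sketch keeps the exact two-standard-deviation figure):

```python
def adjust_award(award, mean, sd, k=2.0):
    """Clamp a jury award to within k standard deviations of the
    comparison group's mean, as Weinstein proposes.
    When mean - k*sd is negative, no award can fall below the lower
    bound, so any small award stands."""
    lower, upper = mean - k * sd, mean + k * sd
    return min(max(award, lower), upper)

# Comparison-group mean, standard deviation, and jury award
# for pain and suffering in the three RSI cases:
cases = {
    "Geressy": (747_000, 606_873, 3_490_000),
    "Jackson": (147_925, 119_371, 43_000),
    "Rotolo":  (404_214, 465_489, 100_000),
}
for name, (mean, sd, award) in cases.items():
    print(f"{name}: {adjust_award(award, mean, sd):,.0f}")
```

Only the Geressy award exceeds its two-standard-deviation ceiling ($1,960,746); the other two awards are left unchanged, matching Weinstein's conclusions.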

In discussing his method, Weinstein remarks that the distributions of the comparison groups are not symmetric, so perhaps it would be more reasonable to use the median rather than the mean. As to his choice of two standard deviations, he writes:

Using two standard deviations supports the judiciary's efforts to sustain jury verdicts whenever possible. This approach is consistent with the federal and New York state constitutions that guarantee the right to trial by jury in civil cases. Narrowing the range to figures that fall within one standard deviation, however, speaks to the state policy of controlling jury verdicts.

"Parental Age Gap Skews Child Sex Ratio"

by J. T. Manning, R. H. Anderton, and M. Shutt. Nature, 25 September 1997, p. 344.

It is known that animals have some control over the sex ratio (ratio of males to females) of their offspring. Examples of this were reported in a recent issue of Nature (2 October 1997, p. 442), including a species of parrots that produces long runs of one sex. One female produced 20 sons in succession, followed by a run of 13 daughters. An earlier study on the Seychelles warbler showed large variation in the sex ratio. In this species of warbler, young females often remain on their parents' territory to help produce subsequent offspring, while young males move away. On high-quality territories, daughters are an advantage, and about 77% of the offspring are female. On low-quality territories, daughters are a disadvantage because of the resources they use up, and, as a result, only about 13% of the offspring are female.

There have been many attempts to show that similar variation can occur in humans. Studies of German, British, and US editions of Who's Who have found that men register far more sons than daughters. Despite the contribution of the Clintons, American presidents have produced 50% more sons than daughters. In addition, it has been observed that a higher proportion of boys are born during and shortly after wars.

Manning and his colleagues show that the sex ratio is correlated with the spousal age difference (age of husband - age of wife). Families in which the husband is significantly older than the wife tend to produce more boys, while those in which the wife is significantly older than the husband produce more girls. The authors based these conclusions on a study of 301 families whose children attended a secondary school in Liverpool that recruited students from a wide range of socio-economic groups.

The authors also found that, in England and Wales, the mean spousal age difference increased during and immediately after the two World Wars and was strongly correlated with the sex ratio during the period 1911 to 1952. Thus the increase in the proportion of boys during the two wars can be explained by the increase in the spousal age difference during these periods.

"The Hidden Truth about Liberals and Affirmative Action"

by Richard Morin. The Washington Post, 21 September 1997, C5.

Surveys generally show that Democrats favor affirmative action while Republicans are opposed to it. In their new book Reaching Beyond Race, published by Harvard University Press, Paul Sniderman and Edward Carmines say that such surveys do not reveal the true beliefs of Democrats.

To show this, they put the following twist on the idea of randomized response. They randomly divided a representative sample of the national population into two groups. One group was told: "I'm going to read you a list of three items that sometimes make people angry or upset. After I read you the list, just tell me how many upset you. I don't want to know which ones, just how many."

The interviewer then read a list of three items: the federal government increasing the tax on gasoline, professional athletes getting million-dollar-plus salaries, and large corporations polluting the environment.

The second group was presented with these same three items along with a fourth: Black leaders asking the government for affirmative action.

Since both groups got the same first three items, the idea is that any systematic difference between groups must be attributable to the response of the second group to the fourth item. According to the article:

When researchers analyzed the results, they found the political divisions over affirmative action found in other polls were conspicuously missing. Liberals were as angry as conservatives, [57 percent versus 50 percent] and Democrats [65 percent] were as angry as Republicans [64 percent].
Details of the estimation procedure are not provided. Interesting questions for discussion include how the estimation was done, and whether the framework of the survey may have influenced responses.
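Although the article does not say how the authors estimated these figures, a standard approach for such "list experiment" designs is a difference in means: since the two groups differ only in the presence of the fourth item, the difference between their average counts estimates the proportion upset by that item. A sketch with hypothetical counts (not the study's data):

```python
def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate the proportion upset by the sensitive fourth item as the
    difference between the treatment group's mean count (out of 4 items)
    and the control group's mean count (out of 3 items).
    This is the standard difference-in-means estimator; the article does
    not report which method the authors actually used."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(treatment_counts) - mean(control_counts)

# Hypothetical responses: each number is how many items upset a respondent.
control = [1, 2, 1, 2]     # asked about 3 items
treatment = [2, 2, 2, 3]   # asked about the same 3 items plus a fourth
print(list_experiment_estimate(control, treatment))
```

Here the treatment mean exceeds the control mean by 0.75, which would be read as an estimate that 75% of respondents were upset by the fourth item. Whether this simple estimator is valid depends on the assumption that adding the fourth item does not change how respondents react to the first three.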
