Journal of Statistics Education v.2, n.2 (1994)

Joan B. Garfield

General College

University of Minnesota

140 Appleby Hall

128 Pleasant St. S.E.

Minneapolis, MN 55455

612-625-0337 jbg@vx.cis.umn.edu

J. Laurie Snell

Department of Mathematics and Computing

Dartmouth College

Hanover, NH 03755-1890

603-646-2951 jlsnell@dartmouth.edu

This column features "bits" of information sampled from a variety of sources that may be of interest to teachers of statistics. Joan abstracts information from the literature on teaching and learning statistics, while Laurie summarizes articles from the news and other media that may be used with students to provoke discussions or serve as a basis for classroom activities or student projects. We realize that due to limitations in the literature we have access to and time to review, we may overlook some potential articles for this column, and therefore encourage you to send us your reviews and suggestions for abstracts.

edited by Lionel Pereira-Mendoza (1993). International Statistical Institute, The Netherlands. (Available for $25 from the ISI office, e-mail: isi@cs.vu.nl)

In August, 1992, a Roundtable Conference sponsored by the International Statistical Institute was held in Canada on the topic of teaching data analysis in schools. Pereira-Mendoza chaired the conference and edited the collected papers and summaries of discussions. The resulting book is divided into six sections that focus on current practices of teaching statistics in different countries; how data analysis fits into the school curriculum; suggestions for how data analysis should be taught and who should be teaching it; examples of innovative curricula and instructional approaches that include data analysis; and considerations as to what research is needed to improve the teaching and learning of data analysis in schools.

Although this special journal issue does not focus on teaching or learning statistics, the articles should be of interest to statistics instructors who teach graduate students in education and the behavioral sciences. The purpose of the issue, as described by Guest Editor Bruce Thompson, is to explore the etiology and consequences of confusion over statistical significance testing. The articles present diverse perspectives on this topic and often take opposing positions. Articles include "The Case Against Statistical Significance Testing, Revisited"; "What Statistical Significance Testing Is, and What It Is Not"; "Historical Origins of Statistical Testing Practices: The Treatment of Fisher versus Neyman-Pearson Views in Textbooks"; "Confidence Intervals and the Scientific Method: A Case for Holm on the Range"; and "The Use of Statistical Significance Tests in Research: Bootstrap and Other Alternatives."

edited by David Green (1994). Sheffield: Teaching Statistics Trust. (Copies are available from the Centre for Statistical Education at the University of Sheffield, England, for $25 US for surface mail and $30 for airmail. E-mail orders can be sent to P.Holmes@sheffield.ac.uk)

Following the success of an earlier collection of articles from the international journal Teaching Statistics, this second collection represents the best articles from volumes 6-14. Edited by former editor David Green, the book is divided into seven sections, with the headings: Statistics in the Classroom, Students' Understanding, Teaching Particular Topics, Practical and Project Work, Using Computers, Statistics in Other Subjects and at Work, and Miscellany (which has papers on various topics including "The Statistics of Safe Travel" and "Fishy Statistics"). This is an important resource for those who do not have a complete collection of journal issues, and it is a nice companion volume to the earlier collection of articles assembled by Peter Holmes.

by Anne Hawkins, Flavia Jolliffe, and Leslie Glickman (1992). London: Longman Group.

This is a very ambitious and valuable book that tackles the entire range of issues relating to teaching statistics at different educational levels. Chapters focus on the past and present of the discipline and curriculum; the difficulties, opinions, and needs of statistics teachers; the teaching of descriptive statistics, probability, and inference; research in statistical education; practicals and projects; statistical computing; multimedia for teaching statistics; and assessing statistical knowledge. An extensive 14-page bibliography provides an additional resource.

by Gregory Samsa and Eugene Z. Oddone (1994). The American Statistician, 48(2), 117-119.

The recommendation to include writing across all courses in the curriculum has led to several attempts to challenge students to do more writing in college statistics courses, both to help them develop more effective communication skills and to improve their learning. This article focuses on a biometry course where students received practice in writing papers and in critically evaluating medical papers. The course was based on a model of expository writing whose goal is to clarify and make explicit components of statistical reasoning. Details of the course are offered as well as a discussion on how writing could be better integrated into the statistics curriculum.

by Bernard C. Beins (1993). Teaching of Psychology 20(3), 161-164.

Another article on the use of writing assignments in a statistics course presents details on an experimental study involving four statistics classes for psychology students. Each class received a different amount of guidance and instruction in using interpretative skills. In one of the courses, students were asked to write press releases that did not use any statistical terminology. These students appeared to acquire better computational and interpretive skills than did students in a traditional version of the course. In addition, the use of writing assignments seemed to help focus students' attention on the context and rationale for the statistics they were taught.

by Clifford Konold (1994). The Mathematics Teacher 87(4), 232-235.

There have been several articles in the news over the past few years regarding China's attempt to reduce its population growth. This paper describes an activity based on modeling the effects of limiting Chinese families to having one son (i.e., families could only have children until they had their first son, and then were not allowed to have additional children). The activity described begins with a class discussion of a newspaper article on this topic, followed by discussions where students try to estimate the average number of children in a family under this policy as well as the ratio of girls born to boys born. "The Prob Sim" software is described, and its use in simulating data to solve these problems is illustrated.
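The two quantities students are asked to estimate can also be approximated with a few lines of code. The sketch below is not the Prob Sim software described in the article; it is a minimal Python illustration that assumes births are independent, that each child is a boy with probability 0.5, and that every family continues until its first son. The function name and its parameters are hypothetical.

```python
import random

def simulate_one_son_policy(num_families=100_000, p_boy=0.5, seed=1):
    """Simulate families that have children until their first son.

    Assumptions (not from the article): births are independent and each
    child is a boy with probability p_boy.
    Returns (average children per family, girls born per boy born).
    """
    rng = random.Random(seed)
    boys = 0
    girls = 0
    for _ in range(num_families):
        # A draw below p_boy is a boy; anything else is a girl,
        # and the family tries again.
        while rng.random() >= p_boy:
            girls += 1
        boys += 1  # every family stops with exactly one son
    average_children = (boys + girls) / num_families
    girls_per_boy = girls / boys
    return average_children, girls_per_boy

if __name__ == "__main__":
    avg, ratio = simulate_one_son_policy()
    print(f"average children per family: {avg:.2f}")
    print(f"girls born per boy born:     {ratio:.2f}")
```

Under these assumptions the expected family size is 1/p_boy = 2 children and the expected ratio of girls to boys is 1 to 1, so the simulated values should land close to 2 and 1.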

by Raymond N. Greenwell (1993). PRIMUS 3(4), 345-354.

Perceiving a gap between the statistics taught in college courses and the statistics actually used in industry, this author describes the discrepancies he sees and introduces a few "hot topics" in industrial statistics that could be included in courses to help overcome this problem. The Taguchi Methods are described, and an example is given for use in a statistics course for engineers.

by Gary Kader and Mike Perry (1994). Mathematics Teaching in the Middle School 1(2), 130-136.

Although the focus of this article is on using computer software with middle school students, the information is relevant for many college-level introductory courses. First, a model of the statistical problem-solving process is offered, and technology is related to this model by viewing software as a way to supply tools necessary for the analysis stage of this process. The authors demonstrate and compare three commercially-available statistical software products in the context of a class investigation.

by Patricia B. Cerrito (1994). Mathematics and Computer Education, 28, 141-153.

This paper describes the complete revision of one introductory statistics course to incorporate the use of computer technology, the inclusion of relevant statistical problems, student writing activities, student learning groups, and a change in focus to the learning of important concepts. Details are offered regarding the structure of the revised curriculum, topics for student projects, evaluation criteria for students' written reports, and components of a final exam.

by Carl J. Huberty, Janna Dresden, and Byung-Gee Bak (1993). Educational and Psychological Measurement 53, 523-532.

Citing the controversy between conceptually oriented and calculation-centered statistics classes, the authors make a case for helping students develop conceptual understanding. A study is described that compared student performance on test items representing different levels of understanding. Based on the analysis of test data, three domains of statistical knowledge were proposed: calculations, propositions, and conceptual understanding. The authors suggest that instructors should test students in more than one of these domains and should challenge students to move beyond the level of computational understanding.

by the American Psychological Association (1988). Washington, DC: Author.

This handbook contains activities for use in introductory statistics or research methods courses. One article, "Research Methodology Taxonomy and Interpreting Research Studies," by Peter Bohling, G. Alfred Forsyth, and Richard May (pp. 38-43), presents an activity designed to help students decide when a particular type of statistical analysis should be used and what conclusions are appropriate for a research study.

A regular component of the Teaching Bits Department is a list of articles from Teaching Statistics, an international journal based in England. Brief summaries written by the authors of the articles are included. In addition to these articles, Teaching Statistics features several regular departments that may be of interest, including Computing Corner, Curriculum Matters, Data Bank, Historical Perspective, Practical Activities, Problem Page, Project Parade, Research Report, Book Reviews, and News and Notes.

The Circulation Manager of Teaching Statistics is Peter Holmes, p.holmes@sheffield.ac.uk, Centre for Statistical Education, University of Sheffield, Sheffield S3 7RH, UK.

**"Teaching by Design"** by Adrian Bowman

Summary: Some teaching material which invited a group of students to design and carry out a small experiment to investigate the working of short term memory is described. No equipment of any kind is required. These ideas have been found to be very helpful in making students think about some very basic and practical issues in designing and carrying out an experiment. The simple nature of the material makes it accessible to all levels of students.

**"More Computer Generated Thinking"** by Gerd Riehl

Summary: The graphic presentation of random processes by Markov chains allows an easy access to both recursive formulae for the distribution and the proof of suppositions about the probabilities appearing within these distributions.

**"Common Elements Correlation"** by Robert M. Lynch

Summary: This work demonstrates common elements correlation, its extension to negative correlation, and the production of bivariate normal samples for a specified correlation matrix.

**"Learning about Extremes"** by Stuart G. Coles

Summary: For many physical processes it is extreme levels which are of greatest concern. This article gives a practical introduction to the problems and models involved.

**"Teaching Independence"** by Henrik Dahl

Summary: Examples are given which illustrate how independence enters into statistical problems and a demonstration of independence is presented. Coverage in statistical textbooks is examined.

**"Odds that don't add up!"** by Mike Fletcher

Summary: This article presents examples of situations where a lack of understanding of probability leads to erroneous expectations.

**"Teaching Statistics through Resampling"** by Chris Ricketts and John Berry

Summary: This paper describes experiences of teaching statistics without mathematical theory but using computer-intensive re-sampling methods. The method is relevant to statistics teaching at all levels.

**"Model Comparison in Regression"** by Hirokuni Tamura

Summary: The instructor can help beginning students master the numerous results of regression analysis by using helpful notation and by providing a framework for organising regression outputs. For the latter, model comparison is a useful operating notion.

**"Estimating the Size of a Population"** by Roger W. Johnson

Summary: Several estimates of an unknown population size are compared.

**"Sampling Errors in Political Polls"** by Zbigniew Kmietowicz

Summary: This article examines the sampling error of the lead of one political party over another as observed in a random sample of voters. The sample size needed to achieve a certain precision is also investigated.
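For readers who want to experiment with the idea, here is a minimal sketch of the sampling error of a lead. It uses the standard multinomial variance of a difference of two sample proportions, Var(p1_hat - p2_hat) = [p1 + p2 - (p1 - p2)^2] / n, and is not necessarily the derivation used in the article. The function names and the poll figures are hypothetical.

```python
import math

def lead_standard_error(p1, p2, n):
    """Standard error of the estimated lead (p1 - p2) in a simple random
    sample of n voters, from the multinomial variance formula
    Var(p1_hat - p2_hat) = [p1 + p2 - (p1 - p2)**2] / n."""
    return math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

def sample_size_for_lead(p1, p2, margin, z=1.96):
    """Smallest n for which z * SE(lead) is no larger than margin."""
    variance_numerator = p1 + p2 - (p1 - p2) ** 2
    return math.ceil(variance_numerator * (z / margin) ** 2)

if __name__ == "__main__":
    # Hypothetical poll: 42% and 38% support in a sample of 1,000 voters.
    se = lead_standard_error(0.42, 0.38, 1000)
    print(f"standard error of the lead: {se:.4f}")
    print("n for a 3-point margin on the lead:",
          sample_size_for_lead(0.42, 0.38, 0.03))
```

When only two parties are in the race (p1 + p2 = 1), the formula reduces to twice the standard error of a single proportion, which is why the lead is estimated less precisely than either party's share.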

**"Tampering with a Stable Process"** by Timothy C. Krehbiel

Summary: This article presents a variation of the funnel experiment made famous by W. Edwards Deming. Ideally suited for classroom use, this exercise illustrates the disastrous consequences resulting from tampering with a stable process.

**"Cooperative Learning in Statistics"** by Carolyn M. Keeler and R. Kirk Steinhorst

Summary: The formal use of cooperative learning techniques developed originally in primary and secondary education proved effective in improving student performance and retention in a college freshman level statistics course. Lectures interspersed with group activities proved effective in increasing conceptual understanding and overall class performance.

**"Teaching Probability"** by Rod Bramald

Summary: This article examines some of the difficulties associated with teaching probability. It is argued that a key difficulty is the lack of transferability of pupils' curriculum based knowledge.

by Eric S. Lander and Bruce Budowle. Nature, 27 October 1994, 735-738.

These authors want to end the controversy over the use of DNA fingerprinting in the courts. Eric Lander has been a critic of the lack of scientific standards for DNA fingerprinting. Bruce Budowle, chief scientist for the FBI, has been a staunch defender of its use. In this article they say that they now agree on the validity of its use and state how they feel it should be used.

Most of the recent controversy about the use of DNA fingerprinting has centered around the use of the "product rule" to calculate the probability that a match would occur in a randomly chosen person from a certain population. This product rule assumes that the individual alleles at different loci can be treated as statistically independent.

If a population is made up of subpopulations with different gene frequencies, then independence cannot be assumed. This issue was discussed in a 1992 report of the National Research Council. The report recommended using a "ceiling principle" described as follows: The population should be broken down into 10 to 15 subpopulations which can reasonably be assumed to be homogeneous. Then for each site the allele frequency should be taken to be the maximum over the subpopulations. These maximum values for the genotypes at the different loci should be multiplied to give "worst case" probabilities.
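The contrast between the product rule and the ceiling principle can be made concrete with a small calculation. The sketch below follows only the description given above: take the maximum frequency at each locus over the subpopulations, then multiply. All frequencies, locus names, and subpopulation labels are invented for illustration, and the function names are hypothetical.

```python
from math import prod

# Hypothetical genotype frequencies at four loci in three subpopulations.
# These numbers are invented for illustration only.
HYPOTHETICAL_GENOTYPE_FREQS = {
    "locus_1": {"subpop_A": 0.020, "subpop_B": 0.050, "subpop_C": 0.010},
    "locus_2": {"subpop_A": 0.040, "subpop_B": 0.015, "subpop_C": 0.030},
    "locus_3": {"subpop_A": 0.060, "subpop_B": 0.100, "subpop_C": 0.080},
    "locus_4": {"subpop_A": 0.030, "subpop_B": 0.010, "subpop_C": 0.020},
}

def product_rule_probability(freqs_by_locus, subpop):
    """Multiply the per-locus frequencies from one reference population,
    treating the loci as statistically independent."""
    return prod(by_subpop[subpop] for by_subpop in freqs_by_locus.values())

def ceiling_match_probability(freqs_by_locus):
    """Multiply, across loci, the largest frequency seen in any
    subpopulation, giving a 'worst case' match probability."""
    return prod(max(by_subpop.values()) for by_subpop in freqs_by_locus.values())

if __name__ == "__main__":
    for subpop in ("subpop_A", "subpop_B", "subpop_C"):
        p = product_rule_probability(HYPOTHETICAL_GENOTYPE_FREQS, subpop)
        print(f"product rule, {subpop}: 1 in {1 / p:,.0f}")
    ceiling = ceiling_match_probability(HYPOTHETICAL_GENOTYPE_FREQS)
    print(f"ceiling principle:      1 in {1 / ceiling:,.0f}")
```

Because each factor is a maximum, the ceiling value can never be smaller than the product-rule value for any single subpopulation, which is the sense in which it is conservative.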

The authors discuss in detail six objections that have been raised about this principle and explain why they feel these objections are not valid. They feel that the very conservative estimates given by the ceiling principle with four or five sites will typically give odds of several million to one, which should be convincing enough. Additional loci could be added to increase the odds if necessary.

They suggest that the independence computation could also be given in a trial, but with the understanding that the truth probably lies somewhere between that result and the estimate given by the ceiling principle.

by Charles C. Mann. Washington Post, Book World, 30 October 1994, X1.

This is a review of the following two books describing the National Health and Social Life Survey carried out by the National Opinion Research Center.

Sex in America: A Definitive Survey by Robert T. Michael, John H. Gagnon, Edward O. Laumann, and Gina Kolata (1994). Boston: Little, Brown.

The Social Organization of Sexuality: Sexual Practices in the United States by Edward O. Laumann, John H. Gagnon, Robert T. Michael, and Stuart Michaels (1994). Chicago: University of Chicago.

The first book is a carefully written account of the survey with a minimum of technical statistics. It is obviously aimed at the best-seller list.

The second book is a more technical report of the survey, but it is also written for the non-expert. Indeed, it is, in many ways, a fine primer on survey sampling. The authors discuss in detail how they chose the nature and size of the sample, and how they decided between a telephone survey, interviews, and written forms. They also discuss possible biases and how they planned to check for them. They often take time along the way to explain in quite simple terms the statistical techniques they have used.

The reviewer does a good job of highlighting some of the inevitable problems in a study of this kind. Because of the withdrawal of federal support, the researchers had to decrease the size of the sample from the proposed 20,000 to 9,000. They had to eliminate over 1,000 people because their addresses were found to be empty buildings. For financial reasons they had to limit the study to English-speaking adults between 18 and 59 years old, which eliminated another 3,500 people. The researchers interviewed 3,432 of the 4,369 people who remained, giving a response rate of 78.6 percent.

Once more for financial reasons, they chose not to include people in institutions such as college dormitories, military barracks or prisons. These omissions might cause problems with using the survey for establishing AIDS policies, which was one of the researchers' main objectives.

The reviewer comments on the obvious difficulties of telling whether people are being truthful in a survey of this kind. The researchers asked for written responses from some of those interviewed and compared them to the interview responses, but the reviewer was not convinced that this was much of a check.

In discussing possible biases in their survey in the more technical book, the authors state: "Only six percent of the interviews took place with the spouse or other type of sex partner present, and an additional 15 percent had other people present ... These 'others' were overwhelmingly likely to be children or stepchildren of the respondent... When interviewed alone, 17 percent of the residents reported having two or more sex partners in the past year, while only five percent said so when their partners were present during the interviews."

After discussing how they examined this problem, the authors remark, "On the basis of these bivariate analyses, we cannot conclude that the presence of others caused the reporting differences in the sense of suppressing the truth."

The reviewer concludes that "The National Health and Social Survey represents a mountain of hard work, but its findings should be greeted with more skepticism than its authors would like."

by Christopher Winship. The New York Times, 15 November 1994, A29.

Richard Herrnstein and Charles Murray's book The Bell Curve (1994, New York: Free Press) has been the center of a great deal of controversy. This controversy has focused on the questions raised in the book--whether I.Q. is hereditary and whether racial differences in I.Q. are predominantly due to environmental or genetic factors.

While admitting that many of the criticisms of the book are justified, this author points out that the controversies have taken the attention away from some of the major themes of the book that he feels are worthy of serious attention. He mentions three such themes:

(a) As a society we are becoming increasingly socially and economically stratified by the level of cognitive ability,

(b) Cognitive ability is a strong predictor of various social problems, and

(c) It is very hard to change cognitive ability.

Winship expresses the hope that neither irresponsible statements in the book nor the media's vitriolic response to the book will stand in the way of bringing more sophisticated research to bear on the issues discussed in this book.

Another source of interesting commentary on The Bell Curve can be found in the Letters to the Editor section of "The New York Times Book Review," 13 November 1994. These letters are in response to a review by Malcolm W. Browne of The Bell Curve and two other related books in the October 16 Book Review section of The New York Times.

by The Associated Press. The Boston Globe, 27 October 1994, A9.

A study in the Journal of the National Cancer Institute reports that the annual risk of breast cancer for 40-year-old women is 0.6 per 1000 among women who have had an abortion, and 0.4 per 1000 for women who have not. (Other reports on this article state that those who had an abortion had a fifty percent increase in the chance of developing breast cancer by age 40.)
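The two ways of reporting this result describe the same numbers, as a short calculation shows. The sketch below uses only the two rates quoted above; the function name is hypothetical.

```python
def compare_risks(risk_exposed, risk_unexposed):
    """Return (relative risk, percent increase, absolute increase) for two
    risks expressed in the same units, here cases per 1000 women per year."""
    relative_risk = risk_exposed / risk_unexposed
    percent_increase = (relative_risk - 1) * 100
    absolute_increase = risk_exposed - risk_unexposed
    return relative_risk, percent_increase, absolute_increase

if __name__ == "__main__":
    rr, pct, diff = compare_risks(0.6, 0.4)
    print(f"relative risk:     {rr:.2f}")
    print(f"percent increase:  {pct:.0f}%")
    print(f"absolute increase: {diff:.1f} per 1000 per year")
```

A relative risk of 1.5 is the "fifty percent increase" of the other reports, while the absolute difference is 0.2 cases per 1000 women per year. Showing students both figures is a useful exercise in how the choice of scale affects the impression a news story gives.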

A higher risk of breast cancer was associated with abortions performed when women were younger than 18 or older than 30. The risk was not found to be associated with number of abortions or number of live births or miscarriages. These findings were based on interviews with 845 breast cancer patients and 961 healthy women of the same age group.

One commentator suggested that there could be a bias in the answers obtained by interviews. Women with cancer might be more likely to admit having had an abortion than those in the control group.

by Dana Milbank. Wall Street Journal, 3 October 1994, B5.

Europe has recently had a higher unemployment rate than the United States and Japan. A theory used to explain this says that European wages are more rigid than those in the United States and Japan, in the sense that labor costs remain high even when unemployment rises. This in turn is thought to be caused by the high degree of unionization in Europe and high unemployment benefits. It has led to a style of labor politics (sometimes called the Reagan-Thatcher style) that recommends union busting and reducing unemployment benefits to combat unemployment.

This theory has been challenged in a study carried out by two economists, David Blanchflower of Dartmouth College and Andrew Oswald of the London School of Economics. Using government wage and employment data for millions of workers in 15 countries, they showed that wage flexibility showed little variation across the globe. In each country, they found that doubling the local unemployment rate (within a region or industry) is associated with a drop in pay of roughly 10 percent.

The study conflicts with the conclusions of many other groups, including major banking interests. Those conclusions presumably were based on theories rather than on studies.

by Arnold Barnett. Technology Review, October 1994, 39-45.

Barnett proposes six deadly sins of statistical interpretation. We give one example of each. The article provides more discussion of these and other examples.

- Generalizing from Non-Random Samples

Researchers at the Harvard Medical School found, in interviews with 1,500 people who had suffered heart attacks in the previous few days, that a disproportionate number reported episodes of extreme anger in the two hours preceding the attack. They were led to an estimate that anger was associated with 2.3 times the usual heart attack risk. The Boston Globe generalized this to all of us by simply reporting that anger "can double the chance for heart attack."

- Look! a Trend

In 1993, the International Airline Passenger Association began rating airlines for safety. An airline having the fewest deaths over a five-year period might be considered a particularly safe airline. However, the data show that the "safest" airline in one period is apt to be the least safe in another period, suggesting that an observed trend may be due to normal chance fluctuations having nothing to do with real differences among airlines.

- Unjust Law of "Averages"

In 1987, the Department of Transportation required U.S. airlines to report each month the percentage of their flights into the nation's 30 busiest airports that arrived on time. This information has been used in advertisements, such as Northwest's boast that it is "the number one on-time airline." If an airline has a large percentage of its flights into a city with bad weather, such as Seattle, it is at a disadvantage compared to an airline that has a large percentage of its flights into a city with good weather, such as Phoenix.

- Verbal Imprecision

A statistical study reported that the odds of a death sentence in a white-victim case were 4.3 times the odds in a black-victim case in Georgia. The New York Times reported this as "4.3 times as likely," and the Supreme Court used this interpretation. Using this incorrect interpretation, a probability of .99 of a death sentence when the victim is white implies a chance of .23 when the victim is black. With the correct interpretation in terms of odds, the .23 becomes .96.

- The Unsound Comparison

In early 1992, The New York Times reported that a record number of killings occurred in 1991 in four of the nation's ten largest cities: Los Angeles, San Diego, Dallas, and Phoenix. They failed to point out that all four of these cities also reached new highs in population in 1991.

- The Hidden Defect

An article in the journal Risk Analysis in 1991 reported that a U.S. driver--age 40, sober, wearing a seat belt, and driving a heavier-than-average car--enjoys "slightly less" mortality risk on a 600-mile trip than a person who takes the same trip by air.

The analysis began with the overall death rate per mile driven on rural interstate highways. This was multiplied by risk factors for age, wearing a seat belt, and driving a heavier car. Multiplying these risk factors led to a much smaller final risk than was justified, since the factors are not independent.
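The arithmetic behind the Verbal Imprecision example above is easy to check. The sketch below converts between probability and odds and applies the factor of 4.3 both ways; the function names are hypothetical, and the only inputs are the .99 and 4.3 quoted in the example.

```python
def probability_to_odds(p):
    """Odds in favour of an event that has probability p."""
    return p / (1 - p)

def odds_to_probability(odds):
    """Probability of an event whose odds in favour are `odds`."""
    return odds / (1 + odds)

def black_victim_probability(p_white, ratio=4.3):
    """Return (incorrect, correct) probabilities of a death sentence in a
    black-victim case, given the probability in a white-victim case."""
    # Incorrect reading: "4.3 times as likely" divides the probability.
    incorrect = p_white / ratio
    # Correct reading: the factor of 4.3 applies to the odds.
    correct = odds_to_probability(probability_to_odds(p_white) / ratio)
    return incorrect, correct

if __name__ == "__main__":
    incorrect, correct = black_victim_probability(0.99)
    print(f"dividing the probability by 4.3: {incorrect:.2f}")
    print(f"dividing the odds by 4.3:        {correct:.2f}")
```

The two readings differ most when the probabilities are close to 1, as here; for rare events the odds and the probability are nearly equal, and the distinction matters much less.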

by Gina Kolata. The New York Times, 11 September 1994, Section 4, p. 4.

The recent crash of a USAir plane near Pittsburgh has prompted many questions about the safety of air travel and the relative risks between airlines. According to the article, USAir's accident record does look grim. Among major commercial carriers, the last three fatal crashes in the United States were USAir planes; the airline has been involved in four out of the last seven major air disasters; and USAir has had five fatal crashes in the past five years. Ms. Kolata asks: "Would it be a rational decision to avoid flying USAir in favor of its competitors? Or, considering the vast number of passengers carried by airlines, can USAir's tragic losing streak be attributed to the vagaries of chance?"

The question of safety seems to arouse little controversy. By all accounts, air travel, as far as major accidents are concerned, carries an extremely low risk. Dr. Arnold Barnett of M.I.T. makes this very clear: "Roughly speaking, if you were to board a jet flight at random every day, it would take 26,000 years on average before you succumb to a major crash."

But what about the relative risks between airlines? Here the answer is not so straightforward. Using safety records, Dr. Barnett ranked eight major airlines over several ten-year periods and found that not only was the first-ranked airline different each time, but this same airline finished in the bottom half of other rankings.

As for USAir, Barnett says that while there is a two to 10 percent chance that the airline's crash record is due to chance alone, at the same time, if you were to board a USAir jet at random in the 1990's, your chances of being killed would be nine times higher than on any other airline.

by Jim Albert (1994). Journal of the American Statistical Association, 89, 1066-1074.

A recent book by Cramer and Dewan (STATS 1993 Player Profiles, published in 1992 by STATS Inc.) provides data on aspects of a player's performance, such as his batting average, in different "situations." This article uses these data to look for situations that significantly affect a player's batting average.

The author starts with Wade Boggs' performance in 1992, to see how he performed against left-handed and right-handed pitchers, pitchers that induce mainly groundballs as compared to flyball pitchers, night games as compared to day games, grass as compared to artificial turf, and home games as compared to away games. Some factors seem to have an effect and others do not. This gives a chance to illustrate that if you look at enough features of a data set, just by chance one or more will seem unusual. The author then looks at a whole group of players, asking the same kind of questions, and finally looks to see if differences established here carry over to different seasons.

Albert found that "variation in batting averages by the pitch count is dramatic--batters generally hit 123 points higher when ahead in the count than with 2 strikes" (p. 1071). Smaller but significant differences appear when facing a pitcher of opposite arm, facing a groundball pitcher rather than a flyball pitcher, and playing at home. These differences do seem to carry over to different seasons.

(A Hat Check Problem) by Marilyn vos Savant. Parade Magazine, 21 August 1994, p. 8.

Charles Price is baffled by the following problem: Take an ordinary deck of 52 cards and shuffle it. Then turn the cards over one at a time, counting as you go: ace, two, three, and so on, until you reach king; then start over again. The object is to turn over all 52 cards without having your spoken number match in rank the card that you turn over.

Price mentions that he has tried it hundreds of times and only once turned over all the cards with no match. He expected it to happen more often. Obviously, he wants to know the chance of getting through the deck without a rank match. He has to settle for Marilyn's telling him that, because the expected number of matches is four, he should not expect to succeed very often.

The origin of matching problems like this one and the related "hat check problem" can be found in a 1708 book by Montmort (Essay D'analyse Sur les Jeux De Hazard, reprinted in 1980 by Chelsea Publications). Montmort explained some of the common games of the time that involved probability--in particular, the game of Treise, which is played as follows:

One player is chosen as the banker, and the others are players. Each player puts up a stake. The banker shuffles the cards and starts dealing, calling out the cards in order: ace, two, three,..., king. The game continues until there is a rank coincidence or until the banker has dealt thirteen cards without a coincidence. If there is no match, the banker pays the players an amount equal to their stakes, and a new dealer is chosen. If there is a match, the banker wins from the players an amount equal to their stakes and starts a new round, counting again ace, two, three, etc. If he runs out of cards, he reshuffles and continues the count where he left off.

Montmort remarked that the dealer has a very favorable game and could easily get several matches before losing the deal. He despaired of finding the actual advantage but solved some related problems. He first simplified the game by assuming that the deck of cards had only 13 cards of one suit. He then found that the probability of getting through the 13 cards without a match is about 1/e = .368, providing the first solution to what is now called the "hat check" problem. Later, with the help of John Bernoulli, he showed that in drawing 13 cards from a 52-card deck the chance of not getting a rank match was .357, making it clear that the dealer has a considerable advantage.

In the problem that Price suggested, the player goes through the entire deck of 52 cards; this makes the problem harder because you can have different match patterns. Your matches might involve distinct ranks or the same ranks or both.
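Students can explore all three versions of the matching problem numerically. The sketch below computes the one-suit (hat check) probability exactly from the derangement recurrence and estimates the other two cases by simulation: the first 13 cards of a 52-card deck, as in Montmort's simplified Treise, and the full 52-card count that Price describes. It is a rough classroom illustration, not the solution given in "Frustration Solitaire"; the function names, trial counts, and seed are assumptions.

```python
import random
from fractions import Fraction

def no_match_probability_one_suit(n=13):
    """Exact probability (for n >= 1) that a shuffled deck of n distinct
    cards has no card in its own position, using the derangement
    recurrence D(k) = (k - 1) * (D(k - 1) + D(k - 2))."""
    d_prev, d_curr = 1, 0  # D(0) = 1, D(1) = 0
    factorial = 1
    for k in range(2, n + 1):
        d_prev, d_curr = d_curr, (k - 1) * (d_curr + d_prev)
        factorial *= k
    return Fraction(d_curr, factorial)

def count_rank_matches(rng, cards_dealt=52):
    """Shuffle a 52-card deck and count how many of the first cards_dealt
    cards match the rank called out (ace, two, ..., king, repeating)."""
    deck = [rank for rank in range(13) for _ in range(4)]  # four suits
    rng.shuffle(deck)
    return sum(1 for i, rank in enumerate(deck[:cards_dealt]) if rank == i % 13)

def simulate(trials=20_000, cards_dealt=52, seed=1):
    """Return (fraction of deals with no match, mean number of matches)."""
    rng = random.Random(seed)
    counts = [count_rank_matches(rng, cards_dealt) for _ in range(trials)]
    no_match_rate = sum(c == 0 for c in counts) / trials
    mean_matches = sum(counts) / trials
    return no_match_rate, mean_matches

if __name__ == "__main__":
    print(f"one suit, exact:       {float(no_match_probability_one_suit()):.3f}")
    rate13, _ = simulate(cards_dealt=13)
    print(f"13 of 52, simulated:   {rate13:.3f}")
    rate52, mean52 = simulate(cards_dealt=52)
    print(f"full deck, simulated:  {rate52:.3f} (mean matches {mean52:.2f})")
```

The mean number of matches for the full deck should come out close to four, in line with Marilyn's answer, since each of the 52 positions matches with probability 1/13. The simulated no-match rates are estimates and will vary with the seed and the number of trials.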

Further history of this problem as well as a more complete solution to the problem raised by Price can be found in an article "Frustration Solitaire" that can be read by Mosaic at the following address: http://www.geom.umn.edu/people/doyle.html