Journal of Statistics Education v.2, n.1 (1994)

Joan B. Garfield

General College

University of Minnesota

140 Appleby Hall

128 Pleasant St. S.E.

Minneapolis, MN 55455

612-625-0337 jbg@vx.cis.umn.edu

J. Laurie Snell

Department of Mathematics and Computing

Dartmouth College

Hanover, NH 03755-1890

603-646-2951 jlsnell@dartmouth.edu

This column features "bits" of information sampled from a variety of sources that may be of interest to teachers of statistics. Joan abstracts information from the literature on teaching and learning statistics, while Laurie summarizes articles from the news and other media that may be used with students to provoke discussions or serve as a basis for classroom activities or student projects. We realize that due to limitations in the literature we have access to and time to review, we may overlook some potential articles for this column, and therefore encourage you to send us your reviews and suggestions for abstracts.

Many statistics educators are familiar with the "Against All Odds" video series that David Moore helped develop. This article examines the use of videos (which imply passive learning by students as they watch a television screen) in light of current recommendations for educational reform which encourage teachers to actively engage students in learning statistics. A review of research on learning through television is presented and is used as the basis for recommendations on the appropriate use of videos in statistics classes. Some examples of instructional use are offered, such as using a video to introduce and motivate a topic, to begin a discussion, or to better explain a difficult topic.

Although this article was written to encourage high school mathematics teachers to try a regression activity with graphing calculators, the information on the activity is also appropriate for use in a college classroom. The activity described involves the analysis of men's and women's world records in the 800-meter run, to determine if women will soon be able to "outrun men." Different regression models are suggested for analyzing the data and for comparing results of analyses. The author concludes that the answer to the question of whether women will soon be able to outrun men is "possibly."

This paper describes a psychological research study on students' understanding of some probability problems involving coin tosses. Although students in this study were able to determine that four different sequences of coin tosses were all equally likely to occur, they selected some of the same sequences as "least likely" to occur, indicating that they did not really understand the idea of independence. On the basis of interviews with 20 students revealing their reasoning about these problems, some tentative conclusions were drawn as to the reason for such inconsistencies in responses. The results also indicate that standardized test items that ask students the first type of question and not the second may lead to the erroneous conclusion that students understand the idea of independence.
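The "equally likely" claim can be checked directly. The sketch below is a minimal illustration that assumes a fair coin and independent tosses; the specific sequences are hypothetical and are not taken from the study. It multiplies the per-toss probabilities to show that any particular sequence of five tosses has probability 1/32, however "patterned" it looks.

```python
from fractions import Fraction
from itertools import product


def sequence_probability(seq, p_heads=Fraction(1, 2)):
    """Probability of one exact sequence of independent tosses ('H'/'T')."""
    prob = Fraction(1)
    for toss in seq:
        # Independence: multiply the probability of each toss in turn.
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob


# Illustrative sequences (assumed for this example, not from the paper).
for seq in ["HHHHH", "HTHTH", "THHTH", "TTTTT"]:
    print(seq, sequence_probability(seq))

# The 32 possible sequences of five tosses together have probability 1.
total = sum(sequence_probability(s) for s in product("HT", repeat=5))
print("total:", total)
```

Because every sequence has the same probability, no sequence can be "least likely," which is the inconsistency the interviews explored.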

Although this might seem like a text or workbook, it is actually a unique set of problems in probability and statistics designed to help students develop an understanding of basic concepts and principles. These problems challenge students to apply their skills and conceptual understanding in different contexts or problem settings, and are not at all routine "plug numbers into a formula to generate an answer" problems. This collection includes problems requiring different levels of knowledge, spanning material from high school through graduate courses. They are arranged by topic, in increasing order of difficulty. Many answers are provided along with an extensive reference list.

This book should be read by everyone teaching a statistics course. It contains chapters by leading statistics educators on every possible topic related to teaching statistics. The book is divided into four parts. Part 1 introduces broad issues in teaching statistics, such as what the content should be and how statistics should be treated as a discipline separate from mathematics. Part 2 focuses on "Innovative Curricula" and includes chapters describing innovative approaches to teaching statistics. This section also includes a comprehensive article on the "Psychology of Learning Probability." The third part of the book examines the role of technology in teaching statistics, and the fourth part offers resources for statistics educators, including bibliographies of materials, case studies, real-world data sets, and electronic networks.

This article describes one woman's experience designing and teaching an alternative statistics course, to ensure that math-anxious women would find the subject "approachable, useful, and even fun." The author describes the components that made this statistics course a feminist course: first, the position of the professor as an authority figure was deemphasized, and, second, there was an increased sense of cooperative learning and reduced competitiveness. In addition, many strategies were used to decrease anxiety, such as the introduction of relaxation techniques to stop math-panic attacks, the elimination of timed tests, and the use of an absolute scale for grades rather than a curve.

The next three articles deal with resampling techniques in teaching statistics.

For anyone who has not yet heard about the "resampling method," this is a good introductory article on the method, how it works, and how it can be used to teach students to solve probability problems. The general procedure involves constructing a simulated universe using some randomizing mechanism, specifying rules for drawing a sample of data, generating trials of resampling data, and using these data to calculate a probability. The authors describe how this method was developed to provide students with a tool to use in solving probability problems, and give examples of how random sampling may be used to solve different problems. These authors are so enthusiastic about the use of resampling methods that they are willing to stake $5000 on a contest between students taught using these methods and students who have learned conventional methods of solving probability problems.
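The four steps of the general procedure can be sketched in a few lines of Python. This is an illustration only and is not the program developed by Simon and Bruce; the birthday problem, the function names, and the number of trials are assumptions chosen for the example. The simulated universe is the 365 days of the year, a sample is 25 birthdays drawn with replacement, and the estimated probability is the fraction of trials in which at least two birthdays coincide.

```python
import random


def estimate_shared_birthday(group_size=25, trials=10_000, seed=0):
    """Estimate P(at least two of group_size people share a birthday)."""
    rng = random.Random(seed)      # step 1: the randomizing mechanism
    universe = range(1, 366)       # step 1: universe of 365 equally likely days
    hits = 0
    for _ in range(trials):        # step 3: repeat the sampling many times
        # Step 2: rule for drawing one sample (with replacement).
        sample = [rng.choice(universe) for _ in range(group_size)]
        if len(set(sample)) < group_size:   # a repeated day means a match
            hits += 1
    return hits / trials           # step 4: proportion of successful trials


def exact_shared_birthday(group_size=25):
    """Exact answer from the usual product formula, for comparison."""
    p_all_distinct = 1.0
    for k in range(group_size):
        p_all_distinct *= (365 - k) / 365
    return 1 - p_all_distinct


print("resampling estimate:", estimate_shared_birthday())
print("exact probability:  ", round(exact_shared_birthday(), 4))
```

With 10,000 trials the estimate typically lands within a percentage point or so of the exact value, which is what makes the method usable without the formula.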

These authors find the resampling approach both "delightful and dangerous," based on their experiences with the program developed by Simon and Bruce. They discuss their use of the random sampling program and counter some of Simon and Bruce's claims about the benefits of teaching using the resampling program.

This short paper is a rebuttal to the Boomsma and Molenaar article. The authors acknowledge that resampling is not a perfect tool and has some limitations, but they continue to provide evidence that they believe indicates that resampling methods are the best way to teach students how to solve basic probability problems.

Weissglass and Cummings describe another simulation and sampling program, called "Hands-on Statistics." Examples are given of how this program may be used to solve a variety of probability and statistics problems. The program was developed to offer students opportunities to experience "dynamic visual experimentation," by simulating experiments, controlling and altering parameters, taking repeated samples, and visually comparing results of different experiments. The authors describe the pedagogical and curricular implications of using this program, and provide results of a small evaluation using a controlled experiment.

A regular component of the Teaching Bits Department will be the Table of Contents from Teaching Statistics, an international journal based in England. Brief summaries written by the authors of the articles will also be included.

In addition to the articles listed below, Teaching Statistics includes several regular departments that may be of interest. These include Data Bank, Computing Corner, Book Reviews, Problem Page, Research Report, and Curriculum Matters. The journal also includes a copy of the Newsletter of the International Association for Statistical Education.

The Circulation Manager of Teaching Statistics is Peter Holmes

Center for Statistical Education

University of Sheffield

Sheffield S3 7RH, UK. p.holmes@sheffield.ac.uk

**"Testing Colour Proportions of M&M's"** by Roger W. Johnson

Summary: We test the claimed colour proportions of plain M&M's candies using Pearson's chi-squared statistics. The null distribution of this statistic is examined through a Minitab simulation.

**"Coke or Pepsi?"** by Maita Levine and Raymond H. Rolwing

Summary: A binomial experiment dealing with a volunteer's ability to distinguish Coca-Cola from Pepsi-Cola is used to introduce the concepts of hypothesis testing and Type I and Type II errors.

**"The Game of Luk Kow"** by Ann-Lee Wang

Summary: Luk Kow is a Chinese game played with three dice. Students may find it interesting to simulate the game and to study its properties.

**"Observations on the Definition of P-values"** by John E. Freund and Benjamin M. Perles

Summary: Many books define P-values in different ways, and it is generally assumed that these definitions are all equivalent. That this is not the case may come as a surprise, but we shall demonstrate here that some definitions of P-values are not even consistent with so rudimentary a procedure as specifying a critical region.

**"An Easy Ridiculous Unbiased Estimator"** by Steven MacEachern and Elizabeth A. Stasny

Summary: The principle of unbiased estimation is very attractive to students of introductory statistics courses. Unfortunately, there exist problems in which all unbiased estimators will sometimes provide ridiculous estimates of the parameter of interest. In this paper we present an excellent example of this shortcoming.

**"Assumptions are Important: The Paired and Pooled t Test"** by J.C.W. Rayner

Summary: The relationship between the test statistics for the paired and pooled t tests, when applied to the same data set, is obtained. This assists us to understand why each is preferred over the other in appropriate circumstances.

**"The Analysis of Experimental Data: The Appreciation of Tea and Wine"** by Dennis V. Lindley

Summary: A classical experiment in the tasting of tea is used to show that many standard methods of analysis of the resulting data are unsatisfactory. A similar experiment with wine is used to show how a more sensible method may be developed.

**"Multiple Criteria Decision Making Applied to Horse Racing"** by David Windle

Summary: This article describes how horse racing data may be used to introduce the basic ideas of decision theory.

**"A Measure of Relative Dispersion"** by Joseph Eisenhauer

Summary: This paper suggests a measure of dispersion which is invariant with respect to linear transformations and distinguishes between large and small spreads.

**"Computers in the Statistics Curriculum"** by John Higgo

Summary: This article is a condensed extract from the comprehensive report Computers in the Mathematics Curriculum (1992) recently published by the Mathematical Association.

**"Variability - Does the Standard Deviation Always Measure It Adequately?"** by Louis A. Pingel

Summary: This article discusses an example where four sets of scores with different patterns of variability have the same standard deviation. It is then shown that different measures of variability lead to different conclusions about which set of scores is most variable.

**"A Legend for Teaching Normal Probability Plotting"** by Warren Gilchrist

Summary: The concept of the Legend as a teaching device is introduced and illustrated by a Legend devised to aid in the teaching of Normal Plotting. Readers are challenged to devise and submit further legends.

**"Are They Getting There?"** by Michael Rycraft

Summary: A familiar situation, lateness of trains, is considered. The problems of quantifying it are considered together with situations where the binomial distribution, confidence intervals, means, medians etc. can be used.

**"A Graphical Display for Comparing Bowlers in Cricket"** by Alan Kimber

Summary: In cricket it is of interest to compare bowlers on the basis of bowling average, economy rate and strike rate. In this article is presented a simple graphical display for making simultaneous comparisons on the basis of these three summary measures. The method is illustrated with some examples from Test Match and First-Class cricket.

Chance magazine typically provides a pair of articles that present different views on an issue of current interest. The Winter 1993 issue featured two such articles on the risk of second-hand smoke.

In 1986, the National Research Council and the U.S. Surgeon General released reports suggesting an increased risk in lung cancer in nonsmokers due to environmental tobacco smoke (ETS). The data used to make this claim came from 33 epidemiological studies estimating the risk of lung cancer for a person who does not smoke, but lives with a person who does.

Gross reports that his meta-analysis of these 33 studies yields a relative risk of 1.13 with a 95% confidence interval of 1.00 to 1.28, an increase that he observes is not statistically significant. Gross also discusses the problem of biases in the determination of the cause of death and the misclassification of smoking status.
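Students can connect the reported interval to the significance claim with a short calculation. The sketch below assumes that the interval was built with a normal approximation on the log scale, so that it is symmetric around the log of the relative risk; the article summary does not say how Gross computed it, and the function name is hypothetical. Under that assumption the standard error of the log relative risk is recovered from the interval width and then used to form a z statistic against a relative risk of 1.

```python
import math


def z_from_ratio_ci(estimate, lower, upper, z_crit=1.96):
    """Recover SE, z, and a two-sided p-value from a ratio and its 95% CI.

    Assumes the CI is exp(log(estimate) +/- z_crit * SE), i.e. a normal
    approximation on the log scale (an assumption, not stated in the source).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se              # test of H0: ratio = 1
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_sided


se, z, p = z_from_ratio_ci(1.13, 1.00, 1.28)
print(f"SE(log RR) = {se:.4f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions z is about 1.94 and the two-sided p-value is about 0.052, just short of the conventional 5% threshold. That matches an interval whose lower limit sits exactly at 1.00 and shows how close to the boundary the result is.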

Rockette, however, argues that the case should be considered from the point of view of conditions required to establish causality. He claims that biological plausibility has been established because it has been shown that passive smoking subjects a person to the same carcinogens as active smoking. Animal studies have also shown an association between ETS and cancer.

He suggests that the role of the epidemiological studies and meta-analyses is to estimate risk from exposure to low doses of the carcinogen. For this, he bases his discussion on the 30 studies of the same type included in the 1992 report of the Environmental Protection Agency. He observes that 27 out of the 30 studies estimated risks greater than one. He finds these studies as convincing as Gross found the 33 used by the 1986 reports unconvincing.

Rockette admits that these uncontrolled studies are subject to the standard problems of confounders, bias, etc. Still, as a public policy question, he argues that when a carcinogen has been established, it is reasonable to attempt to control it even when it is not feasible to gather adequate statistical power to detect low risks.

A group of flight attendants and heirs of attendants who have died have brought a class action suit in Florida against the tobacco companies contending that exposure to cigarette smoke caused the attendants' illnesses and deaths.

The lawyer for the group took sworn testimony from executives of several tobacco companies. The Times article gives exchanges from this testimony. Here is one sample, but students might be encouraged to read all of the answers. They are truly amazing!

This sample is from the interview with Andrew H. Tisch, Chairman and Chief Executive of Lorillard Tobacco Company.

Q: Does cigarette smoking cause cancer?

Mr. Tisch: I don't believe so.

Q: Based on what?

A: Based on my understanding of the scientific and statistical evidence that's been presented.

Q: What is your understanding of the scientific and statistical evidence that's been presented?

A: There's been no conclusive scientific evidence that's been presented that convinces me that cigarette smoking causes cancer.

Q: So since you, as the chairman and chief executive of Lorillard, believe that it hasn't been proven that smoking causes lung cancer, heart disease and emphysema, why do you have that on your packages?

A: Because this is what is required of us as a matter of law.

Q: You have to do it?

A: That is correct.

Q: If you didn't have to do it, you wouldn't do it?

A: Not necessarily. . . .

Q: As far as you're concerned, Mr. Tisch, as the chairman and chief executive officer of Lorillard, this warning on the package which says that smoking causes lung cancer, heart disease and emphysema is inaccurate? You don't believe it's true?

A: That's correct.

Q: Because if you believed it were true, in good conscience you wouldn't sell this to Americans, would you, or foreigners for that matter?

A: That's correct.

Another executive was not so generous, stating that he did not really care whether cigarettes caused lung cancer as long as he was selling a legal product.

W. Edwards Deming died on December 20, 1993, at age 93. Two of the more complete commentaries on his work were:

The New York Times article tells the familiar story of Deming's initial successes with quality control, first in Japan and then in the United States with Ford and Xerox. It mentions his general theory of management and his vigorous life; he ran workshops right up to his death.

The article in the Financial Times ties in Deming's work with others, including his collaboration with the founder of quality control, Walter Shewhart, and Deming's co-worker in Japan, Joseph Juran. (Juran was himself on his final tour at age 89.)

While the author does not limit himself to quality control, this article and the comments on it give a good "non hype" view of the work Deming and others did to bring quality control to industry.

Some of Banks' statements are controversial:

That statistical methodology is responsible for the Japanese success story is a myth fostered by the 1980 NBC television special on the Deming story "If Japan Can, Why Can't We?"

Banks feels that the ideas behind Total Quality Management (TQM) are useful, simple, and not deep. He remarks that "TQM has worked very successfully in diverse industries, and the nation's economic engine would doubtless run more smoothly if it were more widely employed." He goes on to say, however: "There are dangers in TQM. As implemented, it tends to be enshrined, and this stifles creative solutions." This assessment of TQM guarantees a lively response by the discussants.

A more positive view of TQM and of Deming's work is presented by Robert Hogg, one of the discussants.

The article, and the discussions that follow it, provide a down-to-earth and insightful analysis of Total Quality Management, in particular, and industrial statistics in general.

There are currently a lot of articles on whether to recommend mammograms for women under 50. The next two by Gina Kolata provide good summaries of the statistical, ethical, political, and economic issues involved in this decision.

Eight studies in the past 30 years have consistently found that routine mammograms for women over 50 reduce the death rate from breast cancer by at least 25%. These studies have not found that routine mammograms significantly reduce the risk of death from breast cancer for younger women. The eight studies include a total of 173,000 women in their 40's. As a group, the studies do not show a statistically significant advantage for those younger women who had mammograms.

Based on these studies, the National Cancer Institute now recommends that women under 50 be informed of the results of present scientific studies and be encouraged to make their own choice. (The NCI previously recommended that women under 50 have mammograms.) Other agencies continue to recommend mammograms for women under 50. The first article provides expert opinions on both sides of this issue.

Those who argue for testing remark that "it is better to be safe than sorry" and that women who do not have the test and later develop breast cancer will blame themselves. Proponents of testing also suggest that the studies are not very convincing. They claim, for example, that the patients were not followed for a sufficient length of time and that mammogram techniques have significantly improved since these studies.

On the other side are the arguments that the large number of false positives leads to unnecessary fears, additional testing and even surgery, and that the expense for a small number of successes is huge.

The second article emphasizes the economic issues. Using the number of women who had mammograms in 1990, it is estimated that the cost is about half a billion dollars. It is argued that, in thinking about a national health plan, this is a lot of money to spend on testing that may not have a statistically significant effect.

Of course, some women under forty who develop breast cancer could have prevented it by such tests. Thus, women argue that they should not be treated as "statistics", but as human beings, and if they want to have a mammogram, the test should be paid for.

This shapes up to be a mighty battle between those who feel that, if the line is not held here, it will never be held, and health care costs will continue to spiral -- and a powerful movement among women to do everything possible to decrease the very high mortality rate of breast cancer.

Ever since Tversky and Gilovich (Chance, 2(1), Winter 1989, 16-21) challenged the evidence for streaks in basketball, sports fans have wanted to show that streaks exist. This article is the most ambitious effort to date. Albright obtained data on the hitting performance of all major league players who had at least 500 at bats in a given year. This gave him 501 season records, including four years of data for forty of the players. In addition to the outcome of each at bat, the data provide information on variables thought to be correlated with hitting performance such as home or away game, earned run average, etc.

Albright found only a couple of players with significantly streaky performance, as would be expected by chance in 501 sequences of about 500 Bernoulli trials each. He concludes that "actual performance is being generated in a manner reasonably consistent with a model of randomness (biased coin-tossing)." Little support for streakiness came from correlating hitting with the additional variables.
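The point that some records will look streaky by chance alone can be demonstrated with a simulation. The sketch below is not Albright's analysis: the summary does not state which tests or significance levels he used, so the Wald-Wolfowitz runs test, the hit probability of 0.27, and the one-sided 5% cutoff are all assumptions made for illustration. Each simulated season is a sequence of independent at bats with a constant hit probability, and a season is flagged as "streaky" when it has significantly fewer runs than expected.

```python
import math
import random


def runs_z(seq):
    """Wald-Wolfowitz runs-test z statistic for a 0/1 sequence.

    Negative values mean fewer runs than expected, i.e. streakier data.
    """
    n1 = sum(seq)
    n2 = len(seq) - n1
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        return 0.0
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if var <= 0:
        return 0.0
    return (runs - mean) / math.sqrt(var)


def count_flagged(records=501, at_bats=500, p_hit=0.27,
                  z_cutoff=-1.645, seed=1):
    """Count purely random seasons that the runs test flags as streaky."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(records):
        season = [1 if rng.random() < p_hit else 0 for _ in range(at_bats)]
        if runs_z(season) < z_cutoff:
            flagged += 1
    return flagged


print("seasons flagged as streaky:", count_flagged())
```

Since the simulated hitters have no streakiness at all, every flag is a false alarm; at a 5% cutoff roughly 5% of the 501 records are flagged, and a stricter cutoff (a more negative `z_cutoff`) flags fewer. Students can compare this baseline with the number of streaky players found in real data.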

The commentators made additional interesting analyses of the data and seemed less convinced than Albright that streakiness could not be shown from the data. You can obtain the data from Professor Albright (albright@indiana.edu) and do your own investigation. We did and it was great fun.

This article reviews the now familiar argument that the extremely small probabilities for a chance match in DNA fingerprinting may be suspect because of errors in measurement and because of an unjustified assumption of independence between different bands.

The article remarks that statisticians also point to this anomaly: The odds usually quoted represent the probability of getting a very good match between the DNA of a person not connected with the crime and the evidence. The jury, however, needs the probability of the accused's being unconnected with the crime, given a very good match between the accused's DNA and the evidence. The latter probability depends upon a prior estimate for the probability of guilt and can be significantly higher when other evidence is weak.
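The distinction between the two probabilities is an application of Bayes' rule, and a small calculation makes it concrete. In the sketch below all numbers are hypothetical, the function name is an assumption, and a connected person is assumed to match with certainty unless another value is supplied. The function converts the quoted match probability for an unconnected person into the probability the jury needs, the chance that the accused is unconnected given a match.

```python
def prob_unconnected_given_match(prior_connected,
                                 p_match_if_unconnected,
                                 p_match_if_connected=1.0):
    """Bayes' rule: P(unconnected | match).

    prior_connected is the probability, based on the other evidence alone,
    that the accused is the source of the sample (a hypothetical input).
    """
    unconnected_and_match = (1 - prior_connected) * p_match_if_unconnected
    connected_and_match = prior_connected * p_match_if_connected
    return unconnected_and_match / (unconnected_and_match + connected_and_match)


quoted_match_probability = 1e-6   # hypothetical "one in a million" figure

# Strong other evidence: prior probability of connection 0.5.
print(prob_unconnected_given_match(0.5, quoted_match_probability))

# Weak other evidence: the accused is just one of a million possible sources.
print(prob_unconnected_given_match(1e-6, quoted_match_probability))
```

With strong other evidence the posterior probability of being unconnected is close to the quoted one in a million, but with the weak prior it rises to about one half, even though the quoted match probability is unchanged.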

A discussion of the use of Bayes probabilities in DNA fingerprinting can be found in articles by Berry and Cohen in Chance, 3(3), 1990.

A. K. Dewdney, well known for his columns on Mathematical Games and on computing in Scientific American, has written a book on numeracy. Through the years his readers, whom he calls "abuse detectives", have sent him interesting examples of the abuse of mathematics from the press.

In this book, Dewdney presents many of these examples along with his ideas on what an average person should know about mathematics to avoid being misled by advertisements and news reports. Not surprisingly, statistics and probability play an important role in his examples.

The title of the book came from an advertisement which claimed that a new light-bulb would save 200 percent on electricity costs. A reader wrote to the company suggesting that he should be paid for using the light-bulbs since, if the claim is correct, he would be generating electricity.

Finally, a classic article from the past:

In 1982 Gould learned that he was suffering from abdominal mesothelioma, a rare and serious cancer usually associated with exposure to asbestos. Being the scientist that he is, he immediately went to the Harvard Medical Library to learn all about this disease. He learned that mesothelioma is incurable, with a median survival of only eight months after discovery. Gould remarks that most people without training in statistics would assume that this means that they would probably die within eight months.

From his own research he knew all too well the vagaries of variation from the mean and median. He verified his hunch that the distribution of length of life was right skewed. With additional information about himself -- he was young and had a strong desire to live -- he convinced himself that he had a good chance of being far out in the tail of the distribution. Indeed, when he was put on an experimental treatment protocol, he could envision another distribution altogether.

Students who read this article never forget what the median is and are delighted to learn that Gould is now very far into the tail of the distribution!
