ISSN 1069-1898


An International Journal on the Teaching and Learning of Statistics

JSE Volume 12, Number 2 Abstracts

Dominique Haughton and Nguyen Phong
Graphical and Numerical Descriptive Analysis: Exploratory Tools Applied to Vietnamese Data

This case study covers several exploratory data analysis tools: the histogram and boxplot, kernel density estimates, the recently introduced bagplot (a two-dimensional extension of the boxplot), and the violin plot, which combines a boxplot with a density shape plot. We apply these tools to data on living standards in Vietnam and demonstrate how to interpret their output. The level of the presentation is suitable for an upper-level undergraduate or beginning graduate course in applied statistics. We use data from the Vietnam Living Standards Survey of 1998 (VLSS98) and from the 2000 Vietnam statistical yearbook, the statistical package Stata, and special programs provided by the authors who introduced the bagplot and the violin plot.

Key Words: Bagplots; Boxplots; Histograms; Kernel density estimators; Vietnam Living Standards Surveys; Violin plots.

Mervyn G. Marasinghe, William M. Duckworth, and Tae-Sung Shin
Tools for Teaching Regression Concepts Using Dynamic Graphics

This paper extends work on the construction of instructional modules that use graphical and simulation techniques for teaching statistical concepts (Marasinghe et al. 1996; Iverson and Marasinghe 2001). These modules consist of two components: a software part and a lesson part. The software part is a computer program written in LISP-STAT with a highly interactive user interface that the instructor and the students can use to explore various ideas and concepts. The lesson part is a prototype document that guides instructors in creating their own lessons using the software module. It includes a description of the concepts to be covered, instructions on how to use the module, and some exercises. The regression modules described here are designed to illustrate various concepts associated with regression model fitting: the use of residuals and other case diagnostics to check for model adequacy, the assessment of the effects of transforming the response variable on the regression fit using well-known diagnostic plots, and the use of statistics to measure the effects of collinearity on model selection.

Key Words: Active learning; Education; Lisp-Stat; Regression diagnostics; Simulation; Statistics instruction.

Mu Zhu and Arthur Y. Lu
The Counter-intuitive Non-informative Prior for the Bernoulli Family

In Bayesian statistics, the choice of the prior distribution is often controversial. Different rules for selecting priors have been suggested in the literature, and these sometimes produce priors that students find difficult to understand intuitively. In this article, we use a simple heuristic to illustrate to students the rather counter-intuitive fact that flat priors are not necessarily non-informative, and non-informative priors are not necessarily flat.

Key Words: Conjugate priors; Maximum likelihood estimation; Posterior mean.
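
As an illustrative sketch (not taken from the article itself), the standard conjugate Beta-Bernoulli update gives a feel for this claim. With a Beta(a, b) prior and x successes in n trials, the posterior mean is (a + x) / (a + b + n). Under the flat prior Beta(1, 1) the posterior mean is pulled toward 1/2, whereas the limiting improper prior Beta(0, 0), which is far from flat, reproduces the maximum likelihood estimate x/n. The function name and the counts below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def posterior_mean(a, b, successes, trials):
    """Posterior mean of p under a Beta(a, b) prior after Bernoulli data.

    Uses the conjugate update Beta(a + x, b + n - x), whose mean is
    (a + x) / (a + b + n). For a = b = 0 the prior is improper, and the
    posterior is proper only when 0 < successes < trials.
    """
    return Fraction(a + successes, a + b + trials)


# Hypothetical data: 3 successes in 10 trials.
x, n = 3, 10

mle = Fraction(x, n)                      # maximum likelihood estimate x/n
flat = posterior_mean(1, 1, x, n)         # flat prior Beta(1, 1)
limiting = posterior_mean(0, 0, x, n)     # improper limiting prior Beta(0, 0)

print("MLE:", mle)
print("Posterior mean, flat prior:", flat)
print("Posterior mean, Beta(0, 0) prior:", limiting)
```

In this sketch the flat prior behaves like one extra success and one extra failure added to the data, so it does carry information, while the prior that leaves the estimate untouched is not flat.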

Margaret H. Smith
A Sample/Population Size Activity: Is it the sample size or the sample size as a fraction of the population that matters?

Unless the sample encompasses a substantial portion of the population, the standard error of an estimator depends on the size of the sample, but not the size of the population. This is a crucial statistical insight that students find very counter-intuitive. After trying several ways of convincing students of the validity of this principle, I have finally found a simple memorable activity that convinces students beyond a reasonable doubt. As a bonus, the data generated by this activity can be used to illustrate the central limit theorem, confidence intervals, and hypothesis testing.

Key Words: Sample size; Sampling distribution; Standard error.
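
The principle can be checked numerically with the textbook formula for the standard error of a sample mean drawn without replacement, sigma / sqrt(n) multiplied by the finite population correction sqrt((N - n) / (N - 1)). The sketch below is not the classroom activity described in the article; the function name and the values of sigma, n, and N are assumptions made for illustration.

```python
import math


def standard_error(sigma, n, population_size=None):
    """Standard error of the sample mean.

    With population_size=None the population is treated as effectively
    infinite. Otherwise the finite population correction
    sqrt((N - n) / (N - 1)) for sampling without replacement is applied.
    """
    se = sigma / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


sigma, n = 10.0, 100  # hypothetical population SD and sample size

for N in (200, 10_000, 1_000_000):
    se = standard_error(sigma, n, N)
    print(f"N = {N:>9,}  n/N = {n / N:.4f}  SE = {se:.4f}")

print(f"Infinite population: SE = {standard_error(sigma, n):.4f}")
```

With n fixed at 100, the standard error barely changes between a population of ten thousand and one of a million. Only when the sample is a substantial share of the population (here, half of it) does the correction make a noticeable difference.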

Kim I. Melton
Statistical Thinking Activities: Some Simple Exercises with Powerful Lessons

Statistical thinking is required for good statistical analysis. Among other things, statistical thinking involves identifying sources of variation. Students in introductory statistics courses seldom recognize that one of the largest sources of variation may come in the collection and recording of the data. This paper presents some simple exercises that can be incorporated into any course (not just statistics) to help students understand some of the sources of variation in data collection. Primary attention is paid to operational definitions used in the data collection process.

Key Words: Data collection; Operational definitions.

Cengiz Alacaci
Inferential Statistics: Understanding Expert Knowledge and its Implications for Statistics Education

This study investigated the knowledge base necessary for choosing appropriate statistical techniques in applied research. We compared the knowledge used by six experts and six novices in two types of statistical tasks: (1) comparing research scenarios from the perspective of choosing a statistical technique, and (2) direct comparison of statistical techniques. The framework was based on expert knowledge in inferential statistics using the repertory grid technique for data collection. A qualitative analysis of data showed that of the three types of expert knowledge, research design knowledge comprised the biggest portion, with theoretical and procedural knowledge comprising relatively smaller parts. Little difference was observed between experts and novices in extensiveness of knowledge use, although experts' knowledge use was found to be more integrated than novices'. Finally, two implications were drawn regarding how to better teach selection skills in statistics education: (1) statistical techniques should be taught in relation to relevant research designs, and (2) conceptual connections between statistical techniques should be explicitly taught.

Key Words: Knowledge structure; Selection skills; Statistical expertise; Statistical literacy; Statistical techniques.

Datasets and Stories

Neil Binnie
Using EDA, ANOVA and Regression to Optimise some Microbiology Data

Bacteria are cultured in medical laboratories to identify them so patients can be treated correctly. The tryptone dataset contains measurements of bacteria counts following the culturing of five strains of Staphylococcus aureus. It also contains the time of incubation, temperature of incubation and concentration of tryptone, a nutrient. The question is whether the conditions recommended in the protocols for culturing these strains are optimal. The task is to find the incubation time, temperature and tryptone concentration that optimise the growth of this bacterium. This data may be explored by students at several levels. Graphical methods can be used to investigate the relationship between the variables. ANOVA can be used with one-way, two-way and factorial models with interactions, to identify significant factors. Multiple polynomial regression methods can be used to model the data, with optimal conditions estimated by partial differentiation.

Key Words: Analysis of variance; Exploratory data analysis; Interactions; Multiple regression; Optimisation; Outlier; Polynomial regression.
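
To make the last step concrete, here is a generic worked form of optimisation by partial differentiation for a fitted quadratic surface in two predictors. The symbols (response y, predictors x_1 and x_2, coefficients beta) are assumed notation for illustration and are not the article's fitted model.

```latex
% Assumed second-order model in two predictors (illustrative notation)
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2
        + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2

% Setting both partial derivatives to zero gives a linear system
\frac{\partial \hat{y}}{\partial x_1} = \beta_1 + 2\beta_{11} x_1 + \beta_{12} x_2 = 0
\qquad
\frac{\partial \hat{y}}{\partial x_2} = \beta_2 + \beta_{12} x_1 + 2\beta_{22} x_2 = 0

% Its solution is the stationary point
\begin{pmatrix} x_1^{*} \\ x_2^{*} \end{pmatrix}
= -\begin{pmatrix} 2\beta_{11} & \beta_{12} \\ \beta_{12} & 2\beta_{22} \end{pmatrix}^{-1}
   \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}

% which is a maximum when the Hessian is negative definite
\beta_{11} < 0, \qquad 4\beta_{11}\beta_{22} - \beta_{12}^2 > 0
```

With a single predictor this reduces to x* = -beta_1 / (2 beta_11), a maximum when beta_11 < 0.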

Copyright © 2004 American Statistical Association. All rights reserved.