A CURRICULUM FRAMEWORK FOR
PREK-12 STATISTICS EDUCATION
Writers
Christine Franklin
Gary Kader
Denise S. Mewborn
Jerry Moreno
Roxy Peck
Mike Perry
Richard Scheaffer
Advisors
Susan Friel
Landy Godbold
Brad Hartlaub
Peter Holmes
Cliff Konold
Presented to the American Statistical Association
Board of Directors for Endorsement
March 2005
Table of Contents
Introduction
Level A
Level B
Level C
References
Appendix for Level A
Appendix for Level B
Appendix for Level C
Introduction
The Ultimate Goal: Statistical Literacy
Citizenship
Personal Choices
Statistical literacy is required for daily personal choices. Statistics provides information on the composition of foods and thus informs our choices at the grocery store. Statistics helps to establish the safety and effectiveness of drugs to help us choose a treatment. Statistics helps to establish the safety of toys to assure that our little ones are not at risk. Our investment choices are guided by a plethora of statistical information about stocks and bonds. The Nielsen ratings decide which shows will survive on television and thus affect what is available. Many products have a previous statistical history and our choices of products can be affected by awareness of this history. The design of an automobile is aided by anthropometrics, the statistics of the human body, to enhance passenger comfort. Statistical ratings of fuel efficiency, safety and reliability are available to help us select a vehicle.
The Workplace and Professions
Efforts to improve quality and accountability are prominent among the many ways that statistical thinking and tools can be used to enhance productivity. The competitive marketplace demands quality. Quality control practices such as the statistical monitoring of design and manufacturing processes identify where improvement can be made and lead to better product quality. Systems of accountability can help produce more effective employees and organizations, but many accountability systems now in place are not based on sound statistical principles and may, in fact, have the opposite effect from the one desired. Good accountability systems require proper use of statistical tools to determine and apply appropriate criteria.
Science
The Food and Drug Administration requires extensive testing of drugs to determine effectiveness and side effects before they can be sold. A recent advertisement for a drug designed to reduce blood clots stated, "PLAVIX, added to aspirin and your current medications, helps raise your protection against heart attack or stroke." But the advertisement also warns that "The risk of bleeding may increase with PLAVIX..."
This was determined by a clinical trial involving over 12,000 subjects. Among the 6,259 subjects taking PLAVIX plus aspirin, 3.7% showed major bleeding problems, while only 2.7% of the 6,303 taking the placebo had major bleeding. This is viewed as a "statistically significant" result.
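The source does not say which test the trial used. As a rough illustration only, the sketch below applies a standard large-sample two-proportion z-test to the rounded percentages quoted above. The function name and the choice of test are assumptions made for this example, not a description of the actual trial analysis.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Return (z, two_sided_p) for the difference between two sample proportions."""
    # Pooled proportion under the hypothesis that both groups share one bleeding rate.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference in proportions under that hypothesis.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Rounded figures from the text: 3.7% of 6,259 versus 2.7% of 6,303.
z, p_value = two_proportion_z_test(0.037, 6259, 0.027, 6303)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A small p-value here means that a gap this large between the two groups would rarely arise from chance alone if the two bleeding rates were really equal, which is the sense in which the result is called statistically significant.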
Statistical literacy involves a healthy dose of skepticism about "scientific" findings. Is the information about side effects of PLAVIX treatment reliable? A statistically literate person should ask such questions and be able to answer them intelligently. A statistically literate high school graduate will be able to understand the conclusions from scientific investigations and to offer an informed opinion about the legitimacy of the reported results. To quote from Mathematics and Democracy: The Case for Quantitative Literacy (Steen, 2001), such knowledge "empowers people by giving them tools to think for themselves, to ask intelligent questions of experts, and to confront authority confidently. These are the skills required to thrive in the modern world."
Summary
The Case for Statistics Education
Over the past quarter century, statistics (often labeled data analysis and probability) has become a key component of the K-12 mathematics curriculum. Advances in technology and in modern methods of data analysis in the 1980s, coupled with the data richness of society in the information age, led to the development of curriculum materials geared toward introducing statistical concepts into the school curriculum as early as the elementary grades. This grass-roots effort was given sanction by the National Council of Teachers of Mathematics (NCTM) when its influential document Curriculum and Evaluation Standards for School Mathematics (NCTM, 1989) included Data Analysis and Probability as one of the five content strands. As this document and its 2000 replacement, Principles and Standards for School Mathematics (NCTM, 2000), became the basis for reform of mathematics curricula in many states, the acceptance of and interest in statistics as part of mathematics education gained strength. In recent years many mathematics educators and statisticians have devoted large segments of their careers to the improvement of statistics education materials and pedagogical techniques.
NCTM is not the only group calling for improved statistics education beginning at the school level. The National Assessment of Educational Progress (NAEP, 2005) is developed around the same strands as in the NCTM Standards, with data analysis and probability questions playing an increasingly prominent role in the NAEP exam.
The emerging quantitative literacy movement calls for greater emphasis on practical quantitative skills that will help assure success for high school graduates in life and work; many of these skills are statistical in nature. To quote from Mathematics and Democracy: The Case for Quantitative Literacy (Steen, 2001):
• Quantitative literacy, also called numeracy, is the natural tool for comprehending information in the computer age. The expectation that ordinary citizens be quantitatively literate is primarily a phenomenon of the late twentieth century.
• Unfortunately, despite years of study and life experience in an environment immersed in data, many educated adults remain functionally illiterate.
• Quantitative literacy empowers people by giving them tools to think for themselves, to ask intelligent questions of experts, and to confront authority confidently. These are the skills required to thrive in the modern world.
A recent study from the American Diploma Project, entitled Ready or Not: Creating a High School Diploma That Counts, reports that the "must have" competencies needed for high school graduates "to succeed in postsecondary education or in high-performance, high-growth jobs" include, in addition to algebra and geometry, aspects of data analysis, statistics, and other applications that are vitally important for other subjects as well as for employment in today's data-rich economy.
Statistics education as proposed in this Framework can enable the "must have" competencies for graduates to "thrive in the modern world".
NCTM Standards and the Framework
The main objective of this document is to provide a conceptual Framework for K-12 statistics education. The foundation for this Framework rests on the NCTM Principles and Standards for School Mathematics (2000).
The Framework is intended to support the objectives of the NCTM Principles and Standards. It is intended to complement the NCTM recommendations, not to supplant them.
The NCTM Principles and Standards describes the statistics content strand as follows.
Instructional programs from pre-kindergarten through grade 12 should enable all students to:
• formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them;
• select and use appropriate statistical methods to analyze data;
• develop and evaluate inferences and predictions that are based on data;
• understand and apply basic concepts of probability.
The Data Analysis and Probability Standard recommends that students formulate questions that can be answered using data and addresses what is involved in gathering and using the data wisely. Students should learn how to collect data, organize their own or others' data, and display the data in graphs and charts that will be useful in answering their questions. This Standard also includes learning some methods for analyzing data and some ways of making inferences and drawing conclusions from data. The basic concepts and applications of probability are also addressed, with an emphasis on the way that probability and statistics are related.
The NCTM Standards elaborates on these themes somewhat and provides examples of the types of lessons and activities that might be used in a classroom. More complete examples can be found in the NCTM Navigation Series on Data Analysis and Probability (2002-2004). Statistics, however, is a relatively new subject for many teachers who have not had an opportunity to develop sound knowledge of the principles and concepts underlying the practices of data analysis that they are now called upon to teach. These teachers may not clearly understand the difference between statistics and mathematics. They may not see the statistics curriculum for grades K-12 as a cohesive and coherent curriculum strand, or see how the overall statistics curriculum provides a developmental sequence of learning experiences.
This Framework provides a conceptual structure for statistics education which gives a coherent picture of the overall curriculum. This structure adds to but does not replace the NCTM recommendations.
The Difference between Statistics and Mathematics
"Statistics is a methodological discipline. It exists not for itself but rather to offer to other fields of study a coherent set of ideas and tools for dealing with data. The need for such a discipline arises from the omnipresence of variability" (Cobb and Moore, 1997).
A major objective of statistics education is to help students develop statistical thinking. Statistical thinking, in large part, must deal with this omnipresence of variability; statistical problem solving and decision making depend on understanding, explaining and quantifying the variability in the data.
It is this focus on variability in data that sets statistics apart from mathematics.
The Nature of Variability
There are many different sources of variability in data. Some of the important sources are described below.
Measurement Variability
Natural Variability
Variability is inherent in nature. Individuals are different. When we measure the same quantity across several individuals we are bound to get some differences in the measurements. Although some of this may be due to our measuring instrument, most of it is simply due to the fact that individuals differ. People naturally have different heights, different aptitudes and abilities, or different opinions and emotional responses. When we measure any one of these traits we are bound to get variability in the measurements. Different seeds for the same variety of bean will grow to different sizes when subjected to the same environment because no two seeds are exactly alike; there is bound to be variability from seed to seed in the measurements of growth.
Induced Variability
If we plant one pack of bean seeds in one field, and another pack of seeds in another location with a different climate, then an observed difference in growth between the seeds in one location and those in the other might be due to inherent differences in the seeds (natural variability), or it might be due to the fact that the locations are not the same. If one type of fertilizer is used on one field and another type on the other, then observed differences might be due to the difference in fertilizers. For that matter, the observed difference might be due to a factor that we haven't even thought about. A more carefully designed experiment can help us to determine the effects of different factors.
This one basic idea, comparing natural variability to the variability induced by other factors, forms the heart of modern statistics. It has allowed medical science to conclude that some drugs are effective and safe, whereas others are ineffective or have harmful side effects. It has been employed by agricultural scientists to demonstrate that a variety of corn grows better in one climate than another, that one fertilizer is more effective than another, or one type of feed is better for beef cattle than another.
Sampling Variability
In a voter poll, it seems reasonable to use the proportion of surveyed voters who support a particular candidate (a sample statistic) as an estimate of the unknown proportion of all voters who support that candidate. But if a second sample of the same size is used, it is almost certain that there would not be exactly the same proportion of voters in the sample who support the candidate. The value of the sample proportion will vary from sample to sample. This is called sampling variability. So what is to keep one sample from estimating that the true proportion is 0.60 and another from saying it is 0.40? Such a discrepancy is possible but unlikely if proper sampling techniques are used. Poll results are useful because these techniques and an adequate sample size can assure that unacceptable differences among samples are quite unlikely.
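Sampling variability can be seen directly by simulation. The sketch below draws repeated random samples from a hypothetical population in which exactly half of all voters support the candidate. The true proportion of 0.50, the sample size of 1,000, and the 500 repetitions are all assumed values chosen for illustration.

```python
import random

def sample_proportions(true_p, sample_size, num_samples, seed=0):
    """Simulate repeated polls and return the sample proportion from each one."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    proportions = []
    for _ in range(num_samples):
        # Each surveyed voter supports the candidate with probability true_p.
        supporters = sum(1 for _ in range(sample_size) if rng.random() < true_p)
        proportions.append(supporters / sample_size)
    return proportions

props = sample_proportions(true_p=0.50, sample_size=1000, num_samples=500)
print(f"smallest sample proportion: {min(props):.3f}")
print(f"largest sample proportion:  {max(props):.3f}")

# Share of simulated polls that miss as badly as 0.40 or 0.60.
share_far_off = sum(1 for p in props if p <= 0.40 or p >= 0.60) / len(props)
print(f"share of polls at or beyond 0.40 / 0.60: {share_far_off}")
```

The sample proportions differ from poll to poll, but with random selection and a sample of this size they cluster tightly around the true value, so results as far apart as 0.40 and 0.60 are extremely unlikely.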
An excellent discussion of the nature of variability is given in Utts (1999).
The Role of Context
"The focus on variability naturally gives statistics a particular content that sets it apart from mathematics itself and from other mathematical sciences, but there is more than just content that distinguishes statistical thinking from mathematics. Statistics requires a different kind of thinking, because data are not just numbers, they are numbers with a context" (Cobb and Moore, 1997).
Many mathematics problems arise from applied contexts, but the context is removed to reveal mathematical patterns.
Statisticians, like mathematicians, look for patterns, but the meaning of the patterns depends on the context.
"In mathematics, context obscures structure. In data analysis, context provides meaning" (Cobb and Moore, 1997).
A graph, which appears occasionally in the business section of newspapers, shows a plot of the Dow Jones Industrial Average (DJIA) over a ten-year period. The variability of stock prices draws the attention of an investor. This stock index may go up or down over some intervals of time, may fall or rise sharply over a short period. In context the graph raises questions. A serious investor is not only interested in when or how rapidly the index goes up or down, but also why. What was going on in the world when the market went up, what was going on when it went down? But strip away the context. Remove time (years) from the horizontal axis and call it "X", remove stock value (DJIA) from the vertical axis and call it "Y", and there remains a graph of very little interest or mathematical content!
Probability
Probability is a tool for statistics
Probability is an important part of any mathematical education. It is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics. Probability is an essential tool in applied mathematics and mathematical modeling. It is also an essential tool in statistics.
The use of probability as a mathematical model and the use of probability as a tool in statistics employ not only different approaches, but also different kinds of reasoning.
Two problems and the nature of the solutions will illustrate the difference.
Problem 1
Assume a coin is "fair".
Question: If we toss the coin 5 times, how many heads will we get?
Problem 2
You pick up a coin.
Question: Is this a fair coin?
Problem 1 is a mathematical probability problem.
Problem 2 is a statistics problem that can use the mathematical probability model determined in problem 1 as a tool to seek a solution.
The answer to neither question is deterministic. Coin tossing produces random outcomes, which suggests that the answer is probabilistic. The solution to problem 1 starts with the assumption that the coin is fair and proceeds to logically deduce the numerical probabilities for each possible number of heads 0, 1, ..., 5.
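The deduction in problem 1 can be carried out with the binomial model, under which the probability of k heads in n independent tosses is C(n, k) p^k (1 - p)^(n - k). The short sketch below is one way to compute it; the function name is chosen for this example only.

```python
from math import comb

def heads_distribution(num_tosses, p_heads=0.5):
    """Probability of each possible number of heads under the binomial model."""
    return [
        comb(num_tosses, k) * p_heads**k * (1 - p_heads) ** (num_tosses - k)
        for k in range(num_tosses + 1)
    ]

fair_model = heads_distribution(5)
for k, prob in enumerate(fair_model):
    print(f"P({k} heads in 5 tosses) = {prob}")
```

For a fair coin the six probabilities are 1/32, 5/32, 10/32, 10/32, 5/32, and 1/32, so 2 or 3 heads are the most likely results but no single outcome is certain.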
The solution to problem 2 starts with an unfamiliar coin; we don't know if it is fair or biased. The search for an answer is experimental - toss the coin and see what happens. Examine the resulting data to see if it looks like it came from a fair coin or a biased coin. There are several possible approaches, including: Toss the coin 5 times and record the number of heads. Then do it again: Toss the coin 5 times and record the number of heads. Repeat 100 times. Compile the frequencies of outcomes for each possible number of heads. Compare these results to the frequencies predicted by the mathematical model for a fair coin in problem 1. If the empirical frequencies from the experiment are quite dissimilar from those predicted by the mathematical model for a fair coin and are not likely to be caused by random variation in coin tosses, then we conclude the coin is not fair. In this case we induce an answer by making a general conclusion from observations of experimental results.
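The experimental approach to problem 2 can be imitated on a computer. In the sketch below a simulated coin stands in for the physical one, and a chi-square statistic is used as one possible summary of how dissimilar the observed frequencies are from the fair-coin model. The heads probability of 0.8 for the "biased" coin, the seed, and the choice of the chi-square statistic are assumptions made for illustration; the text above does not prescribe a particular comparison method.

```python
import random
from math import comb

def expected_counts(num_tosses, repetitions, p_heads=0.5):
    """Counts predicted by the binomial model for each number of heads."""
    return [
        repetitions * comb(num_tosses, k) * p_heads**k * (1 - p_heads) ** (num_tosses - k)
        for k in range(num_tosses + 1)
    ]

def observed_counts(p_heads, num_tosses, repetitions, rng):
    """Toss a simulated coin num_tosses times, repeat, and tally the heads counts."""
    counts = [0] * (num_tosses + 1)
    for _ in range(repetitions):
        heads = sum(1 for _ in range(num_tosses) if rng.random() < p_heads)
        counts[heads] += 1
    return counts

def chi_square_statistic(observed, expected):
    """Sum of squared differences between observed and expected, scaled by expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

rng = random.Random(2005)
expected = expected_counts(num_tosses=5, repetitions=100)
print("expected for a fair coin:", expected)
for label, p_heads in [("fair coin", 0.5), ("biased coin", 0.8)]:
    observed = observed_counts(p_heads, 5, 100, rng)
    statistic = chi_square_statistic(observed, expected)
    print(f"{label}: observed = {observed}, chi-square = {statistic:.1f}")
```

A statistic near zero means the data look much like the fair-coin model, while a large value signals frequencies that random variation alone would rarely produce. As a rough guide, values above about 11.07 occur only about 5% of the time for a fair coin, although that cutoff is approximate here because some expected counts are small.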
Probability and Chance Variability
Two important uses of "randomization" in statistical work occur in sampling and experimental design. When sampling we "select at random", and in experiments we "randomly assign" individuals to different treatments. Randomization does much more than remove bias in selections and assignments. Randomization leads to chance variability in outcomes that can be described with probability models.
The probability of an outcome tells us approximately what percentage of the time it is expected to occur when the basic process is repeated over and over again.
Probability theory does not say very much about one toss of the coin; it makes predictions about the long-run behavior of the coin tosses.
Probability tells us little about the consequences of random selection for one sample but describes the variation we expect to see in samples when the sampling process is repeated a large number of times.
Probability tells us little about the consequences of random assignment for one experiment but describes the variation we expect to see in the results when the experiment is replicated a large number of times.
When randomness is present, the statistician wants to know if the observed result is due to chance, or something else. This is the idea of statistical significance.
The Role of Mathematics in Statistics Education
The evidence that statistics is different from mathematics is not presented to argue that mathematics is not important to statistics education or that statistics education should not be a part of mathematics education. To the contrary, statistics education becomes increasingly mathematical as the level of understanding goes up.
But data collection design, exploration of data, and the interpretation of results should be emphasized in statistics education for statistical literacy. These are heavily dependent on context, but at the introductory level involve limited formal mathematics.
Probability plays an important role in statistical analysis, but formal mathematical probability should have its own place in the curriculum. Pre-college statistics education should emphasize the ways that probability is used in statistical thinking; an intuitive grasp of probability will suffice at these levels.
The Framework
Underlying Principles
Statistical Problem Solving
Statistical problem solving is an investigative process that involves four components:
Formulate Questions
• clarify the problem at hand
• formulate one (or more) questions that can be answered with data
Collect Data
• design a plan to collect appropriate data
• employ the plan to collect the data
Analyze Data
• select appropriate graphical or numerical methods
• use these methods to analyze the data
Interpret Results
• interpret the analysis
• relate the interpretation to the original question
The Role of Variability in the Problem Solving Process
Formulate Question
Anticipating Variability - Making the statistics question distinction
The formulation of a statistics question requires an understanding of the difference between a question that anticipates a deterministic answer and a question that anticipates an answer based on data that vary.
The question "How tall am I?" will be answered with a single height. It is not a statistics question. The question "How tall are adult men in the
The poser of the question "How does sunlight affect the growth of a plant?" should anticipate that the growth of two plants of the same type exposed to the same sunlight will likely differ. This is a statistics question.
The anticipation of variability is the basis for understanding the statistics question distinction; both are required for proper question formulation.
Collect Data
Acknowledging Variability - Designing for differences
Data collection designs must acknowledge variability in data and frequently are intended to reduce variability. Random sampling is intended to reduce the differences between sample and population, and the sample size influences the effect of sampling variability (error). Experimental designs are chosen to acknowledge the differences between groups subjected to different treatments. Random assignment to the groups is intended to reduce differences between the groups due to factors that are not manipulated in the experiment. Some experimental designs pair subjects so that they are similar. Twins are frequently paired in medical experiments so that observed differences might be more likely attributed to the difference in treatments rather than differences in the subjects.
The understanding of data collection designs that acknowledge differences is required for effective collection of data.
Analyze Data
Accounting of Variability - Using Distributions
The main purpose of statistical analysis is to give an accounting of the variability in the data. When results of an election poll state that "42% of those polled support a particular candidate with margin of error +/- 3% at the 95% confidence level", the focus is on sampling variability. The poll gives an estimate of the support among all voters. The margin of error indicates how far the sample result (42% +/- 3%) might differ from the actual percentage of all voters who support the candidate. The confidence level tells us how often estimates produced by the method employed will produce correct results. This analysis is based on the distribution of estimates from repeated random sampling.
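The link between sample size and margin of error can be made concrete with the usual large-sample formula, margin = z * sqrt(p(1 - p) / n). The sketch below assumes a simple random sample and the normal approximation with z = 1.96 for 95% confidence. The poll's actual sample size is not given in the text, so the code only works out what size would be consistent with the quoted figures under those assumptions.

```python
import math

Z_95 = 1.96  # approximate standard normal multiplier for 95% confidence

def margin_of_error(p_hat, n, z=Z_95):
    """Large-sample margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def sample_size_for_margin(p_hat, margin, z=Z_95):
    """Smallest sample size whose margin of error does not exceed `margin`."""
    return math.ceil(z**2 * p_hat * (1 - p_hat) / margin**2)

n_needed = sample_size_for_margin(p_hat=0.42, margin=0.03)
print(f"sample size consistent with 42% +/- 3%: {n_needed}")
print(f"margin of error at that size: {margin_of_error(0.42, n_needed):.4f}")
print(f"margin of error with 250 respondents: {margin_of_error(0.42, 250):.4f}")
```

Under these assumptions a poll of roughly a thousand respondents yields a margin of about 3 percentage points, while a poll one quarter that size has about twice the margin.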
When test scores are described as "normally distributed with mean 450 and standard deviation 100" the focus is on how the scores differ from the mean. The normal distribution describes a bell-shaped pattern of scores and the standard deviation indicates the level of variation of the scores from the mean.
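A short sketch using Python's standard library shows how the normal model with mean 450 and standard deviation 100 accounts for variability in the scores. The cutoff score of 600 is a hypothetical value chosen only for illustration.

```python
from statistics import NormalDist

# Normal model for the test scores described in the text.
scores = NormalDist(mu=450, sigma=100)

# Proportion of scores within one and two standard deviations of the mean.
within_one_sd = scores.cdf(550) - scores.cdf(350)
within_two_sd = scores.cdf(650) - scores.cdf(250)

# Proportion of scores above a hypothetical cutoff of 600.
above_600 = 1 - scores.cdf(600)

print(f"within 350 to 550: {within_one_sd:.3f}")
print(f"within 250 to 650: {within_two_sd:.3f}")
print(f"above 600:         {above_600:.3f}")
```

Under this model about 68% of scores fall within 100 points of the mean and about 95% fall within 200 points, which is what the standard deviation conveys about the spread of the distribution.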
Accounting for variability with the use of distributions is the key idea in the analysis of data.
Interpret Results
Allowing for Variability - Looking beyond the data
Statistical interpretations are made in the presence of variability and must allow for it.
The result of an election poll must be interpreted as an estimate that can vary from sample to sample. The generalization of the poll results to the entire population of voters looks beyond the sample of voters surveyed and must allow for the possibility of variability of results among different samples. The results of a randomized comparative medical experiment must be interpreted in the presence of variability due to the fact that different individuals respond differently to the same treatment as well as the variability due to randomization. The generalization of the results looks beyond the data collected from the subjects who participated in the experiment and must allow for these sources of variability.
Looking beyond the data to make generalizations must allow for variability in the data.
Maturing over Levels
The mature statistician understands the role of variability in the statistical problem solving process. At the point of question formulation, the statistician anticipates the data collection, the nature of the analysis, and the possible interpretations, all of which must consider possible sources of variability. In the end, the mature practitioner reflects upon all aspects of data collection and analysis as well as the question itself when interpreting results. Likewise he links data collection and analysis to each other and the other two components.
The beginning student cannot be expected to make all of these linkages. They require years of experience as well as training. Statistical education should be viewed as a developmental process. To meet the proposed goals, this report will provide a framework for statistical education over three levels. If the goal were to produce a mature practicing statistician, there would certainly be several levels beyond these. There is no attempt to tie these levels to specific grade levels.
The Framework uses three developmental Levels, A, B, and C. Although these three levels may parallel grade levels, they are based on development, not age. Thus, a middle school student who has had no prior experience with statistics will need to begin with Level A concepts and activities before moving to Level B. This holds true for a secondary student as well. If a student hasn't had Level A and B experiences prior to high school, then it is not appropriate to jump into Level C expectations. The learning is more teacher-driven at Level A, but becomes student-driven at Levels B and C.
The Framework Model
The conceptual structure for statistics education is provided in the two-dimensional model shown in Figure 1. One dimension is defined by the problem-solving process components plus the nature of the variability considered and how we focus on variability. The second dimension is comprised of the three developmental levels.
Each of the first four rows describes a process component as it develops across levels. The fifth row indicates the nature of the variability considered at a given level. It is understood that work at Level B assumes and develops further the concepts from Level A, and likewise Level C assumes and uses concepts from the lower levels.
Figure 1: The Framework
| Process Component | Level A | Level B | Level C |
|---|---|---|---|
| Formulate Question | Beginning awareness of the statistics question distinction; Teachers pose questions of interest; Questions restricted to classroom | Increased awareness of the statistics question distinction; Students begin to pose their own questions of interest; Questions not restricted to classroom | Students can make the statistics question distinction; Students pose their own questions of interest; Questions seek generalization |
| Collect Data | Do not yet design for differences; Census of classroom; Simple experiment | Beginning awareness of design for differences; Sample surveys; Begin to use random selection; Comparative experiment; Begin to use random allocation | Students make designs for differences; Sampling designs with random selection; Experimental designs with randomization |
| Analyze Data | Use particular properties of distributions in context of specific example; Display variability within a group; Compare individual to individual; Compare individual to group | Learn to use particular properties of distributions as tools of analysis; Quantify variability within a group; Compare group to group in displays; Acknowledge sampling error; Some quantification of association; Simple models for association | Understand and use distributions in analysis as a global concept; Measure variability within a group; Measure variability between groups; Compare group to group using displays and measures of variability; Describe and quantify sampling error; Quantification of association; Fitting of models for association |
| Interpret Results | Do not look beyond the data; No generalization beyond the classroom; Note difference between two individuals with different conditions; Observe association in displays | Acknowledge that looking beyond the data is feasible; Acknowledge that a sample may or may not be representative of larger population; Note difference between two groups with different conditions; Aware of distinction between observational study and experiment; Note differences in strength of association; Basic interpretation of models for association; Aware of the distinction between "association" and "cause and effect" | Are able to look beyond the data in some contexts; Generalize from sample to population; Aware of the effect of randomization on the results of experiments; Understand the difference between observational studies and experiments; Interpret measures of strength of association; Interpret models for association; Distinguish between conclusions from association studies and experiments |
| Nature of Variability / Focus on Variability | Measurement variability; Natural variability; Induced variability; Variability within a group | Sampling variability; Variability within a group and variability between groups; Co-variability | Chance variability; Variability in model fitting |
Illustrations
All four steps of the problem solving process are used at all three levels, but the depth of understanding and the sophistication of the methods used increase across Levels A, B, and C. This maturation in understanding the problem solving process and its underlying concepts is paralleled by an increasing complexity in the role of variability. The illustrations of learning activities given here are intended to clarify the differences across the developmental levels for each component of the problem solving process. A later section in this report will give illustrations of the complete problem solving process for learning activities at each level.
Formulate Question
Example 1
A: How long are the words on this page?
B: Are the words in a chapter of a fifth grade book longer than the words in a chapter of a third grade book?
C: Do fifth grade books use longer words than third grade books?
Example 2
A: What type of music is most popular among students in our class?
B: How do the favorite types of music compare among different classes?
C: What type of music is most popular among students in our school?
Example 3
A: In our class, are the heights and arm spans of students approximately the same?
B: Is the relationship between arm span and height for the students in our class the same as the relationship between arm span and height for the students in another class?
C: Is height a useful predictor of arm span for the students in our school?
Example 4
A: Will a plant placed by the window grow taller than a plant placed away from the window?
B: Will five plants placed by the window grow taller than five plants placed away from the window?
C: How does the level of sunlight affect the growth of a plant?
Collect Data
Example 1
A: How long are the words on this page?
The length of every word on the page is determined and recorded.
B: Are the words in a chapter of a fifth grade book longer than the words in a chapter of a third grade book?
A simple random sample of words from each chapter is used.
C: Do fifth grade books use longer words than third grade books?
Other sampling designs are considered and compared, and some are used. For example, rather than selecting words in a simple random sample, a simple random sample of pages from the book is selected and all of the words on the chosen pages are used for the sample.
Note- At each level, issues of measurement should be addressed. The length of a word depends on the definition of "word". For instance, is a number a word? Consistency of definition is important to reduce measurement variability.
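The two sampling designs described for Levels B and C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy "book", the function names, and the sample sizes are hypothetical and are not part of the framework.

```python
import random

def simple_random_sample_of_words(pages, n, seed=None):
    """Draw n words at random, without replacement, from all words in the book."""
    rng = random.Random(seed)
    all_words = [word for page in pages for word in page]
    return rng.sample(all_words, n)

def sample_of_pages(pages, n_pages, seed=None):
    """Draw n_pages pages at random and keep every word on the chosen pages."""
    rng = random.Random(seed)
    chosen_pages = rng.sample(pages, n_pages)
    return [word for page in chosen_pages for word in page]

# Hypothetical toy "book": each page is a list of words.
book = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["a", "curious", "elephant", "wandered", "slowly"],
    ["rain", "fell", "all", "day"],
    ["statistics", "helps", "us", "decide"],
]

srs = simple_random_sample_of_words(book, 5, seed=1)
cluster = sample_of_pages(book, 2, seed=1)

# The variable of interest is word length (number of letters).
srs_lengths = [len(word) for word in srs]
cluster_lengths = [len(word) for word in cluster]
```

With the page-based design the number of words in the sample is not fixed in advance; it depends on which pages happen to be chosen.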
Example 2
A: Will a plant placed by the window grow taller than a plant placed away from the window?
A seedling is planted in a pot that is placed on the window sill. A second seedling of the same type and size is planted in a pot that is placed away from the window sill. After six weeks the change in height for each is measured and recorded.
B: Will five plants of a particular type placed by the window grow taller than five plants of the same type placed away from the window?
Five seedlings of the same type and size are planted in a pan which is placed on the window sill. Five seedlings of the same type and size are planted in a pan which is placed away from the window sill. Random numbers are used to decide which plants go in the window. After six weeks the change in height for each seedling is measured and recorded.
C: How does the level of sunlight affect the growth of plants?
Fifteen seedlings of the same type and size are selected. Three pans are used, with five of these seedlings planted in each. Fifteen seedlings of another variety are selected to determine if the effect of sunlight is the same on different types of plants. Five of these are planted in each of the three pans. The three pans are placed in locations with three different levels of light. Random numbers are used to decide which plants go in which pan. After six weeks the change in height for each seedling is measured and recorded.
Note- At each level, issues of measurement should be addressed. The method of measuring change in height must be clearly understood and applied in order to reduce measurement variability.
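The use of random numbers to decide which seedlings go where can be sketched as follows. This is a minimal illustration, assuming the seedlings are labeled 1 through 10; the function name, the group labels, and the seed value are hypothetical choices.

```python
import random

def assign_to_groups(units, group_names, seed=None):
    """Shuffle the units and deal them into equal-sized groups."""
    if len(units) % len(group_names) != 0:
        raise ValueError("units must divide evenly among the groups")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(group_names)
    return {
        name: sorted(shuffled[i * size:(i + 1) * size])
        for i, name in enumerate(group_names)
    }

# Level B: ten seedlings, five for each location.
seedlings = list(range(1, 11))
assignment = assign_to_groups(seedlings, ["window", "away from window"], seed=2005)
```

For the Level C design, the same function could be called once for each variety, with fifteen seedlings and three pan labels, so that each pan receives five seedlings of each type.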
Analyze Data
Example 1
A: What type of music is most popular among students in our class?
A bar graph is used to display the number of students who choose each music category.
B: How do the favorite types of music compare among different classes?
For each class, a bar graph is used to display the percentage of students who choose each music category. The same scales are used for both graphs so that they can easily be compared.
C: What type of music is most popular among students in our school?
A bar graph is used to display the percentage of students who choose each music category. Because a random sample is used, an estimate of the margin of error is given.
Note- At each level, issues of measurement should be addressed. A questionnaire will be used to gather students' music preferences. The design and wording of the questionnaire must be carefully considered to avoid possible biases in the responses. The choice of music categories could also affect results.
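The margin of error mentioned at Level C can be illustrated with a short calculation. The sketch below assumes a simple random sample and uses the common large-sample approximation of about two standard errors (z = 1.96 for roughly 95% confidence); the framework does not prescribe this particular formula here, and the survey counts are hypothetical.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample of size n; z = 1.96 corresponds
    to roughly 95% confidence under a normal approximation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= p_hat <= 1:
        raise ValueError("p_hat must be between 0 and 1")
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical survey: 40 of 100 sampled students choose one music category.
p_hat = 40 / 100
moe = margin_of_error(p_hat, 100)
```

Because the sample size appears under a square root, quadrupling the sample size only halves the margin of error.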
Example 2
A: In our class, are the heights and arm spans of students approximately the same?
The difference between height and arm span is determined for each individual.
An X-Y plot is constructed with X=height, Y=arm span. The line Y=X is drawn on this graph.
B: Is the relationship between arm span and height for the students in our class the same as the relationship between arm span and height for the students in another class?
For each class, an X-Y plot is constructed with X=height, Y=arm span. An "eye ball" line is drawn on each graph to describe the relationship between height and arm span. The equation of this line is determined. An elementary measure of association is determined.
C: Is height a useful predictor of arm span for the students in our school?
The least squares regression line is determined and assessed for use as a prediction model.
Note- At each level, issues of measurement should be addressed. The methods used to measure height and arm span must be clearly understood and applied in order to reduce measurement variability. For instance, do we measure height with shoes on or off?
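The progression from Level A to Level C in this example can be sketched numerically. The code below is an illustration only: the heights and arm spans are made-up values in centimeters, and the function name is hypothetical. It computes the Level A differences and the Level C least squares line, along with the residuals used to assess the line as a prediction model.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line for y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements (cm) for five students.
heights = [150, 155, 160, 165, 170]
arm_spans = [148, 156, 159, 167, 170]

# Level A: difference between arm span and height for each individual.
differences = [a - h for h, a in zip(heights, arm_spans)]

# Level C: least squares line, then residuals (observed minus predicted).
slope, intercept = least_squares_line(heights, arm_spans)
residuals = [a - (intercept + slope * h) for h, a in zip(heights, arm_spans)]
```

Small residuals with no obvious pattern would suggest that height is a useful predictor of arm span for these data; the slope estimates how much arm span is expected to change for each additional centimeter of height.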
Interpret Results
Example 1
A: How long are the words on this page?
The frequency plot of all word lengths is examined and summarized. In particular, students will note the longest and shortest word lengths, the most common lengths and least common lengths, and the length in the middle.
B: Are the words in a chapter of a fifth grade book longer than the words in a chapter of a third grade book?
The students interpret a comparison of the distribution of a sample of word lengths from the fifth grade book with the distribution of word lengths from the third grade book using a boxplot to represent each of these. The students also acknowledge that samples are being used which may or may not be representative of the complete chapters.
The boxplot for a sample of word lengths from the fifth grade book is placed beside the boxplot of the sample from the third grade book.
C: Do fifth grade books use longer words than third grade books?
The interpretation at Level C includes the interpretation at Level B, but also must consider generalizing from the books included in the study to a greater population of books.
Example 2
A: Will a plant placed by the window grow taller than a plant placed away from the window?
In this simple experiment, the interpretation is just a matter of comparing one measurement of change in size to another.
B: Will five plants placed by the window grow taller than five plants placed away from the window?
In this experiment, the student must interpret a comparison of one group of five measurements with another group.
If a difference is noted, then the student acknowledges that it is likely caused by the differences in light conditions.
C: How does the level of sunlight affect the growth of a plant?
There are several comparisons of groups possible with this design. If a difference is noted, then the student acknowledges that it is likely caused by the differences in light conditions or the differences in types of plants. It is also acknowledged that the randomization used in the experiment can possibly cause some of the observed differences.
Nature of Variability
Variability Within a Group
This is the only type considered at Level A. In Example 1, differences among word lengths on a single page are considered; this is variability within a group of word lengths. In Example 2, differences among how many students choose each category of music are considered; this is variability within a group of frequencies.
Variability Within a Group and Variability Between Groups
At Level B, students begin to make comparisons of groups of measurements. In Example 1, a group of word lengths from a fifth grade book is compared to a group from a third grade book. Such a comparison not only notes differences between the two groups, such as the difference between median or mean word lengths, but must also take into consideration how much word lengths differ within each group.
Induced Variability
In Example 4, Level B, the experiment is designed to determine if there will be a difference between the growth of plants in sunlight and the growth of those away from sunlight. We want to determine if an imposed difference on the environments will induce a difference in growth.
Sampling Variability
In Example 1, Level B, samples of words from a chapter are used. Students observe that two different samples will produce different groups of word lengths. This is sampling variability.
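Sampling variability can be observed directly by drawing several samples from the same collection of words and comparing their mean word lengths. The following sketch assumes a hypothetical "chapter" of seventeen words; the function name, sample size, and seed are illustrative choices only.

```python
import random
import statistics

def sample_mean_lengths(words, sample_size, n_samples, seed=None):
    """Mean word length for each of n_samples simple random samples."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = rng.sample(words, sample_size)
        means.append(statistics.mean([len(word) for word in sample]))
    return means

# Hypothetical chapter text.
chapter_words = (
    "the quick brown fox jumps over the lazy dog while a patient "
    "statistician records every observation carefully"
).split()

means = sample_mean_lengths(chapter_words, sample_size=8, n_samples=5, seed=7)
```

The five means will generally differ from one another even though every sample comes from the same chapter, which is the variation students are asked to notice.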
Co-variability
Example 3, Level B or C, investigates the "statistical" relationship between height and arm span. The nature of this statistical relationship is described in terms of how the two variables "co-vary". For instance, if the heights of two students differ by 2 centimeters, then we would like our model of the relationship to tell us by how much we might expect their arm spans to differ.
Random Variability from Sampling
When random selection is used, differences between samples will be random. Understanding this random variation is what leads to the predictability of results. In Example 2, Level C, this random variation is not only considered but is also the basis for understanding the concept of margin of error.
Random Variability Resulting from Assignment to Groups in Experiments