ERIC Identifier: ED355252
Publication Date: 1992-09-00
Author: Rudner, Lawrence M. - Shafer, Mary Morello
Source: ERIC Clearinghouse on Tests, Measurement, and Evaluation, Washington, DC.
Resampling: A Marriage of Computers and Statistics. ERIC/TM Digest.
Suppose your superintendent asked you to determine whether voucher students
are doing better than non-voucher students in your district's elementary
schools. You might perform a simple t test or an analysis of variance to find
your answer. You would report mean differences and probability levels. And if
your superintendent is like most non-statisticians, he or she would accept the
magic of statistics without questioning the validity of the assumptions made to
use the t test.
Thanks to advances in computer technology, educational researchers are
beginning to use simpler statistical methods. These techniques let us
empirically address a wider range of questions with smaller data sets and with
fewer, less restrictive assumptions. Using such techniques, we can focus on
reasoning and on understanding the data, not on complicated formulas and tables.
The techniques promise to make statistics a useful, easily learned tool for
educational policy makers and researchers.
This digest introduces computationally intensive statistics, collectively
called resampling techniques. After defining these statistics, we'll use one
technique to answer our opening question. We'll then present the arguments for
and against resampling.
RESAMPLING DEFINED
Resampling is simply a process for
estimating probabilities by conducting vast numbers of numerical experiments.
Today, resampling is done with the aid of high speed computers.
In Science News, Peterson (1991) compares resampling techniques to the
trial-and-error way gamblers once used to figure odds in card or dice games.
Before the invention of probability theory, gamblers would deal out many hands
of a card game to count the number of times a particular hand occurred. Thus, by
experimentation, gamblers could figure the odds of getting a certain hand in
their game.
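To make the gamblers' trial-and-error approach concrete, here is a minimal Python sketch (the choice of game, the pair-counting rule, and the 100,000 deals are our own illustrative assumptions, not part of the digest). It estimates the odds of being dealt at least a pair in a five-card hand simply by dealing many random hands and counting:

# Estimate the chance of at least a pair in a five-card hand by
# dealing many random hands (a trial-and-error, resampling-style estimate).
import random

RANKS = list(range(13)) * 4      # a 52-card deck, represented by rank only
NUM_DEALS = 100_000              # an arbitrary, illustrative number of deals

def has_pair(hand):
    # True if any rank occurs at least twice in the hand.
    return any(hand.count(rank) >= 2 for rank in set(hand))

hits = 0
for _ in range(NUM_DEALS):
    hand = random.sample(RANKS, 5)     # deal five cards without replacement
    if has_pair(hand):
        hits += 1

print("Estimated probability of at least a pair:", hits / NUM_DEALS)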
Probability theory freed researchers from the drudgery of repeated
experiments. With a few assumptions, researchers could address a wide range of
topics. While the advances in statistics paved the way for elegant analysis, they
came at a cost:
o We could analyze only certain types of statistics, such as the mean and
standard deviation.
o We had to make certain assumptions, like the normality assumption, about
the underlying distribution.
o And researchers needed specialized training to apply, understand, and
appreciate statistics.
But resampling techniques overcome all these limitations today:
o We can analyze virtually any statistic.
o We don't have to make any assumptions about the distribution of the data.
o And the techniques are easy to understand.
All resampling techniques rely on the computer to generate data sets from the
original data. The techniques differ, however, in how they generate the data
sets. Four techniques are important:
o the bootstrap, invented by Bradley Efron;
o the jackknife, invented by Maurice Quenouille and later developed by John W.
Tukey;
o cross-validation, developed by Seymour Geisser, Mervyn Stone, and Grace G.
Wahba; and
o balanced repeated replication, developed by Philip J. McCarthy.
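To show concretely how two of these techniques generate new data sets, here is a minimal Python sketch using a small, made-up list of scores (the numbers are purely illustrative): the bootstrap draws samples of the original size with replacement, while the jackknife forms samples by leaving out one observation at a time.

import random

scores = [72, 85, 90, 66, 78, 95, 81, 70]     # hypothetical original sample

# Bootstrap: a sample of the same size, drawn WITH replacement.
bootstrap_sample = [random.choice(scores) for _ in scores]

# Jackknife: one sample per observation, each leaving that observation out.
jackknife_samples = [scores[:i] + scores[i + 1:] for i in range(len(scores))]

print(bootstrap_sample)
print(jackknife_samples[0])    # the sample with the first score left out

Cross-validation and balanced repeated replication divide or replicate the data in other systematic ways, but the idea is the same: the computer builds many data sets from the one at hand.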
A RESAMPLING EXAMPLE FOR EDUCATION
Back to the question
comparing the grades of voucher and non-voucher students: Using the bootstrap
technique, we can empirically construct the distribution of mean grade
differences for students in these two groups. If the observed difference is
unusual, then we would reject the null hypothesis that grades are unrelated to
voucher status.
For simplicity, let's assume that the district has 13 voucher students and 39
non-voucher students, and the mean difference is 10 standard score units. To
empirically construct the distribution, we'd follow these steps:
1. Create a data base with all the student grades.
2. Randomly sort the data base.
3. Compute the mean for the first 13 students.
4. Compute the mean for the other 39 students.
5. Record the test statistic--the absolute value of the mean difference.
6. Then repeat steps 2 through 5 many times.
That way, we'd get the distribution of mean differences that arises when students
are assigned to the two groups at random. The probability of observing a mean
difference of 10 when voucher status is unrelated to grades is the proportion of
test statistics in step 5 that are greater than or equal to 10.
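The following minimal Python sketch carries out steps 1 through 6. The digest supplies no actual grades, so the voucher and non-voucher score lists below are made-up placeholders of the stated sizes (13 and 39 students); the procedure, not the particular numbers, is the point.

import random

random.seed(1)
voucher = [random.gauss(110, 15) for _ in range(13)]        # hypothetical scores
non_voucher = [random.gauss(100, 15) for _ in range(39)]    # hypothetical scores

def mean(xs):
    return sum(xs) / len(xs)

observed = abs(mean(voucher) - mean(non_voucher))    # the observed difference

all_grades = voucher + non_voucher       # step 1: one data base of all grades

trials = 10_000
as_extreme = 0
for _ in range(trials):
    random.shuffle(all_grades)                        # step 2: randomly sort
    group_a = all_grades[:13]                         # step 3: first 13 students
    group_b = all_grades[13:]                         # step 4: other 39 students
    stat = abs(mean(group_a) - mean(group_b))         # step 5: the test statistic
    if stat >= observed:
        as_extreme += 1

# The proportion of random shuffles producing a difference at least as
# large as the one observed estimates the probability in question.
print("Estimated probability:", as_extreme / trials)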
Noreen (1989) noted several striking aspects of this approach:
o Researchers make no assumptions about the distribution of grades (for
example, no normality assumption).
o The data need not be a random sample from some population.
ARGUMENTS FOR RESAMPLING
Diaconis and Efron (1983) argue
that the resampling method frees researchers from two limitations of
conventional statistics: "the assumption that the data conform to a bell-shaped
curve and the need to focus on statistical measures whose theoretical properties
can be analyzed mathematically." Instead, Peterson says, this method "addresses
a key problem in statistics: how to infer the 'truth' from a sample of data that
may be incomplete or drawn from an ill-defined population."
The resampling method forces researchers to clarify the problem: With no
formulas to fall back on, you have to explicitly define the question you want to
answer. According to Simon and Bruce (1991), the method prevents researchers
from "simply grabbing the formula for some test without understanding why they
chose that test." As Peterson explains, instead of asking which formula to use,
you "begin tackling such questions as what makes certain results statistically
'significant.'"
In Scientific American, Diaconis and Efron apply the bootstrap method to
various types of problems and then compare the results from the bootstrap with
the results from conventional methods applied to statistics such as the correlation
coefficient and principal components. Most of the time, the bootstrap method
yields the same answers that the more conventional methods do. Of course, the
bootstrap may not give a true picture of every sample, just as conventional
tests sometimes find deceptive answers to problems.
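As an illustration of that kind of comparison, here is a minimal Python sketch of the bootstrap applied to a correlation coefficient. The paired observations are made up for the example; the pairs are resampled with replacement many times, and the spread of the recomputed correlations gives a rough interval for the statistic.

import random
import math

pairs = [(1.0, 2.1), (2.0, 2.9), (3.0, 3.6), (4.0, 5.2),
         (5.0, 4.8), (6.0, 6.5), (7.0, 7.1), (8.0, 8.4)]   # hypothetical data

def pearson_r(data):
    # Ordinary Pearson correlation, computed from scratch.
    xs, ys = zip(*data)
    n = len(data)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in data)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

boot_rs = []
for _ in range(2000):
    resample = [random.choice(pairs) for _ in pairs]   # resample pairs with replacement
    if len(set(x for x, _ in resample)) > 1:           # skip degenerate resamples
        boot_rs.append(pearson_r(resample))

boot_rs.sort()
low = boot_rs[int(0.025 * len(boot_rs))]
high = boot_rs[int(0.975 * len(boot_rs))]
print("Observed r:", round(pearson_r(pairs), 3))
print("Rough 95% bootstrap interval:", round(low, 3), "to", round(high, 3))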
Because resampling techniques like the bootstrap are so easy to use and
understand, Simon and Bruce advocate teaching these techniques to students
first--that way, students learn how to translate their "scientific" question
into a "statistical" question. By learning how to think clearly about their
problem, students won't "select their methods blindly."
They cite a study in which one group of students learned resampling techniques
and another learned conventional methods. The students taught resampling did much
better at solving statistical problems than those taught conventional methods.
Further, the students who learned resampling enjoyed statistics, and their
attitudes toward math improved during the course, while the attitudes of the
students taught conventional techniques got worse.
ARGUMENTS AGAINST RESAMPLING
Critics question the
resampling method itself. They argue, as Stephen E. Fienberg says, that "you're
trying to get something for nothing. You use the same numbers over and over
again until you get an answer that you can't get any other way. In order to do
that, you have to assume something, and you may live to regret that hidden
assumption later on" (Peterson, 1991, p. 57).
Other critics question the accuracy of the estimates that resampling
yields--if, for example, the researcher doesn't make enough experimental trials.
In some situations, resampling may be less accurate than conventional parametric
methods.
ADDITIONAL READING
The classic introduction to this field:
Diaconis, P., and B. Efron. (1983). Computer-intensive methods in statistics.
Scientific American, May, 116-130.
Numerous examples, as well as Basic, Pascal, and Fortran source code for
conducting several resampling experiments:
Noreen, E. (1989). Computer-intensive methods for testing hypotheses. New
York: Wiley.
Discusses using resampling methods to teach statistical concepts:
Peterson, I. (July 27, 1991). Pick a sample. Science News, 140, 56-58.
Low-cost IBM PC software for learning and applying resampling:
Simon, J. L. (1990). Resampling stats: Probability and statistics, a
radically different way. University of Maryland, College of Business and
Management.
Arguments for using and teaching resampling:
Simon, J. L., and P. Bruce. (1991). Resampling: A tool for everyday
statistical work. Chance, 4(1), 22-32.