Student Ratings Offer Useful Input to Teacher
Evaluations. ERIC Digest.
by Scriven, Michael
Student ratings add a valuable component to the range of input for the
evaluation of teachers. Although many question the validity of such ratings,
under certain conditions, results can and should be useful.
Student ratings of instruction are widely used as a basis for personnel
decisions and faculty development recommendations in post-secondary education
today. This article addresses concerns about their validity and presents
a case for the use of student ratings in teacher evaluation. In this discussion,
"student ratings" are ratings collected anonymously by asking students to complete
a form or write a short free-form evaluation, either during
or immediately after a class period, the final exam, or a session after
grades are issued.
Often, student rating forms ask many questions about matters that
students do not appear to be in a position to judge reliably. In addition,
the fact that the overall rating of teaching merit by students is only
statistically related to learning gains is a concern if one believes that
statistical indicators should not be used to make personnel decisions.
Another concern is that the validation studies that are used to justify
student ratings use questionable indicators instead of the true criterion.
For example, some of them correlate the student ratings with peer ratings
of teacher merit instead of with superior learning gains.
ARGUMENTS FOR USING STUDENT RATINGS
There are several strong arguments for using student ratings to evaluate
teachers. (See figure titled "Nine Potential Sources of Validity for Student
Ratings of Instruction.") Students are in a unique position to rate their
own increased knowledge and comprehension as well as changed motivation
toward the subject taught. As students, they are also in a good position
to judge such matters as whether tests covered all the material of the
course.
In addition, students can observe and rate facts (e.g., an instructor's
punctuality, the legibility of writing on the board) that are relevant
to competent teaching. They can also identify and rate teaching style
indicators: whether the teacher is enthusiastic, asks many questions,
encourages questions from students, and so on.
However, the possible lines of argument (see figure) for the validity
of student ratings fail if the rating form used is not appropriate
for the specific data collection required. Because rating forms vary widely,
generalizations about student ratings as a good indicator of learning gains
or teacher merit are misleading: they assume that all such ratings share
a common property. Most forms, when used in the most common ways, are
invalid as a basis for personnel action. For example, many forms used to
make personnel decisions ask questions that may influence the respondent
by mentioning extraneous and potentially prejudicial material (e.g., questions
about the teacher's personality or the appeal of the subject matter).
Another problem with the use of rating forms for summative evaluation
is that many of them ask the wrong global or overall questions. This is
important since it is typically these questions on which most personnel
decisions are based. Common examples of this kind of mistake include forms
that ask for
- comparisons with other teachers,
- whether the respondent would recommend the course to a friend with
similar interests, or
- whether "it's one of the best courses" one has had.
Several pragmatic considerations (logistical, political, economic, psychological),
which impact form design, are *required* for validity. These include:
- Form length--if forms are too long, students may not fill them in or
may skip responses.
- Type of question--forms should include the questions students want
answered about the courses they are considering taking, thus avoiding resentment
and a lack of willingness to complete the forms; *forms should not include*
questions that students suspect will be used to discriminate against them
or that are biased towards favorable (or unfavorable) comments.
The validity of student rating forms is also dependent on the context
of how and when they are administered. For student rating results to be
valid, they must be obtained from properly administered tests, stringently
controlled data collection, and thorough analysis of test results. Frequent
errors include:
- The use of instructors to collect forms rating their own instruction.
- Lack of controls over pleas for sympathy or indulgence by the teacher
before forms are distributed.
- Inadequate time to complete forms.
- Failing to ensure an acceptable return rate.
To ensure the validity of results, errors in data processing, report
design, and interpretation must also be avoided. Common errors include:
- The use of averages alone, without regard to the distribution;
- Failure to set up appropriate comparison groups so that the usual
tendency for ratings to be higher in graduate professional schools can
be taken into account;
- Treating small differences as significant, just because they are statistically
significant;
- Using factor analysis without logical/theoretical validation;
- Ignoring ceiling/floor effects;
- Using the ratings as the sole basis for either formative or summative
evaluation.
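Several of the data-processing errors listed above can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the original paper: the function names, the 1-5 rating scale, and the choice of Cohen's d as a measure of practical (rather than statistical) significance are all assumptions made for this example. It reports the full distribution alongside the mean, shows how many responses sit at the ceiling and floor of the scale, and expresses a difference between two groups as a standardized effect size.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev


def summarize_ratings(ratings, scale_min=1, scale_max=5):
    """Summarize ratings with their distribution, not the average alone.

    The 1-5 scale is an assumed default; pass the real scale limits.
    """
    counts = Counter(ratings)
    n = len(ratings)
    return {
        "n": n,
        "mean": mean(ratings),
        "median": median(ratings),
        # Count of responses at every point on the scale, including zeros.
        "distribution": {s: counts.get(s, 0)
                         for s in range(scale_min, scale_max + 1)},
        # Large shares at either end suggest a ceiling or floor effect.
        "share_at_ceiling": counts.get(scale_max, 0) / n,
        "share_at_floor": counts.get(scale_min, 0) / n,
    }


def cohens_d(group_a, group_b):
    """Standardized mean difference (Cohen's d) using the pooled SD.

    Requires at least two ratings per group and some variation in the
    data; a pooled SD of zero would cause a division by zero.
    """
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical data: two sections with the same mean rating
# but very different distributions.
section_a = [3, 3, 3, 3, 3, 3]
section_b = [1, 1, 1, 5, 5, 5]
summary_a = summarize_ratings(section_a)
summary_b = summarize_ratings(section_b)
```

In this made-up example both sections average 3, yet the second is split evenly between the lowest and highest ratings, which an average alone would hide. A significance test answers a different question from an effect size; with large classes even a trivial difference can pass a significance test, so reporting something like `cohens_d` next to it is one way to avoid over-reading small differences.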
Although student ratings are an important source of data for the evaluation
of teaching merit, they should not be the only source. Similarly, student
ratings form an essential part of the data for the evaluation of courses,
workshops, degree programs, etc., but they cannot carry the entire burden.
It is essential to look at the data relating to other dimensions of merit
such as needs, demand, opportunities for symbiosis, content, and costs,
and estimate their relative importance.
Student ratings must be considered very carefully in the context in
which they are given. The educational administrator interested in the improvement
of instruction--whether by improving courses themselves, or the performance
or the composition of the faculty--and instructors and students with the
same interest will benefit from the use of a sound system of student ratings.
NINE POTENTIAL SOURCES OF VALIDITY FOR STUDENT
RATINGS OF INSTRUCTION
1. The positive and statistically significant correlation of student
ratings with learning gains.
2. The unique position and qualifications of the students in rating
their own increased knowledge and comprehension.
3. The unique position of the students in rating changed motivation
(a) toward the subject taught; perhaps also (b) toward a career associated
with that subject; and perhaps also (c) with respect to a changed general
attitude toward further learning in the subject area, or more generally.
4. The unique position of the students in rating observable matters
of fact relevant to competent teaching, such as the punctuality of the
instructor and the legibility of writing on the board.
5. The unique position of the students in identifying the regular presence
of teaching style indicators. Is the teacher enthusiastic; does he or she
ask many questions, encourage questions from students, etc.?
6. Relatedly, students are in a good position to judge--although it
is not quite a matter of simple observation--such matters as whether tests
covered all the material of the course.
7. Students as consumers are likely to be able to report quite reliably
to their peers on such matters of interest to them as the cost of the texts,
the extent to which attendance is taken and weighted, and whether a great
deal of homework is required--considerations that have little or no known
bearing on the quality of instruction.
8. Student ratings represent participation in a process often represented
as "democratic decisionmaking."
9. The "best available alternative" line of argument.
This digest was condensed from "Using Student Ratings in Teacher Evaluation,"
by Dr. Michael Scriven, Project Director, Teacher Evaluation Models Project,
Center for Research on Educational Accountability and Teacher Evaluation (CREATE).
REFERENCES
Abrami, P.C. (1989). How Should We Use Student Ratings to Evaluate Teaching?
"Research in Higher Education," 30 (2), 221-227.
Abrami, P.C., d'Apollonia, S., & Cohen, P.A. (1990). Validity of
Student Ratings of Instruction: What We Know and What We Do Not Know. "Journal
of Educational Psychology," 82 (2), 219-231.
L'Hommedieu, R., Menges, R.J., & Brinko, K.T. (1990). Methodological
Explanations for the Modest Effects of Feedback from Student Ratings. "Journal
of Educational Psychology," 82 (2), 232-241.
Scriven, M. (1994). Using Student Ratings in Teacher Evaluation. "Evaluation
Perspectives" (Newsletter of The Center for Research on Educational Accountability
and Teacher Evaluation), 4 (1), 1-4.