ERIC Identifier: ED314428
Publication Date: 1989-12-00
Author: Echternacht, Gary
Source: ERIC Clearinghouse on Tests, Measurement, and Evaluation, Washington,
DC; American Institutes for Research, Washington, DC.
Interpreting Test Scores for Compensatory Education Students.
To follow the rules and regulations of compensatory education programs
correctly, you must use objective measures when you select students for
programs, assess their progress, and monitor the program's quality. Because you
are under this pressure to use standardized test scores, you should make sure
that you use the tests correctly.
In this digest, I point to four practices that administrators often
mistakenly follow when they use test scores:
o using test scores alone to select students for programs,
o giving out-of-level tests,
o misinterpreting the term grade-level, and
o failing to differentiate the degree of error in individual and group
scores.
Although these practices may not be widespread, they are serious.
DON'T USE TEST SCORES ALONE TO SELECT STUDENTS FOR PROGRAMS
Program regulations for Chapter 1 require that you select students
by using objective measures. In addition, state departments of education
sometimes impose other requirements--for example, a program can serve only
students who score below the 40th percentile rank or all students who score
below the 20th percentile rank.
These requirements often lead administrators to select students on the basis
of test scores alone because
o the requirements are stated in terms of test scores, and
o when program monitors review programs, they appraise them in terms of state
and federal regulations.
Nevertheless, you should not make a decision about an individual student by
using a test score by itself. It is acceptable to use test scores as one step
in a sequence of assessments, but it is unacceptable to use a test score as the
only assessment. You are unfair to students if you simply say that all students
who score below the 40th percentile rank are in the program and all who score
above the 40th percentile rank are ineligible.
You must remember that test scores are neither completely reliable nor valid
indicators of academic performance. For example, if students take an equivalent
form of a test at different times, their scores will change somewhat. This
unreliability matters most for students whose scores are near the cut-off score
for selection: if you administer the same test a second time, some students
who previously scored below the cut-off may score above it.
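As a rough illustration of why scores near a cut-off are unstable, the sketch
below estimates how likely a student is to score above a cut-off on a single
testing. It assumes that measurement error is normally distributed, which this
digest does not state, and it uses a hypothetical cut-off of 40 raw score
points together with an illustrative standard error of 2.5 raw score points.
The function name and the example true scores are also hypothetical.

```python
import math


def prob_above_cutoff(true_score, cutoff, sem):
    """Chance that one observed score lands above the cut-off.

    Assumes normally distributed measurement error whose standard
    deviation equals the standard error of measurement (sem).
    """
    z = (cutoff - true_score) / sem
    # Standard normal CDF written with the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - cdf


CUTOFF = 40.0  # hypothetical cut-off, in raw score points
SEM = 2.5      # illustrative standard error, in raw score points

for true_score in (30.0, 38.0, 40.0):
    p = prob_above_cutoff(true_score, CUTOFF, SEM)
    print(f"true score {true_score}: P(above cut-off) = {p:.3f}")
```

Under these assumptions, a student whose true score sits exactly at the cut-off
is as likely to land above it as below it on any one testing, while a student
far below the cut-off almost never crosses it.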
Similarly, reading tests give you only general measures of reading ability.
Some students may be good readers in certain content areas, yet they may score
poorly on a given test because the reading passages in that test do not include
the content areas they know.
Good programs select students by using several assessment tools, rather than
just one. Although the regulations do not explicitly state other requirements,
they do allow you to use additional assessment tools in selecting students. Ask
your state director how you can best use other assessment tools, such as report
card grades, results of other tests, and systematic teacher assessments obtained
Some common methods for using multiple assessments are:
o selecting students who score below prescribed cut-offs on both your
district's standardized test and another state-mandated test;
o using your district's standardized test to identify a pool of possible
participants, then using either a teacher-completed questionnaire or report card
grades to select students from the pool;
o using a systematic method for obtaining teachers' judgments about students'
needs in order to identify a pool of possible participants, then using a
standardized test to select students from the pool; or
o using the standardized test to identify a pool of students, then creating a
study team to select students from the pool and carefully documenting the study
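One of the two-stage methods above (a standardized test to form a pool, then a
teacher-completed questionnaire to select from the pool) might be sketched as
follows. This is only an illustration: the 1-to-5 teacher rating scale, the
rating threshold, the field names, and the sample roster are all hypothetical,
and your state's rules determine the real cut-offs.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    percentile_rank: int  # district standardized test result
    teacher_rating: int   # hypothetical scale: 1 (low need) to 5 (high need)


def select_students(students, test_cutoff=40, rating_threshold=4):
    """Two-stage selection: test score forms the pool, teachers select."""
    # Stage 1: the standardized test identifies a pool of possible participants.
    pool = [s for s in students if s.percentile_rank < test_cutoff]
    # Stage 2: the teacher rating selects students from that pool.
    return [s for s in pool if s.teacher_rating >= rating_threshold]


roster = [
    Student("A", 22, 5),
    Student("B", 38, 2),  # low score, but teachers see little need
    Student("C", 55, 5),  # high need reported, but outside the test pool
    Student("D", 35, 4),
]

selected = select_students(roster)
print([s.name for s in selected])
```

Neither measure decides alone here: a student enters the program only when the
test score and the teacher's judgment agree.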
DON'T GIVE OUT-OF-LEVEL TESTS
Out-of-level testing occurs
when you give a standardized test to students who are at a different grade level
than the one for which the test is designed. In some cases, school officials use
out-of-level tests in compensatory programs because students in those programs
are behind their peers and find in-level testing frustrating. Administrators who
follow this practice believe that somehow it is more valid to give those
students tests designed for lower grade levels.
While out-of-level tests may be less frustrating to some students, the scores
obtained from them are also less valid because
o the content for out-of-level tests does not represent the content taught in
o the scale that test publishers use to link different test levels is loaded
o there are no norms for out-of-level tests,
o scores obtained on tests of different difficulty are not comparable, and
o when obtained, out-of-level scores appear to be too low.
Although in-level test scores are more reliable in the middle than at the
high- and low-score ranges, they are quite reliable in placing students at the
high or low end of the scale. For example, with a reasonable degree of
assurance, we can say that a student who scores at the 10th percentile rank is
most likely a low-achieving student. What we are less sure about is whether the
student is at the 10th percentile rank or the 15th percentile rank. Either way,
we are reasonable in concluding that the student is low achieving.
You should use tests at the grade levels for which they are specified by the
test publisher. Generally, the content of grade-level tests will represent what
is taught in regular classrooms at the specified level.
If your compensatory program is good, it will be closely coordinated with
instruction in the regular classroom. Since the purpose of compensatory
education is to help students succeed in the regular classroom, using in-level
tests will help you in the coordination.
UNDERSTAND THE TERM "GRADE-LEVEL"
Generally, when school
personnel say that certain students perform at grade-level, they mean that those
students can learn material at about the same rate and quality as others in the
same class. The implication is that students who don't perform at grade-level
have significantly more difficulty in class than their peers. Accordingly, when
students are labeled as working below grade-level, the implication is that they
may not have the aptitude, maturity, or interest to do the work that others in
the same class are doing. Most people interpret students' abilities in this
way.
In contrast, in the testing arena, "at grade-level" has a different meaning.
When students score at grade-level, their scores are at the 50th percentile
rank. This means that about half of their peers score higher and about half
score lower. In testing, "at grade-level" does not relate to how well students
perform in the classroom. Therefore, when you review students' scores, you must
consider that, by definition, about half of all students score below
grade-level.
Historically, the term grade-level has been important in the politics of
compensatory education. Proponents of compensatory education programs have
always said that those programs were underfunded because many students who
performed below grade-level did not receive program services. In this case,
performing below grade-level was defined as scoring below the 50th percentile
rank. While it is true that compensatory education may be underfunded and, I
believe, is an important part of schooling, it is inappropriate to use the term
grade-level in the true testing-related sense.
Since most people use the term grade-level in the general sense, you can
either avoid using grade-equivalent test scores or develop a range of scores
that indicate satisfactory achievement in the classroom. You may also think of
average performance on a test as being between the 23rd and the 77th percentile
ranks.
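The idea of treating a range of scores as average, rather than a single point,
can be sketched in a few lines. The band of the 23rd to 77th percentile rank
comes from the text; the function name, the labels, and the choice to count the
band's endpoints as average are assumptions made for illustration.

```python
def describe_percentile_rank(pr, low=23, high=77):
    """Label a percentile rank using an average band instead of one point."""
    if not 1 <= pr <= 99:
        raise ValueError("percentile ranks run from 1 to 99")
    if pr < low:
        return "below average"
    if pr > high:
        return "above average"
    # Endpoints of the band are treated as average (an assumption).
    return "average"


# A score at the 45th percentile rank is "below grade-level" in the strict
# testing sense (below the 50th), yet it falls well inside the average band.
for pr in (10, 45, 50, 80):
    print(pr, describe_percentile_rank(pr))
```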
DIFFERENTIATE THE DEGREE OF ERROR IN INDIVIDUAL AND GROUP SCORES
Administrators tend to interpret differences in test scores in one of
two ways. First, they may think that a difference of one or two percentile rank
points is an important difference. Second, they may think that a difference of
ten points shows that the test is unreliable. Few administrators can
differentiate the degree of error in individual and group scores.
An individual test score is just that -- the score that an individual student
receives on a test. A group score is the average of several individual scores.
For example, the average score of third graders at Horace Mann Elementary School
is a group score.
In general, individual scores have more error in them than group scores do.
The error in an individual score is largely a function of the test's standard
error that is described in the publisher's technical manual. For most of the
tests given in elementary and secondary schools, the standard error is about 2.5
raw score points. This means that about 95% of the time, we would expect the
scores for individual students to fall within a range of 10 raw score points,
that is, within two standard errors (5 points) on either side of the student's
true score.
That is not particularly reassuring, but it is exactly why we need to use
multiple measures for selecting students and why, for most of the tests we use,
we should be a little skeptical of individual test scores and cautious in
interpreting them.
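The arithmetic behind the 10-point range can be written out as a small sketch.
It uses the standard error of about 2.5 raw score points mentioned above and
assumes the usual rule of thumb that about 95% of scores fall within two
standard errors; the function name and the example score of 32 are
hypothetical.

```python
def score_band(observed, sem=2.5, z=2.0):
    """Return the (low, high) band of z standard errors around a score.

    z = 2.0 is the common rule-of-thumb multiplier for roughly 95% coverage.
    """
    half_width = z * sem
    return observed - half_width, observed + half_width


low, high = score_band(32)  # hypothetical raw score
print(f"band: {low} to {high}, width {high - low} raw score points")
```

If a cut-off falls anywhere inside that band, the test score by itself cannot
tell you on which side of the cut-off the student really belongs.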
The error in group scores largely depends on the size of the group. Once you
have a group of about 30 scores, the magnitude of the errors decreases. By the
time you average all the scores for your school district, you can regard the
results as accurate as long as there is not some systematic bias operating for
most everyone in the district.
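A minimal sketch of why group size matters: if the errors in individual scores
are random and independent, the error in an average shrinks with the square
root of the number of scores averaged. That independence is an assumption, and
it is exactly what a systematic bias would violate. The starting value of 2.5
raw score points is the individual standard error cited earlier; the group
sizes other than 30 are hypothetical.

```python
import math


def error_of_average(sem, n):
    """Standard error of the mean of n scores with independent random error."""
    if n < 1:
        raise ValueError("need at least one score")
    return sem / math.sqrt(n)


SEM = 2.5  # individual standard error, in raw score points

# Hypothetical group sizes, from one student up to a whole district.
for label, n in [("individual", 1), ("classroom", 30),
                 ("building", 300), ("district", 3000)]:
    print(f"{label:>10} (n={n}): about {error_of_average(SEM, n):.2f} points")
```

The sketch covers random measurement error only. A bias that pushes every score
in the same direction does not average out, no matter how large the group.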
You can be confident of your interpretation when you consider score averages
of large groups. For instance, if the average for a group of 55 scores changes
by one or two percentile rank points, then that is an important change. If you
consider averages based on fewer cases, you must be more cautious. You can be
more or less confident of average scores depending on the level at which they
are computed. There is a definite hierarchy in the strength of your
interpretations. Your interpretations are most sure when you consider district
averages, followed in order by building averages, classroom averages, and
finally individual students' scores.