ERIC Identifier: ED447147
Publication Date: 2000-09-00
Author: Russell, Michael
Source: ERIC Clearinghouse on Assessment and Evaluation College Park MD.
Using Expected Growth Size Estimates To Summarize Test Score Changes. ERIC/AE Digest.
An earlier digest described the shortcomings of three methods commonly used to summarize changes in test scores (Russell, 2000). This article describes two less commonly used approaches for examining change in test scores, namely Standardized Growth Estimates and Effect Sizes. Aspects of these two approaches are combined and applied to the Iowa Test of Basic Skills (ITBS) to demonstrate the utility of using a third method, termed Expected Growth Size, to examine change in test scores.
STANDARDIZED GROWTH ESTIMATES
Stenner, Hunter, Bland, and Cooper describe a standardized growth expectation (SGE) as "the amount of growth
(expressed in standard deviation form) that a student must demonstrate over a
given treatment interval to maintain his/her relative standing in the norm
group" (1978, p. 1). To determine an SGE, Stenner et al. proposed the following
three-step method.
Step 1. The scale score associated with the 50th percentile for a given grade level or the pre-test is identified.
Step 2. The percentile rank for the following grade level or the post-test associated with this scale score is found.
Step 3. The difference between the 50th percentile and the post-test percentile is calculated. To determine this difference, a unit normal deviate table is used to convert percentiles to z-scores and the z-score for the post-test is subtracted from the z-score for the pre-test.
The difference between the pre-test and post-test z-scores is the SGE and expresses "the amount of loss in relative standing that such a student would suffer if he/she learned nothing during the time period" (Stenner et al., 1978, p. 1).
As an example, to determine the SGE for grade 3, Table 1 indicates that the scale score associated with the 50th percentile for grade 3 on the ITBS Language sub-test is 174. The percentile rank for grade 4 that corresponds to a scale score of 174 is 26. If a student received the same scale score in grades 3 and 4, his/her percentile rank would drop from 50 to 26. After both percentiles are converted to z-scores and subtracted, the difference between the two z-scores represents the SGE. In this case, the z-scores corresponding to percentile ranks of 50 and 26 are 0 and -.64, respectively. Thus, the SGE is .64, which indicates a relative loss of .64 standard deviations for a student who shows no change in his/her test score.
EFFECT SIZES
When applying Stenner et al.'s method for
calculating SGEs, Haney, Madaus and Lyons (1993, pp. 231-32) point out that the
idea of an SGE is analogous to an effect size in that each represents the
difference in mean performance of two groups expressed in standard scores. As
Glass, McGaw and Smith (1981) describe, an effect size represents the difference
between two groups in standard deviations. To calculate an effect size, the
difference between the means of the experimental group and the control group is
divided by the standard deviation of the control group. Conceptually, the only
difference between an effect size and an SGE is that an effect size is used to
compare the means of a "control" group and an "experimental" group while an SGE
compares the performance of groups of students at various grade levels.
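The three-step SGE calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the standard library's inverse normal CDF stands in for the printed unit normal deviate table. The percentile ranks (50 and 26) are the ones given in the grade 3 example.

```python
from statistics import NormalDist


def percentile_to_z(percentile_rank: float) -> float:
    """Convert a percentile rank (between 0 and 100, exclusive) to a z-score.

    Plays the role of the unit normal deviate table lookup in Step 3.
    """
    return NormalDist().inv_cdf(percentile_rank / 100.0)


def standardized_growth_expectation(pre_percentile: float,
                                    post_percentile: float) -> float:
    """SGE: pre-test z-score minus post-test z-score (hypothetical helper)."""
    return percentile_to_z(pre_percentile) - percentile_to_z(post_percentile)


# Grade 3 example from the text: a scale score at the 50th percentile in
# grade 3 falls at the 26th percentile in grade 4.
sge_grade3 = standardized_growth_expectation(50, 26)
print(round(sge_grade3, 2))  # 0.64
```

Rounded to two decimals, the result agrees with the SGE of .64 reported for grade 3.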
-----See TABLE 1 at end of digest.
-----
EXPECTED GROWTH SIZE
Although an SGE and an effect size are similar,
there is one important difference: an SGE focuses on the standing lost when
there is no change in test score, while the effect size focuses on the amount of
change in a test score necessary to maintain one's standing. When applied in
this manner, the effect size method provides an estimate of the expected growth
size between two time periods. In the example above, the expected growth size
(EGS) between grade three and grade four on the ITBS Composite Language test is
.89 standard deviations.
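As a rough illustration of applying the effect size formula to two grade levels, the sketch below divides the difference between post-test and pre-test average scores by the pre-test standard deviation. It assumes the pre-test group plays the role of the control group, which is the choice the digest goes on to argue for. The function name and the input values are hypothetical and are not taken from the ITBS norms.

```python
def expected_growth_size(pre_mean: float, post_mean: float,
                         pre_sd: float) -> float:
    """Effect-size-style growth estimate referenced to the pre-test.

    Assumption: the pre-test distribution is treated as the "control"
    group, so its standard deviation is the denominator.
    """
    if pre_sd <= 0:
        raise ValueError("pre_sd must be positive")
    return (post_mean - pre_mean) / pre_sd


# Hypothetical values chosen only for illustration (not ITBS data):
# average scale score of 200 in the earlier grade, 212 in the later
# grade, and a standard deviation of 15 in the earlier grade.
egs = expected_growth_size(pre_mean=200.0, post_mean=212.0, pre_sd=15.0)
print(round(egs, 2))  # 0.8
```

Read this way, a value of .8 would mean that a student must gain .8 pre-test standard deviations between the two grades to maintain his/her standing.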
DEFINING THE BASE YEAR OR CONTROL GROUP
In a well-designed
experiment, there is little question as to which group is defined as the control
group and which is the experimental group. However, when applying the concept of
an effect size to change in test scores between two grade levels, one could
reference growth to the pre-test or the post-test distribution.
In the case of SGEs, the post-test distribution is used to reference "growth". Note, however, that although SGEs employ the term growth, the methodology actually provides a measure of loss, assuming that a student experiences no growth whatsoever. In this way, using the post-test distribution to reference "growth" is fundamentally flawed in that change is placed in the context of where a student is expected to be rather than where he/she started. The situation is analogous to describing someone's progress on a trip in relation to how far they still must go to reach their destination rather than how far they have traveled since their departure.
In the case of using an effect size to express growth between two grade levels, one might argue that the pooled standard deviation should be employed in lieu of the standard deviation of the control group. However, the difficulty of obtaining an estimate of the pooled standard deviation for most standardized tests forces a choice between designating the pre-test or the post-test as the control group. Given the desire to measure change or growth from where a group begins at one point in time to where it ends at a second point in time, the EGS methodology references change to the pre-test distribution. For this reason, the pre-test distribution is assigned as the control group.
ADVANTAGES OF AN EXPECTED GROWTH SIZE
Although an expected
growth size is more difficult to calculate, it offers three advantages. First,
by expressing change in relation to the standard deviation, growth rates for
different tests and different grade levels can be compared directly. Table 2
presents expected growth sizes for grades 1 through 8 for several portions of
the ITBS. Examining Table 2, one can see that the expected growth sizes differ
for each portion of the ITBS. Table 2 also shows an inverse relationship between
grade level and size of expected growth. As the grade level increases, the
amount of growth students experience decreases.
-----See TABLE 2 at end of digest.
-----
Similarly, within each grade level, the amount of
growth students experience varies by percentile ranks. Students scoring at the
25th percentile experience less growth than students scoring at the mean. And
students scoring at the mean experience less growth than students scoring at the
75th percentile. This pattern explains why the standard deviation for most
standardized tests increases as the grade level progresses.
Second, once expected growth sizes are calculated for a given test, they can be easily transformed to more common measurement scales. As an example, multiplying the expected growth size by the standard deviation of the Normal Curve Equivalent (NCE) scale (21.06) provides the number of NCE points a student's score increases during a given time period relative to the student's initial norm group when s/he maintains his/her current standing. For the ITBS Language test, the score for a student who maintains a 50th percentile ranking increases 18.74 NCEs (.89 x 21.06) between the third and fourth grade.
Third, once expected growth sizes are transformed to an NCE scale, changes in an individual's or a group's mean score can be reported in relation to expected growth. Performance on most standardized tests is reported relative to the Norm Group for a student's current grade. If the student grows at the same rate as other students in the Norm Group, his/her percentile rank and NCE will remain the same across two years. However, if the student's rate of growth differs from that of the Norm Group, his/her NCE and percentile rank will change. The expected growth size can be used to determine the extent to which the student's growth exceeded or fell short of the expected growth size. To do so, the student's previous NCE is subtracted from his/her current NCE and the difference is divided by the expected NCE growth size. As an example, consider a student whose NCE for the ITBS Language test increased from 50 in grade 3 to 55 in grade 4. Dividing this five-point increase by the expected NCE growth size for third grade (18.74) gives .27. Because a student whose NCE remains unchanged has grown by exactly the expected amount (one year of growth), this student's gain represents 1.27 years of growth. Thus, the student's score increased 27% more than expected.
As Table 2 indicates, growth sizes vary across grade levels. Expressing change in test scores in relation to expected growth size takes these differences in growth rates into consideration.
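The NCE arithmetic in the example above can be checked with a short Python sketch. The function names are hypothetical. The "1 +" term reflects an interpretation of the digest's figure of 1.27 years: an unchanged NCE is taken to mean exactly one year of expected growth, and the NCE gain adds a fraction of a year on top of it.

```python
NCE_SD = 21.06  # standard deviation of the NCE scale, as given in the text


def egs_to_nce(expected_growth_size: float) -> float:
    """Expected growth, in NCE points, for a student who keeps his/her standing."""
    return expected_growth_size * NCE_SD


def years_of_growth(previous_nce: float, current_nce: float,
                    expected_nce_growth: float) -> float:
    """Growth relative to expected growth.

    Assumption: 1.0 means the student grew exactly as expected (NCE
    unchanged); the NCE change adds or subtracts a fraction of that.
    """
    return 1.0 + (current_nce - previous_nce) / expected_nce_growth


# ITBS Language example from the text: EGS of .89 between grades 3 and 4,
# and a student whose NCE rises from 50 to 55.
expected_nce = egs_to_nce(0.89)
growth = years_of_growth(50, 55, expected_nce)
print(round(expected_nce, 2))  # 18.74
print(round(growth, 2))        # 1.27
```

Under the same assumption, a student whose NCE fell from 50 to 45 would show about .73 years of growth.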
The extent to which performance changes is placed in the context of how scores generally change for students in a given grade. As a result, a more accurate measure of how a student changes relative to other students in his/her grade is produced. As an example, Table 2 shows that students in grade 2 experience about twice as much growth in their test scores as students in grade 5. For this reason, an increase of 5 NCEs on the ITBS Composite Math test represents larger growth relative to expected growth for a student in grade 5 than for a student in grade 2.
LIMITATIONS OF EXPECTED GROWTH SIZES
Although expected
growth sizes provide a sounder approach for summarizing change in test scores
than some of the more commonly used approaches, their use is limited to
norm-referenced standardized tests. Moreover, the EGS methodology assumes that the
tests have been vertically equated. When comparing change across multiple years,
the methodology also assumes that the tests administered each year provide
measures of the same construct based on identical content. Although most
norm-referenced tests attempt to meet both assumptions (vertical equating and
measurement of the same construct), the extent to which they fail to meet these
assumptions affects the accuracy of estimates yielded by the EGS methodology.
Finally, as with all comparisons of change over time, the EGS method is also
limited by the reliability of the scores used to calculate change. Although
there is considerable debate over the extent to which low score reliability
impacts the meaningfulness of change scores, caution is advised when employing
the EGS method for tests with low reliability (see Willett, 1988, for a fuller
discussion of reliability and change scores).
USING EXPECTED GROWTH SIZES FOR YOUR STUDENTS
To apply
expected growth sizes to examine change in the performance of their students,
readers are encouraged to use the attached spreadsheet. The spreadsheet provides
an easy-to-use template that allows users to calculate expected growth sizes for
most standardized tests. In addition, the spreadsheet translates expected growth
sizes into expected changes in NCE scores for each grade level.
As the attached instructions indicate, two pieces of information are required to use the spreadsheet:
1. Standard Score to Percentile Rank Conversion tables for the standardized test; and
2. The standard deviation for the standard score for each grade level.
This information is available in the Technical Report(s) for each standardized test.
Although expected growth sizes are more complicated to calculate, they provide a more accurate and comparable method of examining change in test scores within and across grade levels and on different tests.
REFERENCES
Glass, G., McGaw, B., & Smith, M. L. (1981). Meta-analysis in Social Research. Beverly Hills: Sage.
Haney, W., Madaus, G., & Lyons, R. (1993). The Fractured Marketplace for Standardized Testing. Boston, MA: Kluwer Academic Publishers.
Russell, M. (2000). Summarizing change in test scores: shortcomings of three common methods. ERIC Digest Series. Also in Practical Assessment, Research and Evaluation, 7(5). [Available online: http://ericae.net/pare/getvn.asp?v=7&n=5 ].
Stenner, A. J., Hunter, E. L., Bland, J. D., & Cooper, M. L. (1978). The standardized growth expectation: Implications for educational evaluation. Paper presented at the Annual Conference of the American Educational Research Association, Toronto, Canada. (ERIC Document Reproduction Service Number ED 169 072.)
Willett, J. (1988). Questions and answers in the measurement of change. In E. Z. Rothkopf (Ed.), Review of Research in Education 15 (pp. 345-422). Washington, DC: American Educational Research Association.
This Digest is based on Russell, Michael (2000). Using Expected Growth Size Estimates to Summarize Test Score Changes. Practical Assessment, Research & Evaluation, 7(6). Available online: http://ericae.net/pare/getvn.asp?v=7&n=6.