ERIC Identifier: ED465544
Publication Date: 2001-12-00
Author: Haury, David L.
Source: ERIC Clearinghouse for
Science Mathematics and Environmental Education Columbus OH.
The State of State Proficiency Testing in Science. ERIC Digest.
Schools across the United States are striving to improve student performance
in science by adjusting curricula and teaching practices to meet national and
state standards. "Standards-based reform" is the rallying cry for these efforts
to enliven the "National Science Education Standards" (NSES: National Research
Council, 1996). Ongoing reform in science education has intensified in response
to the results of widely reported national and international studies of student
understanding. Despite rapid advancements in science and technology within the
nation, most U.S. school students have not performed all that well on tests of
scientific knowledge and understanding.
The most recent results in science from the National Assessment of
Educational Progress show no statistically significant changes in average
student scores at grades 4 or 8 since 1996, but the average scores for students
in grade 12 have declined (See
http://nces.ed.gov/nationsreportcard/science/results/). Results from the Third
International Mathematics and Science
Study (TIMSS) were even more jarring. Though results across the states were
highly variable, U.S. students overall achieved mediocre scores compared to the
students of other developed nations (U.S. National TIMSS site:
http://ustimss.msu.edu/; International TIMSS site: http://timss.bc.edu/). After
years of ongoing science education reform, U.S. schools are now beginning to be
held accountable for higher levels of performance among students.
THE MOVE TO HIGH-STAKES TESTING
One prominent new strategy
for ensuring accountability and higher performance among students has come to be
known as "high-stakes" testing, the use of test scores to determine which
students will graduate or which will be promoted from one grade to the next. In
some cases the stakes may also include decisions about which teachers will get
salary bonuses, or which schools will get extra funds to support academic
improvements. This rapidly spreading practice was once described as "the latest
silver bullet designed to cure all that ails public education" (Kunen, 1997).
But is it a bullet that cures, or does it kill? Does high-stakes accountability
testing support standards-based reform efforts, or hinder them?
While proponents see high-stakes testing as a means of holding schools,
teachers, and students to high standards, some view testing as being
inconsistent with the stated goals of the NSES (Huber & Moore, 2000).
Indeed, the NSES (pp. 52, 72, 113, & 239) call for less emphasis on external
assessments and standardized tests unrelated to "Standards"-based programs and
Response to standardized tests by the general public seems mixed. According
to the most recent Phi Delta Kappa/Gallup Poll (available online at:
http://www.pdkintl.org/kappan/k0109gal.htm), 44% of those polled thought there
was just the right amount of emphasis on standardized testing, but 51% of public
school parents opposed "using a single standardized test --to determine whether
a student should be promoted from grade to grade." Interestingly, only 45% of
public school parents opposed "using a single standardized test --to determine
whether a student should receive a high school diploma."
Stronger support is reported in a survey sponsored by The Business Roundtable
(available online at: http://www.brtable.org/press.cfm/453), which found that 65%
of parents and 70% of the general public support a policy of requiring students
to "pass statewide tests before they can graduate from high school, even if they
have passing grades in their classes." This is viewed as good news for the
business community that has supported the push for rigorous education standards
for some time.
UNINTENDED OUTCOMES OF HIGH-STAKES TESTING
Despite broad-based support for high-stakes testing, there is organized opposition
(Schrag, 2000). Complaints range from concerns that the testing is "killing"
innovative teaching and driving out good teachers to claims that tests
overstress young students and are unfair to poor and minority students and
others who lack test-taking skills. Others say that such tests limit the
curriculum and "snuff out both creative teaching and the joy of learning" (Blair
& Archer, 2001).
At a more fundamental level, questions about the validity of high-stakes
tests and the ways they are being used and interpreted threaten to undermine the
entire standards-based reform movement (Domenech, 2000). Objectivity and
"teaching to the tests" are real concerns. In addition to narrowing the focus of
instruction and assessment, there is an added risk of overburdening students and
teachers through practices that may lead to inappropriate inferences about
student performance (Ananda & Rabinowitz, 2000).
Finally, some claim that high-stakes testing creates a system that is unfair
and destructive to learning, and that tougher standards and standardized testing
are uniquely harmful to low-income and minority students (Kohn, 2000). While
high-stakes testing may raise the level of education overall and raise the level
of success by some students after graduation, the tests will exacerbate the
problems of those already at risk or struggling to overcome disadvantaged
backgrounds (Orfield & Kornhaber, 2001).
STATUS OF TESTING IN SCIENCE
During Fall, 2001, the Council
of Chief State School Officers (CCSSO) published the "1999-2000 Annual Survey of
State Student Assessment Programs" (See
http://publications.ccsso.org/ccsso/publication_detail.cfm?PID=350). Of the states
surveyed, 39 reported that their state testing program included some form of
proficiency testing in science. The results of state testing programs were used in
making decisions about student promotion or retention in nine states, and
passing scores were required for graduation in 17 states. Test results were
included in reports of school performance in 37 states, and test results were
used in making school improvement plans in 30 states. In only six states were
test results used for staff accountability purposes, with four states using
results as a basis for monetary rewards, such as bonuses.
The impact of one state testing program has been closely examined (Huber
& Moore, 2000), and evidence indicates that the highly publicized, model
program has "derailed efforts to implement standards-based reforms" in science.
Though high-stakes testing programs and the NSES appear to be at cross-purposes
in several regards, two areas are of particular concern: equity and excellence.
With regard to equity issues, the testing program accentuates well-documented
barriers to learning science among selected groups of students. In addition to
evidence that the tests are biased (see Huber & Moore, 2000), they provide
the basis for sanctions against the low-performing schools that are most in need
of help in developing locally relevant programs.
Even if equity issues were adequately resolved, there remains a fundamental
clash between high-stakes testing and the central features of the NSES. The NSES
place great importance on learning through inquiry, de-emphasizing science as a
body of factual knowledge to focus on science as a way of knowing. It is hoped
that students will learn how to frame questions and use inquiry to find answers,
investigating real problems. High-stakes standardized testing has the opposite
thrust, focusing on a broad body of factual knowledge. Many have claimed that
this emphasis will pressure teachers to "teach to the test" and focus on
particular subjects, and that appears to be happening. In a survey of teachers
(Jones, Jones, Hardin, Chapman, Yarbrough, & Davis, 1999), 80% of
participating teachers reported spending over 21% of their instructional time
practicing for End-of-Grade tests, with over 28% of the teachers spending from
61% to 100% of their instructional time practicing for the tests.
It has been pointed out that assessment must be
aligned with curriculum and instruction to support learning (Pellegrino,
Chudowsky, and Glaser, 2001), so this is an issue that needs much attention as
the practice of high-stakes testing spreads. Webb (1999) has described the
development of new procedures for determining the degree of alignment of science
and mathematics standards with assessment. Three states volunteered to have
their science standards and assessments analyzed for two or three grade levels,
and the results of analysis are highly variable. Four criteria were used in
measuring the degree of alignment:
Categorical Coherence: the extent to which the categories of content appear in
both standards and assessment documents.
Depth-of-Knowledge Consistency: the extent to which the cognitive demand of tests
reflects what students are expected to know.
Range-of-Knowledge Correspondence: the extent to which the span of knowledge
required on the assessment matches the span of knowledge expected of students.
Balance of representation: the extent to which test items are evenly distributed
Though the results of this case study are not generalizable beyond the
participating states, it is interesting to note the pattern of correspondence
between science standards and assessments across the criteria. Though there was
judged to be 100% alignment in terms of "Balance of representation," there was
little "Range-of-Knowledge Correspondence" (0% to 33%). Though somewhat better,
"Categorical Coherence" (38% to 67%) and "Depth-of-Knowledge Consistency"
(25% to 83%) ranged from poorly to highly aligned among individual states.
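As an illustration only, the idea of test items being "evenly distributed" can be made concrete with a small calculation. The sketch below is an assumption, not the procedure reported in the study: the function name, the objective labels, and the specific index (one minus half the summed deviation from an equal share) are hypothetical choices made for this example. Objectives that receive no items at all are not counted in this simplified version.

```python
from collections import Counter


def balance_index(item_objectives):
    """Return 1.0 when items are spread evenly over the objectives they
    address, and lower values as items cluster on a few objectives.

    item_objectives: one objective label per test item (hypothetical input).
    """
    counts = Counter(item_objectives)
    total_items = sum(counts.values())
    if total_items == 0:
        raise ValueError("at least one test item is required")
    n_objectives = len(counts)
    # Compare each objective's share of items with an equal share.
    deviation = sum(
        abs(1 / n_objectives - count / total_items)
        for count in counts.values()
    )
    return 1 - deviation / 2


# Four items spread over four objectives: perfectly balanced.
even = balance_index(["A", "B", "C", "D"])
# Three of four items on one objective: less balanced.
skewed = balance_index(["A", "A", "A", "B"])
print(even, skewed)
```

A state or district applying something like this would first need reviewers to code each test item to a standard or objective; the index only summarizes that coding.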
The most important outcome of the study is the emergence of a process to
judge the alignment between science standards and assessments, and more states
must carefully consider this issue. The CCSSO has developed a research tool based
on these results, the Surveys of Enacted Curriculum (SEC), that provides a
practical, efficient means of obtaining consistent data on mathematics and
science education practices through teacher reports. This approach enables
schools, districts, or states to analyze current classroom practices in relation
to content standards and facilitate program evaluations, curriculum
improvements, interpretation of student assessment results, and alignment of
curricula with standards (See http://www.ccsso.org/sec.html). It is imperative
that states basing important decisions about students, teachers, and schools on
high-stakes tests begin using or developing tools like this. States must quickly
begin a process of alignment between standards and assessment so that "teaching
to the test" becomes "teaching to the standards" in science.
REFERENCES
Ananda, S. & Rabinowitz, S. (2000). "The High
Stakes of HIGH-STAKES Testing" (Policy Brief). San Francisco, CA: WestEd.
Blair, J., & Archer, J. (2001, July 11). NEA members denounce high-stakes
testing. "Education Week," 20 (42), Web-only at
Domenech, D. A. (2000, December). My Stakes Well Done. "School Administrator,"
57 (11), 16-19.
Huber, R. A. & Moore, C. J. (2000). Educational reform through high stakes
testing-Don't go there. "Science Educator," 9 (1), 7-13.
Jones, B. D., Jones, G. M., Hardin, B., Chapman, L., Yarbrough, T., &
Davis, M. (1999, November). The impact of high-stakes testing on teachers and
students. "Phi Delta Kappan," 199-203.
Kohn, A. (2000, September-October). Burnt at the High Stakes. "Journal of
Teacher Education," 51 (4), 315-27.
Kunen, J. S. (1997, June 16). The test of their lives. "Time," 149 (24),
Miller, D. W. (2001, March). Scholars Say High-Stakes Tests Deserve a Failing
Grade. "Chronicle of Higher Education," 47 (25), A14-A16.
National Research Council. (1996). "National science education standards."
Washington, DC: National Academy Press. (Available online at:
Orfield, G. & Kornhaber, M. L. (Eds.). (2001). "Raising standards or
raising barriers?" Washington, DC: The Century Foundation Press.
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). "Knowing
what students know: The science and design of educational assessment."
Washington, DC: National Academy Press. (Available online at:
Schrag, P. (2000, August). "High stakes are for tomatoes." "The Atlantic
Monthly," 286 (2), 19-21. (Available online at:
Webb, N. L. (1999). "Alignment of science and mathematics standards and
assessments in four states" (Research Monograph No. 18). Washington, DC: Council
of Chief State School Officers.