ERIC Identifier: ED328610
Publication Date: 1990-12-00
Author: Childs, Ruth Axman
Source: ERIC Clearinghouse on Tests, Measurement, and Evaluation, Washington, DC;
American Institutes for Research, Washington, DC.
Gender Bias and Fairness. ERIC Digest.
Because test results are often the basis for decisions that affect students'
educational futures, tests should provide equal opportunities for all students
to demonstrate their abilities and knowledge. The issues of gender bias
and fairness in testing are concerned with differences in opportunities
for men and women.
This Digest provides a brief introduction to this complex topic. Commonly
accepted definitions of gender bias and gender fairness are discussed.
Approaches used to detect gender bias and fairness are introduced. Being
aware of gender bias and fairness in testing will prepare you to intelligently
question the uses of the test results which form the basis of decisions
about individual test takers' futures.
WHAT IS GENDER BIAS?
A test is biased if men and women with the same ability levels tend
to obtain different scores. The conditions under which a test is administered,
the wording of individual items, and even a student's attitude toward the
test will affect test results. These factors may change with time as tests
are administered differently, as items are revised, and as students feel
more or less comfortable taking the test. The error caused by these factors
is random and affects men and women alike.
Another type of error is caused by factors which do not change. Known
as systematic error, it is the result of characteristics of the examinees
that are stable (such as gender or race) and that are characteristics other
than those the test is intended to measure. Gender bias in testing is often
the result of such systematic error.
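The distinction between random and systematic error can be illustrated with a
small simulation. This is only a sketch under stated assumptions: the function
simulate_scores, the true ability of 50, the noise level, and the 3-point
shift are all hypothetical values chosen for illustration, not figures from
any real test.

```python
import random
from statistics import mean

def simulate_scores(n, systematic_shift, noise_sd, seed):
    """Simulate n observed scores for examinees who share one true ability.

    Each observed score = true ability + systematic shift + random error.
    All numbers here are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_ability = 50.0
    return [
        true_ability + systematic_shift + rng.gauss(0, noise_sd)
        for _ in range(n)
    ]

# Both groups have the same true ability and the same amount of random error.
# Only group_b carries a systematic shift unrelated to ability.
group_a = simulate_scores(10000, systematic_shift=0.0, noise_sd=5.0, seed=1)
group_b = simulate_scores(10000, systematic_shift=-3.0, noise_sd=5.0, seed=2)

# Random error largely averages out across many examinees;
# the systematic shift remains in the difference between group means.
gap = mean(group_a) - mean(group_b)
print(f"difference in group means: {gap:.2f}")
```

In a run like this the random error mostly cancels within each group, so the
difference in group means stays close to the assumed systematic shift even
though the two groups have identical ability.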
WHAT IS GENDER FAIRNESS?
While bias is a characteristic of the test itself, fairness refers to
the ways test results are used. Even an unbiased test may be used in ways
that give an advantage to members of one gender. For example, a testing
policy may treat test results differently for men and women.
DETECTING GENDER BIAS IN TESTING
Test publishers go to great lengths to make sure that the questions
contained in their tests are not biased and that the recommended uses of
the tests are not likely to be unfair to members of one gender. However,
no test is perfect. Careful examination of a test's questions may alert
the educator to possible bias.
Test questions may be checked for:
* material or references that may be offensive to members of one gender,
* references to objects and ideas that are likely to be more familiar
to men or to women,
* unequal representation of men and women as actors in test items or
representation of members of each gender only in stereotyped roles.
If the questions involve objects and ideas that are more familiar or
less offensive to members of one gender, then the test may be easier for
individuals of that gender. Standards for achievement on such a test may
be unfair to individuals of the gender that is less familiar with or more
offended by the objects and ideas discussed, because it may be more difficult
for such individuals to demonstrate their abilities or their knowledge
of the material.
Although examination of the test items may reveal that the test contains
questions which have the potential to yield biased results, such an examination
may not be sufficient to determine bias. Statistical techniques are often
used to test for systematic gender differences.
GENDER BIAS IN APTITUDE TESTS
Aptitude tests predict a future outcome. College entrance tests, for
example, are usually administered to students during their senior year
of high school and are intended to predict their grades during their first
year of college. It would not be appropriate to check for bias in a college
entrance test by just comparing average scores for men and women. You must
also consider college grades.
Determining whether such a test is biased involves using statistical
techniques to calculate the predictive relationship separately for each
gender. If the relationships are the same for men and women, we can say
with confidence that the test predicts equally well for both genders.
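One simple way to calculate the predictive relationship separately for each
gender is to fit a straight line from test score to first-year GPA within
each group and compare the results. The sketch below assumes a linear
relationship and uses ordinary least squares; the function names fit_line and
predictive_lines and the data are hypothetical, and a real study would also
test whether any difference between the lines is statistically significant.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def predictive_lines(records):
    """Fit a separate score-to-GPA line for each group.

    records: iterable of (group, test_score, first_year_gpa) tuples.
    Returns a dict mapping each group to its (slope, intercept).
    """
    by_group = {}
    for group, score, gpa in records:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(score)
        ys.append(gpa)
    return {g: fit_line(xs, ys) for g, (xs, ys) in by_group.items()}

# Made-up illustrative data, not real test results.
records = [
    ("women", 400, 2.6), ("women", 500, 3.0), ("women", 600, 3.4),
    ("men", 400, 2.4), ("men", 500, 2.8), ("men", 600, 3.2),
]

lines = predictive_lines(records)
for group, (slope, intercept) in lines.items():
    print(f"{group}: GPA = {intercept:.2f} + {slope:.4f} * score")
```

With this made-up data the two lines have the same slope, but the line for
women sits about 0.2 grade points higher. A single line fitted to everyone
would then tend to underpredict women's grades and overpredict men's, which
is the kind of difference these techniques are meant to reveal.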
These techniques have been used in recent studies concerning gender
bias in college entrance tests. Several studies (such as those reported
by Rosser in The SAT Gender Gap) have found that, while women tend to earn
lower scores than men on some college entrance tests, they tend to have
higher grade point averages during their first year of college. While inconclusive,
these studies suggest that either 1) the predictive relationship between
test scores and freshman GPAs is not the same for both genders or 2)
there is a systematic bias in the assignment of college grades.
GENDER BIAS IN ACHIEVEMENT TESTS
While aptitude tests predict future academic success, achievement tests
measure an individual's knowledge. If it were possible to determine the
exact amount of knowledge an individual possesses about a given topic,
then establishing whether a test was biased would involve comparing the
test scores with the actual knowledge levels and checking for systematic
differences in these comparisons for men and women. Since it is not possible
to obtain a perfectly accurate measure of an individual's level of knowledge,
achievement test scores are usually compared with teacher-assigned grades
and with scores from other tests. If one gender's scores on the test being
considered are higher or lower than the other gender's, but classroom grades
or scores on similar tests are comparable for both genders, the test may be
biased.
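The comparison described above can be sketched by expressing the gender gap
on each measure in standard deviation units, so that test scores and grades
can be set side by side. The function name standardized_gap and all data
below are hypothetical; dividing by the standard deviation of the combined
sample is one simple choice among several.

```python
from statistics import mean, pstdev

def standardized_gap(group_a, group_b):
    """Mean of group_a minus mean of group_b, divided by the standard
    deviation of the two groups combined into one sample."""
    spread = pstdev(list(group_a) + list(group_b))
    return (mean(group_a) - mean(group_b)) / spread

# Made-up illustrative data, not real results.
test_scores = {"women": [70, 75, 80], "men": [78, 83, 88]}
grades = {"women": [3.0, 3.2, 3.4], "men": [3.0, 3.2, 3.4]}

test_gap = standardized_gap(test_scores["men"], test_scores["women"])
grade_gap = standardized_gap(grades["men"], grades["women"])

print(f"gap on the test: {test_gap:.2f} standard deviations")
print(f"gap in grades:   {grade_gap:.2f} standard deviations")
```

A sizable gap on the test alongside little or no gap in grades, as in this
made-up example, is the pattern that would prompt a closer look at the test.
It does not by itself establish bias, since the grades may also contain
error.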
DETECTING LACK OF FAIRNESS IN TESTING
Determining whether test results are being used fairly requires the
examination of the organizational policies that determine how the test
results are used.
In a recent case, Sharif v. New York State Education Department, the
plaintiffs charged that by using SAT scores as the sole basis for the award
of state merit scholarships, the New York State Education Department was
discriminating against girls who were competing for the awards. Although
the girls tended to have higher high school grades than the boys competing
for the scholarships, they also tended to have lower scores on the SAT,
and so received fewer of the scholarships. The State Education Department
argued that the SAT was the best objective measure available.
The SAT is intended to predict students' grades during the first year
of college and does not claim to measure the achievement of students during
high school. The stated intention of the New York scholarship program,
however, was to base its awards on high school achievement. Since the program
based its awards solely on the results of a test, the plaintiffs argued
that the process denied girls a fair opportunity to demonstrate their eligibility
for the awards. The court agreed and ruled that New York could no longer
use SAT scores alone as a basis for these scholarship awards.
Not all educational policies have as easily measured and clearly different
impacts on the genders as the policy described above. Considering whether
such policies are supported by sound educational theory or research may
be helpful in detecting possible difficulties.
UNDERSTANDING GENDER BIAS AND LACK OF FAIRNESS IN TESTING
The types of gender bias and lack of fairness described above may not
affect all individuals equally. For example, although most girls have less
experience than most boys with activities that help to develop mathematical
skills, there are many girls who do not fit this generalization. Similarly,
although most boys perform better than most girls on mathematics tests,
some boys perform poorly on such tests.
REFERENCES
Campbell, P.B. (1989) The Hidden Discriminator: Sex and Race Bias in
Educational Research. Groton, MA: Women's Educational Equity Act Program.
ERIC Document Reproduction Service No. ED 322 174.
Klein, S.S. (Ed.) (1985) Handbook for Achieving Sex Equity through Education.
Baltimore, MD: Johns Hopkins University Press. ERIC Document Reproduction
Service No. ED 290 810.
Rosser, P. (1989) The SAT Gender Gap: Identifying the Causes. Washington,
DC: Center for Women Policy Studies. ERIC Document Reproduction Service
No. ED 311 087.
Tittle, C.K. (1979) What to Do About Sex Bias in Testing. Princeton,
NJ: ERIC Clearinghouse on Tests, Measurement, and Evaluation. ERIC Document
Reproduction Service No. ED 183 628.
Sharif v. New York State Education Department, 709 F.Supp. 345 (S.D.N.Y.