by Frary, Robert B.
ERIC Digest TM 95-3 (ED 398 236) gave a few suggestions for item-writing, but only to a limited extent, due to its coverage of other aspects of test development. What follows here is a fairly comprehensive list of recommendations for writing multiple choice items. Some of these are backed up by psychometric research; i.e., it has been found that, generally, the resulting scores are more accurate indicators of each student's knowledge when the recommendations are followed than when they are violated. Other recommendations result from logical deduction.
1. Do ask questions that require more than knowledge of facts. For example, a question might require selection of the best answer when all of the options contain elements of correctness. Such questions tend to be more difficult and discriminating than questions that merely ask for a fact. Justifying the "bestness" of the keyed option may be as challenging to the instructor as the item was to the students, but, after all, isn't challenging students and responding to their challenges a big part of what being a teacher is all about?
2. Don't offer superfluous information as an introduction to a question, for example, "The presence and association of the male seems to have profound effects on female physiology in domestic animals. Research has shown that in cattle presence of a bull has the following effect:" This approach probably represents an unconscious effort to continue teaching while testing and is not likely to be appreciated by the students, who would prefer direct questions and less to read. The stem just quoted could be condensed to "Research has shown that the presence of a bull has which of the following effects on cows?" (17 words versus 32).
3. Don't ask a question that begins, "Which of the following is true [or false]?" followed by a collection of unrelated options. Each test question should focus on some specific aspect of the course. Therefore, it's OK to use items that begin, "Which of the following is true [or false] concerning X?" followed by options all pertaining to X. However, this construction should be used sparingly if there is a tendency to resort to trivial reasons for falseness or an opposite tendency to offer options that are too obviously true. A few true-false questions (in among the multiple-choice questions) may forestall these problems. The options would be: "1) True 2) False".
4. Don't use items like the following:
What is (are) the capital(s) of Bolivia?
A. La Paz
1) A only
2) B only
3) C only
Research on this item type has consistently shown it to be easier and less discriminating than items with distinct options. In the example above, one only needs to remember that Bolivia has two capitals to be assured of answering correctly. This problem can be alleviated by offering all possible combinations of the three basic options, namely:
1) A only, 2) B only, 3) C only, 4) A and B, 5) A and C, 6) B and C, 7) A, B, and C, 8) None of the above.
However, due to its complexity, initial use of this adaptation should be limited.
5. Do ask questions with varying numbers of options. There is no psychometric advantage to having a uniform number, especially if doing so results in options that are so implausible that no one or almost no one marks them. In fact, some valid and important questions demand only two or three options, e.g., "If drug X is administered, body temperature will probably: 1) increase, 2) stay about the same, 3) decrease."
6. Don't put negative options following a negative stem. Empirically (or statistically) such items may appear to perform adequately, but this is probably only because brighter students who naturally tend to get higher scores are also better able to cope with the logical complexity of a double negative.
7. Don't use "all of the above." Recognition of one wrong option eliminates "all of the above," and recognition of two right options identifies it as the answer, even if the other options are completely unknown to the student. Probably some instructors use items with "all of the above" as yet another way of extending their teaching into the test (see 2 above). It just seems so good to have the students affirm, say, all of the major causes of some phenomenon. With this approach, "all of the above" is the answer to almost every item containing it, and the students soon figure this out.
8. Do ask questions with "none of the above" as the final option, especially if the answer requires computation. Its use makes the question harder and more discriminating, because the uncertain student cannot focus on a set of options that must contain the answer. Of course, "none of the above" cannot be used if the question requires selection of the best answer, and it should not be used following a negative stem. Also, "none of the above" should be the answer to a reasonable proportion of the questions containing it.
9. Don't include superfluous information in the options. The reasons given for 2 above apply. In addition, as another manifestation of the desire to teach while testing, the additional information is likely to appear on the correct answer: "1) W, 2) X, 3) Y, because ...., 4) Z." Students are very sensitive to this tendency and take advantage of it.
10. Don't use specific determiners in distractors. Sometimes in a desperate effort to produce another, often unneeded, distractor (see 5 above), a statement is made incorrect by the inclusion of words like "all" or "never," e.g., "All humans have 46 chromosomes." Students learn to classify such statements as distractors when otherwise ignorant.
11. Don't repeat wording from the stem in the correct option. Again, an ignorant student will take advantage of this practice.
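The full set of combinations described under recommendation 4 can be generated mechanically rather than typed by hand. The following Python sketch is only an illustration: the function name `combination_options` and the exact wording of each option are assumptions, chosen to reproduce the eight-option list given under recommendation 4.

```python
from itertools import combinations


def combination_options(basic_labels):
    """Return every non-empty combination of the basic options,
    followed by 'None of the above' (hypothetical helper)."""
    options = []
    for size in range(1, len(basic_labels) + 1):
        for combo in combinations(basic_labels, size):
            if len(combo) == 1:
                options.append(f"{combo[0]} only")
            elif len(combo) == 2:
                options.append(f"{combo[0]} and {combo[1]}")
            else:
                # Three or more labels: "A, B, and C"
                options.append(", ".join(combo[:-1]) + f", and {combo[-1]}")
    options.append("None of the above")
    return options


numbered = [
    f"{i}) {text}"
    for i, text in enumerate(combination_options(["A", "B", "C"]), start=1)
]
print(", ".join(numbered))
```

With three basic options there are 2^3 - 1 = 7 non-empty combinations, so the list has 8 entries once "None of the above" is added. A fourth basic option would raise the total to 16, which is one more reason to use this format sparingly.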
ERRORS TO AVOID
Most violations of the recommendations given thus far should not be classified as outright errors, but, instead, perhaps, as lapses of judgment. And, as almost all rules have exceptions, there are probably circumstances where some of 1-11 above would not hold. However, there are three not-too-common item-writing/test-preparation errors that represent nothing less than negligence. They are now mentioned to encourage careful preparation and proofreading of tests:
Typos. These are more likely to appear in distractors than in the stem and the correct answer, which get more scrutiny from the test preparer. Students easily become aware of this tendency if it is present.
Grammatical inconsistency between stem and options. Almost always, the stem and the correct answer are grammatically consistent, but distractors, often produced as afterthoughts, may not mesh properly with the stem. Again, students quickly learn to take advantage of this foible.
Overlapping distractors. For example: "Due to budget cutbacks, the university library now subscribes to fewer than (?) periodicals. 1) 25,000 2) 20,000 3) 15,000 4) 10,000"
If any of these options is true, every larger figure is true as well, so option 1 cannot be wrong. Perhaps surprisingly, not all students "catch on" to items like this, but many do. Worse yet, the instructor might indicate option 2 as the correct answer.
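The overlap in the library item can be checked with a few lines of Python. This is a minimal sketch, not a general item-checking tool: the function name is hypothetical, and the subscription count of 18,000 is an invented value chosen so that option 2 would be the instructor's intended answer.

```python
def true_options_for_fewer_than(thresholds, actual_count):
    """Return each threshold T for which 'fewer than T' is a true statement."""
    return [t for t in thresholds if actual_count < t]


# Options 1-4 from the library item, in the order presented.
thresholds = [25_000, 20_000, 15_000, 10_000]

# Hypothetical actual number of subscriptions (an assumption, not from the text).
actual = 18_000

true_options = true_options_for_fewer_than(thresholds, actual)
print(true_options)

# An item with distinct options would have exactly one true option.
print("overlapping" if len(true_options) > 1 else "distinct")
```

Whatever count is assumed, the largest threshold is true whenever any option is true, which is why the testwise student can mark option 1 without knowing the subject.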
Finally, we consider an item-writing foible reported by Smith (1982). What option would you select among the following (stem omitted)?
1) Abraham Lincoln 3) Stephen A. Douglas
2) Robert E. Lee
The testwise but ignorant student will select Lincoln because it represents the intersection of two categories of prominent nineteenth century people, namely, presidents and men associated with the Civil War. Try this one:
1) before breakfast 3) on a full stomach
2) with meals
Three options have to do with eating, and two with the time of day. Only one relates to both. Unfortunately, some item writers consciously or unconsciously construct items of this type with the intersection invariably the correct answer.
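The guessing strategy just described can be written as a short Python sketch, which may help an item writer check a draft item for this weakness. Several things here are assumptions: the function name `convergence_guess`, the category labels, and the fourth option "at bedtime", which is a hypothetical stand-in because only three options appear in the listing above.

```python
from collections import Counter


def convergence_guess(option_categories):
    """Return the option whose categories are shared by the most options.

    option_categories maps each option's text to the set of categories it
    belongs to. Ties go to the option listed first.
    """
    # Count how many options fall into each category.
    category_counts = Counter(
        category
        for categories in option_categories.values()
        for category in categories
    )
    # Score each option by summing the counts of its categories.
    return max(
        option_categories,
        key=lambda option: sum(category_counts[c] for c in option_categories[option]),
    )


options = {
    "before breakfast": {"eating", "time of day"},
    "with meals": {"eating"},
    "on a full stomach": {"eating"},
    "at bedtime": {"time of day"},  # hypothetical fourth option
}
print(convergence_guess(options))
```

If the option this sketch returns is also the keyed answer on most items of a test, the test rewards the strategy rather than knowledge of the material.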
This Digest was adapted with permission from "Testing Memo 10: Some Multiple-choice Item Writing Do's And Don'ts," Office of Measurement and Research Services, Virginia Polytechnic Institute and State University, Blacksburg, VA 24060.
Airasian, P. (1994). "Classroom Assessment," Second Edition, NY: McGraw-Hill.
Brown, F. (1983). "Principles of Educational and Psychological Testing," Third edition, NY: Holt, Rinehart & Winston. Chapter 11.
Cangelosi, J. (1990). "Designing Tests for Evaluating Student Achievement." NY: Longman.
Gronlund, N. (1993). "How to Make Achievement Tests and Assessments," 5th edition, MA: Allyn and Bacon.
Haladyna, T.M. & Downing, S.M. (1989). Validity of a Taxonomy of Multiple-Choice Item-Writing Rules. "Applied Measurement in Education," 2 (1), 51-78.
Roid, G.H. & Haladyna, T.M. (1980). The emergence of an item writing technology. "Review of Educational Research," 49, 252-279.
Smith, J.K. (1982). Converging on correct answers: A peculiarity of multiple-choice items. "Journal of Educational Measurement," 19, 211-220.
Wesman, A.G. (1971). Writing the test item. In R.L. Thorndike (Ed.) "Educational Measurement" (1st ed, pp 99-111). Washington, DC: American Council on Education.