Simulated Oral Proficiency Interviews: An Update. ERIC Digest. 

by Stansfield, Charles W. - Kenyon, Dorry 

The Simulated Oral Proficiency Interview (SOPI) is a performance-based speaking test that emulates the Oral Proficiency Interview (OPI) as closely as is practical in a tape-recorded format. The face-to-face OPI is used by government agencies belonging to the Interagency Language Roundtable (ILR) and by the American Council on the Teaching of Foreign Languages (ACTFL) to assess general speaking proficiency in a second language (Liskin-Gasparro, 1987).

As a semi-direct test, the SOPI elicits speech by means of a tape recording and printed test booklet. A semi-direct test can employ a variety of item formats. These may include techniques such as spoken pattern practice in response to cues in the test booklet or on tape, reading aloud, sentence repetition, sentence completion, naming nouns or verbs depicted through line drawings in the test booklet, describing a single picture, or describing a picture sequence (Clark & Swinton, 1979). However, many of these semi-direct elicitation techniques are inherently different from the relatively authentic, context-based techniques that are found in the SOPI.

Although the SOPI format is adaptable, the prototypical SOPI, developed by the Center for Applied Linguistics (CAL), consists of several parts. The test begins with simple personal background questions posed on the tape in a simulated initial encounter with a native speaker of the target language. During a brief pause, the examinee records a short answer to each question. This is analogous to the "warm-up" phase of the OPI and is designed to ease examinees into the testing format.

The rest of the test contains performance-based tasks designed to elicit language similar to that elicited during the "level check" and "probe" phases of the OPI. These tasks assess the examinee's ability to perform the various functions that characterize the Intermediate, Advanced, and Superior levels of the "ACTFL Proficiency Guidelines" (for information on the "ACTFL Guidelines," see Stansfield, 1992). Picture-based tasks may require examinees to ask questions about pictures; give directions to someone using a map; describe a particular place based on a drawing; or narrate a sequence of events in the present, past, or future using drawings in the test booklet as a guide. Other tasks require examinees to speak on selected topics and perform in real-life situations. These topic- and situation-based tasks assess the examinee's ability to handle the functions and content that characterize the higher levels of the "ACTFL Guidelines" and may include speaking functions such as stating advantages and disadvantages, supporting an opinion, apologizing, or giving an informal talk. For lower-level examinees, whose ability would be greatly challenged by the demands of such tasks, the test may appropriately be ended midway.

The SOPI consists of a master test tape, which contains all test instructions and test items, and an examinee response tape, which is used to record the examinee's responses. The tapes are accompanied by a test booklet, which contains the test instructions and test tasks. Directions for all tasks are presented in English, both in the test booklet and on the test tape. These directions describe the context of the speaking task, including whom the examinee is addressing, what the situation is, why the speaking task needs to be performed, and any other relevant information needed to make the task as authentic as possible. After reading and hearing these directions, examinees are given a brief pause to organize their thoughts. Next, a native speaker of the target language makes a statement or asks a question appropriate to the situation described in the English directions. The examinee attempts to perform the indicated task by responding to the native speaker with a rejoinder natural for the situation. The prototypical SOPI may end with a wind-down consisting of easy questions in the target language that aim to put the examinee at ease. After the SOPI is completed, the examinee response tape is scored by trained raters who apply the criteria of the "ACTFL Guidelines." Scores may range from Novice to Superior.


In five studies involving different test development teams and different languages, the SOPI proved to be a valid and reliable surrogate for the OPI. Clark and Li (1986) developed four forms of a SOPI in Chinese. The four forms were then administered, together with an OPI, to 32 students of Chinese at two universities. Each test was scored by two raters, and the scores on the two test types were compared statistically. The correlation between the SOPI and the OPI was .93.

Stansfield et al. (1990) reported on the development of three forms of a SOPI in Portuguese. This test and an OPI were administered to 30 adults at four institutions. Each test was scored by two raters. In this study, a correlation of .93 between the two test types was also found. In addition, the SOPI proved to be slightly more reliable and easier to rate than the OPI.

Shohamy, Gordon, Kenyon, & Stansfield (1989) reported on a CAL/University of Tel Aviv project that developed and validated the "Hebrew Speaking Test." Two forms of this SOPI were developed for use at Hebrew language schools for immigrants to Israel, and two forms were developed for use in North America. The first two forms were administered to 20 foreign students at the University of Tel Aviv and the other two forms were administered to 20 students of Hebrew at two U.S. universities. Each group also participated in an OPI. The correlation between the OPI and the Israeli version of the SOPI was .89, while the correlation for the U.S. version was .94.

Subsequently, Stansfield and Kenyon reported on the development and validation of SOPIs in Indonesian (1992) and Hausa (1993). In the Indonesian study, the correlation with the OPI for 16 adult learners was .95. Because no ACTFL- or ILR-certified interviewers were available for Hausa, it was not possible to administer an OPI to subjects who took the "Hausa Speaking Test." However, two Hausa speakers were trained to use the ACTFL scale and subsequently applied it in scoring the test tapes. The raters showed high interrater reliability (.91) in scoring the test and indicated that they believed it elicited an adequate sample of language from which to assign a rating.
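
For readers unfamiliar with how figures such as the SOPI-OPI correlations or the interrater reliability above are obtained, the following is a minimal sketch of one common approach, a Pearson product-moment correlation between two sets of ratings. It rests on assumptions not stated in the studies summarized here: that the statistic is a Pearson coefficient, and that proficiency levels have been converted to numbers. The function name, the numeric coding, and the ratings are all hypothetical and are not data from any of the studies.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical ratings for six examinees, with proficiency levels coded as
# integers (an assumed coding for illustration, not the studies' own).
sopi_ratings = [1, 2, 2, 3, 4, 5]
opi_ratings = [1, 2, 3, 3, 4, 5]

r = pearson_r(sopi_ratings, opi_ratings)
print(f"SOPI-OPI correlation: {r:.2f}")
```

The same function could be applied to two raters' scores on the same set of tapes to estimate interrater reliability, again on the assumption that a simple correlation is the statistic of interest.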


Additional SOPIs for Arabic, Japanese, French, German, and Spanish have been developed. As these tests have been operationalized, the need for trained raters to score them has been addressed through live rater training workshops and the development of self-instructional rater training kits. Rater training kits are available to language instructors who would like to administer and rate the SOPI themselves. Rater training kits have been developed for Spanish, French, German, Japanese, and Chinese, with plans for a kit in Arabic. For each language, the "Rater Training Kit" consists of a manual, a workbook, and a reference guide for scoring; three training cassette tapes; and the SOPI testing materials. Research on the self-instructional rater training kits suggests they are an effective way to acquire rating skills without participating in live rater training (Kenyon & Stansfield, 1993).


Because the SOPI format is flexible, it can be tailored to desired levels of examinee proficiency and to specific examinee age groups, backgrounds, and professions. For several of the SOPIs developed by CAL, a lower-level version of the test can be created by administering only the first part. Such a version is suitable for rating proficiency from the Novice-Mid to Advanced levels.

The SOPI format has been used by various institutions in the development of tests to meet their specific needs. For example, the University of Minnesota and the Minnesota Department of Education developed a SOPI in which seven tasks are combined to form one integrated story line. As this test is designed to focus on examinees who are at the Novice-High and Intermediate-Low levels, it consists solely of Intermediate-level tasks.

Another SOPI with a specific focus is the Texas Oral Proficiency Test (TOPT) (Stansfield & Kenyon, 1991), developed by CAL. A score of Advanced on the TOPT is required of all teachers seeking certification in Texas in Spanish, French, or bilingual education. This full-length test, consisting of 15 tasks, is taken by examinees who are generally at the Intermediate-Mid level or higher. Practice tests are available for the French and Spanish TOPT.


Any teacher, aide, or language lab technician can administer the SOPI. This may be especially useful in locations where a trained interviewer is not available. In addition, the SOPI can be administered simultaneously to a group of examinees by a single administrator, whereas a live interview must be administered individually. Thus, the SOPI may be preferable when many examinees need to be tested within a short span of time.

The SOPI may also offer psychometric advantages in terms of validity and reliability, particularly when there is a need to ensure a standardized testing procedure. The SOPI offers the same quality of interview to each examinee. By recording the test for later scoring, it is possible to ensure that examinees will be rated by the most reliable raters and can be rated under controlled conditions. Raters who have scored both live interviews and SOPIs report that it is often easier to assign a rating to a SOPI performance. In part, this may be because the SOPI can produce a longer speech sample and because each examinee is given the same questions. Thus, distinctions in proficiency may appear more salient to the rater.


The above discussion suggests that the SOPI may offer certain practical and psychometric advantages over a face-to-face interview. Thus, it may be useful to consider the circumstances that motivate the selection of one format or the other. For example, if scores are to be used for placement or diagnosis in an instructional program and a competent interviewer is available, it would seem preferable to administer an OPI. In such a situation, an error in placement can be easily corrected. Similarly, an OPI administered by a competent interviewer may sometimes be preferable for program evaluation purposes because it can provide qualitative information and the score will not have important repercussions for the examinee. On the other hand, if the test is to have important consequences, is to be used for research, or is needed to test a large group of examinees, it may be preferable to administer a SOPI. This is because of the advantages the SOPI can provide in ease of administration and in controlling the reliability of the scoring and the quality of the elicitation procedure.


Clark, J.L.D., & Li, Y.C. (1986). "Development, validation, and dissemination of a proficiency-based test of speaking ability in Chinese and an associated assessment model for other less commonly taught languages." Washington, DC: Center for Applied Linguistics.

Clark, J.L.D., & Swinton, S.S. (1979). "Exploration of speaking proficiency measures in the TOEFL context" (TOEFL Research Rep. No. 4). Princeton, NJ: Educational Testing Service.

Kenyon, D., & Stansfield, C.W. (1993). "Evaluating the efficacy of rater self-training." Washington, DC: Center for Applied Linguistics.

Liskin-Gasparro, J. (1987). "Testing and teaching for oral proficiency." Boston, MA: Heinle and Heinle.

Shohamy, E., Gordon, C., Kenyon, D.M., & Stansfield, C.W. (1989). The development and validation of a semi-direct test for assessing oral proficiency in Hebrew. "Bulletin of Hebrew Higher Education," 4, 4-9.

Stansfield, C.W. (1992). "ACTFL Speaking Proficiency Guidelines. ERIC Digest." Washington, DC: ERIC Clearinghouse on Languages and Linguistics.

Stansfield, C.W., & Kenyon, D.M. (1991). "Development of the Texas Oral Proficiency Test (TOPT)." Final Report. Washington, DC: Center for Applied Linguistics.

Stansfield, C.W., & Kenyon, D.M. (1992). The development and validation of a simulated oral proficiency interview. "Modern Language Journal," 76, 129-141.

Stansfield, C.W., & Kenyon, D.M. (1993). Development and validation of the Hausa Speaking Test with the ACTFL Proficiency Guidelines. "Issues in Applied Linguistics," 4, 5-31.

Stansfield, C.W., Kenyon, D.M., Paiva, D., Doyle, F., Ulsh, I., & Cowles, M.A. (1990). Development and validation of the Portuguese Speaking Test. "Hispania," 73, 641-651.


