Influence of Type of Judge, Normative Information, and Discussion on Standards Recommended for the National Teacher Examinations |
| |
Authors: | John Christian Busch; Richard M. Jaeger |
| |
Institution: | University of North Carolina at Greensboro |
| |
Abstract: | There are few empirical investigations of the consequences of using widely recommended data collection procedures in conjunction with a specific standard-setting method such as the Angoff (1971) procedure. Such recommendations include the use of several types of judges, the provision of normative information on examinees' test performance, and the opportunity to discuss and reconsider initial recommendations in an iterative standard-setting procedure. This study of 236 expert judges investigated the effects of using these recommended procedures on (a) average recommended test standards, (b) the variability of recommended test standards, and (c) the reliability of recommended standards for seven subtests of the National Teacher Examinations Communication Skills and General Knowledge Tests. Small, but sometimes statistically significant, changes in mean recommended test standards were observed when judges were allowed to reconsider their initial recommendations following review of normative information and discussion. Means for public school judges changed more than did those for college or university judges. In addition, there was a significant reduction in the within-group variability of standards recommended for several subtests. Methods for estimating the reliability of recommended test standards proposed by Kane and Wilson (1984) were applied, and their hypothesis of positive covariation between empirical item difficulties and mean recommended standards was confirmed. The data collection procedures examined in this study resulted in substantial increases in the reliability of recommended test standards. |
| |
Keywords: | |