A Comparison of the Generalizability of Scores Produced by Expert Raters and Automated Scoring Systems
Abstract: The growing use of computers for test delivery, along with increased interest in performance assessments, has motivated test developers to develop automated systems for scoring complex constructed-response assessment formats. In this article, we add to the available information describing the performance of such automated scoring systems by reporting on generalizability analyses of expert ratings and computer-produced scores for a computer-delivered performance assessment of physicians' patient management skills. Two different automated scoring systems were examined. These automated systems produced scores that were approximately as generalizable as those produced by expert raters. Additional analyses also suggested that the traits assessed by the expert raters and the automated scoring systems were highly related (i.e., true correlations between test forms, across scoring methods, were approximately 1.0). In the appendix, we discuss methods for estimating this correlation, using ratings and scores produced by an automated system from a single test form.
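The true-score correlation mentioned above is conventionally estimated with the classical correction for attenuation: the observed correlation between the two scoring methods divided by the square root of the product of their reliabilities, with generalizability coefficients serving as the reliability estimates. The sketch below illustrates that calculation under a simple one-facet persons-by-forms design with simulated data; the function names, the design, and the data are assumptions for illustration only, not the authors' actual procedure (which is detailed in the article's appendix).

```python
import numpy as np

def generalizability_coefficient(scores):
    """Estimate a generalizability coefficient for a persons-by-forms score
    matrix under a one-facet random design:
    E(rho^2) = var_p / (var_p + var_pf,e / n_f)."""
    n_p, n_f = scores.shape
    person_means = scores.mean(axis=1)
    form_means = scores.mean(axis=0)
    grand_mean = scores.mean()
    # Mean squares from a two-way ANOVA without replication
    ms_p = n_f * np.sum((person_means - grand_mean) ** 2) / (n_p - 1)
    residual = scores - person_means[:, None] - form_means[None, :] + grand_mean
    ms_res = np.sum(residual ** 2) / ((n_p - 1) * (n_f - 1))
    var_p = max((ms_p - ms_res) / n_f, 0.0)  # person (universe-score) variance
    return var_p / (var_p + ms_res / n_f)

def true_score_correlation(x, y, rel_x, rel_y):
    """Disattenuate an observed correlation by the reliabilities of the two
    scoring methods (classical correction for attenuation)."""
    r_obs = np.corrcoef(x, y)[0, 1]
    return r_obs / np.sqrt(rel_x * rel_y)

# Hypothetical data: 50 examinees, 2 forms, scored by raters and by a machine
rng = np.random.default_rng(0)
ability = rng.normal(size=50)
rater_scores = ability[:, None] + rng.normal(scale=0.5, size=(50, 2))
machine_scores = ability[:, None] + rng.normal(scale=0.5, size=(50, 2))

rel_rater = generalizability_coefficient(rater_scores)
rel_machine = generalizability_coefficient(machine_scores)
r_true = true_score_correlation(rater_scores.mean(axis=1),
                                machine_scores.mean(axis=1),
                                rel_rater, rel_machine)
print(rel_rater, rel_machine, r_true)
```

Because both simulated scoring methods measure the same underlying ability, the disattenuated correlation approaches 1.0, mirroring the pattern the abstract reports for expert raters and the automated systems.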