Abstract: | Building achievement tests that are sensitive to the instructional effects of school programs is a concern for both practitioners and researchers in education. Producing such tests requires empirical procedures to guide item selection. This paper presents an operational framework and a set of empirical procedures for this task. Within this framework, item sensitivity is linked to instructional implementation. A simple components-of-variance model is used to estimate instructional sensitivity. These procedures are illustrated using data from a comparative study of alternative item formats for a criterion-referenced test. Even when items were closely matched to instructional content specifications, important differences in instructional sensitivity emerged. These differences were found both between the same items presented in different formats and between different items presented within the same format. Implications of these results for developing criterion-referenced achievement tests are discussed. |