Similar literature
Found 20 similar articles (search time: 0 ms)
1.
2.
3.
4.
A framework for the evaluation and use of automated scoring of constructed-response tasks is provided that entails both the evaluation of automated scoring and guidelines for its implementation and maintenance in the context of constantly evolving technologies. Validity issues and challenges associated with automated scoring are discussed within the framework. The fit between the scoring capability and the assessment purpose, the agreement between human and automated scores, associations with independent measures, the generalizability of automated scores as implemented in operational practice across different tasks and test forms, and the impact and consequences for the population and subgroups are proffered as integral evidence supporting the use of automated scoring. Specific evaluation guidelines are provided for using automated scoring to complement human scoring for tests used for high-stakes purposes. These guidelines are intended to generalize to new automated scoring systems and to existing systems as they change over time.

5.
6.
7.
What are the validity issues involved in automated scoring of tests? What is the nature of the interplay among construct definition, task design, examinee interface, tutorial, test development tools, and automated scoring and reporting?

8.
9.
Use of the Bayley Scales to characterize abilities of premature infants   Cited by: 1 (self-citations: 0, citations by others: 1)
G Ross. Child Development, 1985, 56(4): 835-842
The Bayley Scales of Infant Development were administered to 92 white, middle-class infants, half of them premature and half full-term, at 1 year of age from term to determine whether this instrument is useful in characterizing the abilities of premature infants. Although both full-term and premature infants achieved mental and motor development scores within the average range, full-term infants attained significantly higher scores on both the Mental and Motor Scales. Both groups scored significantly lower on motor than mental functioning; however, the difference was significantly greater for premature infants. As a group, premature infants also evidenced greater variability in their performance on both the Mental and Motor Scales, and they showed greater intra-individual variability in performance of motor ability. Furthermore, premature infants were less likely to succeed on items testing eye-hand coordination, imitation, and vocalization. Preselected perinatal risk variables accounted for a significant amount of variance in both mental and motor ability of premature infants.

10.
The authors compared the performance of third-grade students testing on answer sheets with those testing on machine-scored test booklets. The 1,832 students in the nationally representative sample were assigned at the campus level to complete the Stanford Achievement Test Series, Tenth Edition in 1 of 4 conditions: (a) Form A answer sheet, (b) Form A booklet, (c) Form B answer sheet, and (d) Form B booklet. After controlling for scholastic ability, no significant differences in performance on total reading, total mathematics, and total language strands were found between students using booklets and those using answer sheets. The results of this study provide no evidence to support the need to use separate test booklets with general education third-grade students. States may consider using separate answer sheets with these students to realize potential cost and schedule efficiencies.

11.
12.
13.
14.
15.
This study pioneers a Rasch scoring approach and compares it to a conventional summative approach for measuring longitudinal gains in student learning. In this methodological note, the proposed methodology is demonstrated using an example of rating scales in a student survey conducted as part of a higher education outcome assessment. Such assessments have become increasingly important worldwide for purposes of institutional accreditation and accountability to stakeholders. Data were collected in a longitudinal study that tracked the self-reported learning outcomes of individual students in the same cohort who completed the student learning experience questionnaire (SLEQ) in their first and final years. The Rasch model was employed for item calibration and latent trait estimation, together with a concurrent calibration scaling procedure incorporating a randomly equivalent group design and a single group design, to measure the gains in self-reported learning outcomes yielded by repeated measures. The sensitivity to change of Rasch scoring relative to the conventional summative scoring method was quantified by a statistical index, namely relative performance (RP). Findings indicated that Rasch scoring captured gains in learning outcomes better than the conventional summative scoring method, with RP values ranging from 3 to 17% in the cognitive, social, and value domains of the SLEQ. The Rasch scoring approach and the scaling procedure presented in the study can be readily generalised to studies using rating scales to measure change in student learning in the higher education context. The methodological innovations and contributions of this study are discussed.

16.
17.
18.
19.
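Rumination refers to the phenomenon of spontaneous, repetitive thinking that occurs after an individual experiences negative life events or when facing stressful events. Response styles theory, the stress coping model, the multidimensional model of rumination, and the grief rumination and anger rumination models each explain rumination from a different perspective. Regarding the mechanism by which it arises, goal progress theory holds that when individuals perceive that an important goal is blocked, they ruminate in order to reduce the gap between their current state and the goal. Based on their understanding of the theoretical meaning of rumination, researchers developed the Ruminative Responses Scale, which has been widely used in research. Future research should further clarify the concept of rumination and, on that basis, focus on the theoretical dimensions, characteristics, and antecedent variables of rumination among individuals in the Chinese cultural context.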

20.