Similar literature
 Found 20 similar documents (search time: 0 ms)
1.
It is observed that many sorts of difficulties may preclude the uneventful construction of tests by a computerized algorithm, such as those currently in favor in Computerized Adaptive Testing (CAT). In this essay we discuss a number of these problems, as well as some possible avenues of solution. We conclude with the development of the "testlet," a bundle of items that can be arranged either hierarchically or linearly, thus maintaining the efficiency of an adaptive test while keeping the quality control of test construction that is possible currently only with careful expert scrutiny. Performance on the separate testlets is aggregated to yield ability estimates.

2.
How does the use of computerized adaptive testing affect the performance of students from different groups? How consistent were the results of computerized adaptive and "conventional" tests? What did the students think about the test experience? What advice do the authors have for test developers and users?

3.
Educational Testing Service. A multiple-choice test item is identified as flawed if it has no single best answer. In spite of extensive quality control procedures, the administration of flawed items to test takers is inevitable. A limited set of common strategies for dealing with flawed items in conventional testing, grounded in the principle of fairness to examinees, is reexamined in the context of adaptive testing. An additional strategy, available for adaptive testing, of retesting from a pool cleansed of flawed items, is compared to the existing strategies. Retesting was found to be no practical improvement over current strategies.

4.
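III. Estimation of θ in CAT. (1) MLE (maximum likelihood estimation). Suppose an examinee with ability level θ responds to n items $X_1, X_2, \ldots, X_n$. An estimate of θ can be obtained by maximizing the likelihood function shown in Eq. (8). Let $\hat{\theta}_n$ denote the resulting estimate of θ. Clearly, $\hat{\theta}_n$ is also the maximum likelihood estimate for Eq. (9). It is known that, under certain conditions, $\hat{\theta}_n$ is asymptotically normal with mean θ and variance approximately $I_n^{-1}(\hat{\theta}_n)$. Most current CAT designs update the estimate of θ recursively after the examinee answers each new item and select the next item by the maximum-information criterion.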
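
The estimate-then-select cycle described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a two-parameter logistic (2PL) model, which the abstract does not specify, and all function names and the ability bounds of ±4 are hypothetical choices made for the example.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (the model choice is an assumption)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of one 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def mle_theta(items, responses, lo=-4.0, hi=4.0, iters=60):
    """Maximum likelihood estimate of theta, found by bisection on the score function.

    For the 2PL model the derivative of the log-likelihood is
    sum(a * (x - p)), which is strictly decreasing in theta, so its root
    is the maximizer. All-correct or all-wrong patterns have no finite
    maximizer and are clamped to the (assumed) bounds.
    """
    def score(theta):
        return sum(a * (x - p_correct(theta, a, b))
                   for (a, b), x in zip(items, responses))
    if score(lo) <= 0.0:
        return lo
    if score(hi) >= 0.0:
        return hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def standard_error(theta, items):
    """Asymptotic standard error: inverse square root of the test information."""
    return 1.0 / math.sqrt(sum(item_information(theta, a, b) for a, b in items))

def select_next_item(theta, pool, administered):
    """Index of the unadministered item with maximum information at theta."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *pool[i]))
```

In a CAT loop one would call `mle_theta` after each response and pass the result to `select_next_item`; the variance approximation $I_n^{-1}(\hat{\theta}_n)$ corresponds to the square of `standard_error`.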

5.
《教育实用测度》 (Applied Measurement in Education), 2013, 26(4): 287-304
Computerized adaptive testing, although well-grounded in psychometric theory, has had few large-scale applications in the past. This is now changing because the cost of computing has declined rapidly. As is always true at such junctures where theory is translated into practice, many practical issues arise that must now be addressed. In this article, we discuss a number of such issues and sketch out potential problems and potential solutions. Our purpose is to encourage further development of solutions to the issues presented as well as other practical issues facing measurement professionals involved with the implementation of adaptive testing.

6.
Computerized adaptive testing (CAT) has considerable advantages over traditional testing and has become the mainstream format for large-scale examinations in modern society. This paper gives a brief introduction to CAT, covering the differences between traditional testing and CAT, the principles by which CAT works, its psychometric theory and computer algorithms, and its advantages and cautions. Finally, the development of CAT in China is reviewed.

7.
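Computerized adaptive testing (CAT) is widely applied in both theory and practice. Much current CAT research falls into two paradigms: studies based on actual examinee responses and studies based on simulated response data. The steps of a CAT simulation study include model selection, item bank simulation, choice of starting point, item selection strategy, and test termination strategy. The main trends in CAT simulation research are as follows: item selection and termination strategies remain the focus of CAT research; simulation designs increasingly reflect real testing conditions; studies adopt multifactor designs; and simulation results are evaluated comprehensively from several perspectives.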

8.
The alignment between a test and the content domain it measures represents key evidence for the validation of test score inferences. Although procedures have been developed for evaluating the content alignment of linear tests, these procedures are not readily applicable to computerized adaptive tests (CATs), which require large item pools and do not use fixed test forms. This article describes the decisions made in the development of CATs that influence and might threaten content alignment. It outlines a process for evaluating alignment that is sensitive to these threats and gives an empirical example of the process.

9.
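In recent years, advances in information technology have driven rapid growth in assessment by computerized adaptive testing, and the availability of mobile technology has opened new avenues for assessment. This article designs and develops an adaptive testing system for multiple types of terminals. Its item selection engine was redesigned to address problems in existing algorithms, including high exposure rates for some items, low item bank utilization, and content balancing. The system can support formative assessment, summative assessment, and self-assessment.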

10.
Two new methods for item exposure control were proposed. In the Progressive method, as the test progresses, the influence of a random component on item selection is reduced and item information becomes increasingly prominent. In the Restricted Maximum Information method, no item is allowed to be exposed in more than a predetermined proportion of tests. Both methods were compared with six other item-selection methods (Maximum Information, One Parameter, McBride and Martin, Randomesque, Sympson and Hetter, and Random Item Selection) with regard to test precision and item exposure variables. Results showed that the Restricted method was useful to reduce maximum exposure rates and that the Progressive method reduced the number of unused items. Both did well regarding precision. Thus, a combined Progressive-Restricted method may be useful to control item exposure without a serious decrease in test precision.
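
The two selection rules in this abstract can be sketched roughly as below. The abstract gives no formulas, so the details are assumptions: the Progressive weight is taken here as a linear blend `(1 - s) * R + s * information`, with `s` the share of the test already administered and `R` a uniform random draw up to the largest available information, and the Restricted rule simply excludes items whose observed exposure rate has reached a cap `r_max`. All names are hypothetical.

```python
import random

def progressive_select(infos, administered, position, test_length, rng=random):
    """Progressive rule (assumed form): random early, information-driven late.

    infos        -- item information values at the current ability estimate
    administered -- set of item indices already given to this examinee
    position     -- number of items already administered in this test
    """
    s = position / test_length  # 0 at the start of the test, 1 at the end
    available = [i for i in range(len(infos)) if i not in administered]
    max_info = max(infos[i] for i in available)

    def weight(i):
        # Random component shrinks, information component grows, as s increases.
        return (1.0 - s) * rng.uniform(0.0, max_info) + s * infos[i]

    return max(available, key=weight)

def restricted_select(infos, administered, exposure_counts, tests_given, r_max):
    """Restricted maximum information rule (assumed form).

    Items whose exposure rate so far is at or above r_max are excluded;
    the most informative remaining item is chosen.
    """
    available = [i for i in range(len(infos)) if i not in administered]
    eligible = [i for i in available
                if tests_given == 0 or exposure_counts[i] / tests_given < r_max]
    if not eligible:  # fallback chosen for this sketch only
        eligible = available
    return max(eligible, key=lambda i: infos[i])
```

A combined rule of the kind the abstract suggests could apply the exposure filter from `restricted_select` before computing the blended weights in `progressive_select`.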

11.
Error indices (bias, standard error of estimation, and root mean squared error) obtained on different measurement scales under different test-termination rules in computerized adaptive testing (CAT) were examined. Four ability estimation methods (maximum likelihood estimation, weighted likelihood estimation, expected a posteriori, and maximum a posteriori), three measurement scales (θ, number-correct score, and ACT score), and three test-termination rules (fixed length, fixed standard error, and target information) were studied for a real and a generated item pool. The findings indicated that the amount and direction of bias, standard error of estimation, and root mean squared error obtained under different ability estimation methods were influenced both by scale transformations and by test-termination rules in a CAT environment. The implications of these effects for testing programs are discussed.
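
For orientation, the three error indices are commonly defined as below for a fixed true ability $\theta$ and $R$ replicated estimates $\hat{\theta}_r$. These are the usual textbook forms and are an assumption here; the abstract does not state the exact definitions used in the study.

```latex
\begin{align}
\mathrm{Bias} &= \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr), \\
\mathrm{SE}   &= \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \bar{\theta}\bigr)^2},
  \qquad \bar{\theta} = \frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_r, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr)^2},
  \qquad \mathrm{RMSE}^2 = \mathrm{Bias}^2 + \mathrm{SE}^2 .
\end{align}
```

With the $1/R$ normalization the decomposition in the last line holds exactly, which is why a method can show lower bias and still have a higher RMSE when its standard error is larger.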

12.
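Combining expert judgment with item response theory, this paper designs a simple, practical method for determining item difficulty in a computerized adaptive examination system. It also analyzes in detail the system's choice of test starting and stopping points, its item selection strategy, and its ability estimation method, and closes with a step-by-step example of an adaptive test. The system selects items at random according to each examinee's ability, which shortens the test and improves examination efficiency compared with traditional online examination systems.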

13.
Computerized adaptive testing (CAT) is a testing procedure that adapts an examination to an examinee's ability by administering only items of appropriate difficulty for the examinee. In this study, the authors compared Lord's flexilevel testing procedure (flexilevel CAT) with an item response theory-based CAT using Bayesian estimation of ability (Bayesian CAT). Three flexilevel CATs, which differed in test length (36, 18, and 11 items), and three Bayesian CATs were simulated; the Bayesian CATs differed from one another in the standard error of estimate (SEE) used for terminating the test (0.25, 0.10, and 0.05). Results showed that the flexilevel 36- and 18-item CATs produced ability estimates that may be considered as accurate as those of the Bayesian CAT with SEE = 0.10 and comparable to the Bayesian CAT with SEE = 0.05. The authors discuss the implications for classroom testing and for item response theory-based CAT.

14.
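This paper introduces the basic theory of item response theory (IRT) and the implementation process of computerized adaptive testing (CAT). A Web-based CAT system was developed in the Visual Studio .NET 2003 environment, with an SQL back-end database and the three-parameter logistic model as the item response model.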
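
As a point of reference, the three-parameter logistic (3PL) model named in this abstract can be written in a few lines. This is a generic sketch of the standard model, not code from the described system; the function names and the optional scaling constant `D` (default 1.0, with 1.7 a common alternative) are assumptions for illustration.

```python
import math

def p_3pl(theta, a, b, c, D=1.0):
    """3PL probability of a correct response.

    a -- discrimination, b -- difficulty, c -- pseudo-guessing (lower asymptote)
    D -- scaling constant (an assumption; 1.7 approximates the normal ogive)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def info_3pl(theta, a, b, c, D=1.0):
    """Fisher information of a 3PL item at ability theta."""
    p = p_3pl(theta, a, b, c, D)
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2
```

Setting `c = 0` reduces both functions to the two-parameter logistic case, and guessing (`c > 0`) lowers the information an item provides at a given ability.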

15.
This article describes a computerized adaptive test (CAT) based on the uniform item exposure multi-form structure (uMFS). The uMFS is a specialization of the multi-form structure (MFS) idea described by Armstrong, Jones, Berliner, and Pashley (1998). In an MFS CAT, the examinee first responds to a small fixed block of items. The items comprising that block may be unrelated to each other, or they may comprise a testlet (Wainer and Kiely, 1987). After the first block of items has been administered, adaptation takes place in the choice of the next block to be administered and subsequent blocks. The uMFS design integrates item exposure control, as well as content balancing and other test development needs, into the design of the CAT, instead of placing those activities in the online implementation. We show that it is possible to implement item exposure control, in a very thorough way, in the psychometric specifications of the item blocks.

16.
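Design of an adaptive testing system for the premium course "Modern Educational Technology" (《现代教育技术》)   Total citations: 3, self-citations: 0, citations by others: 3
Reform of assessment methods is an important part of current education and teaching reform. Building on an exposition of item response theory, this paper presents the architecture of an adaptive online testing system based on the three-parameter logistic model, and analyzes the system's item bank construction process, item selection algorithm, ability estimation algorithm, and test termination conditions. It then designs a prototype adaptive testing system, MET-CATS, for the national premium course "Modern Educational Technology" and analyzes how the system runs an adaptive test and carries out evaluation.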

17.
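Computerized adaptive multistage testing is an effective means of precisely reducing student burden, because it automatically steers students, as far as possible, toward items matched to their ability level and so saves the considerable time and effort otherwise wasted on items that are too hard or too easy. However, some computer-based testing systems currently used in China lack strong support from modern measurement techniques, and some item banks have substantial defects in the labeling and coding of knowledge content and ability dimensions, in the estimation and equating of item parameters, and in the computation and use of scores. This article briefly analyzes the basic modes, operating procedures, conditions of use, and main advantages of adaptive testing. It discusses in detail the design of a computerized adaptive multistage testing system, along with scoring methods based on the one-parameter logistic model using total test scores and on the two-parameter logistic model using response patterns. It thereby offers a new approach, from the perspective of testing science, to raising the standard of computerized adaptive testing and, in turn, helping teachers teach to students' aptitudes and reducing students' homework and examination burdens.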

18.
This study compared the properties of five methods of item exposure control within the purview of estimating examinees' abilities in a computerized adaptive testing (CAT) context. Each exposure control algorithm was incorporated into the item selection procedure and the adaptive testing progressed based on the CAT design established for this study. The merits and shortcomings of these strategies were considered under different item pool sizes and different desired maximum exposure rates and were evaluated in light of the observed maximum exposure rates, the test overlap rates, and the conditional standard errors of measurement. Each method had its advantages and disadvantages, but no one possessed all of the desired characteristics. There was a clear and logical trade-off between item exposure control and measurement precision. The Stocking and Lewis conditional multinomial procedure and, to a slightly lesser extent, the Davey and Parshall method seemed to be the most promising considering all of the factors that this study addressed.

19.
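Educational measurement technology based on cognitive diagnostic computerized adaptive testing (CD-CAT) can support students' personalized learning and helps teachers teach according to students' aptitudes. China has begun developing and using CD-CAT-based educational support systems, but a gap remains compared with other countries and regions. The future directions for educational measurement in China are to expand the educational measurement workforce, to strengthen innovative theoretical research on CD-CAT and its practical application, and to develop intelligent learning systems that are better suited to the individual and more open and flexible.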

20.
Simulations of computerized adaptive tests (CATs) were used to evaluate results yielded by four commonly used ability estimation methods: maximum likelihood estimation (MLE) and three Bayesian approaches, namely Owen's method, expected a posteriori (EAP), and maximum a posteriori. In line with the theoretical nature of the ability estimates and previous empirical research, the results showed clear distinctions between MLE and the Bayesian methods, with MLE yielding lower bias, higher standard errors, higher root mean square errors, lower fidelity, and lower administrative efficiency. Standard errors for MLE based on test information underestimated actual standard errors, whereas standard errors for the Bayesian methods based on posterior distribution standard deviations accurately estimated actual standard errors. Among the Bayesian methods, Owen's provided the worst overall results, and EAP provided the best. Using a variable starting rule in which examinees were initially classified into three broad ability groups greatly reduced the bias for the Bayesian methods, but had little effect on the results for MLE. On the basis of these results, guidelines are offered for selecting appropriate CAT ability estimation methods in different decision contexts.
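
To make the EAP estimate and its posterior-based standard error concrete, here is a minimal sketch. It assumes a two-parameter logistic model, a standard normal prior, and a simple equally spaced quadrature grid on [-4, 4]; none of these settings come from the abstract, and the function name is hypothetical.

```python
import math

def eap_estimate(items, responses, n_points=81, lo=-4.0, hi=4.0):
    """Expected a posteriori ability estimate and posterior standard deviation.

    items     -- list of (a, b) 2PL parameters (the model is an assumption)
    responses -- list of 0/1 scores, one per item
    """
    step = (hi - lo) / (n_points - 1)
    grid = [lo + k * step for k in range(n_points)]
    weights = []
    for t in grid:
        likelihood = 1.0
        for (a, b), x in zip(items, responses):
            p = 1.0 / (1.0 + math.exp(-a * (t - b)))
            likelihood *= p if x == 1 else 1.0 - p
        # Unnormalized posterior: likelihood times standard normal prior density.
        weights.append(likelihood * math.exp(-0.5 * t * t))
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)
```

Unlike the maximum likelihood estimate, this estimate stays finite for all-correct or all-wrong response patterns because the prior pulls it toward zero, which is one source of the bias differences the abstract reports.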
