Similar literature
 20 similar documents found (search time: 15 ms)
1.
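Computerized adaptive multistage testing is an effective means of precisely reducing student workload, because it automatically steers students toward items matched to their ability level and so saves the considerable time and effort otherwise wasted on items that are too hard or too easy. However, some computer-based testing systems currently used in China lack solid support from modern measurement techniques, and some item banks have substantial deficiencies in the labeling and coding of content and ability dimensions, in the estimation and equating of item parameters, and in the computation and use of scores. This article briefly analyzes the basic modes, operating procedures, conditions of use, and main advantages of adaptive testing. It then discusses in detail the design of a computerized adaptive multistage testing system, together with scoring methods based on the one-parameter logistic model (using the test total score) and on the two-parameter logistic model (using the response pattern). It thereby offers a new approach, from the perspective of testing science, to raising the quality of computerized adaptive testing and, in turn, to helping teachers teach according to students' aptitudes and reducing students' homework and examination burdens.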
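The contrast between total-score scoring and response-pattern scoring in the abstract above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the function names, the grid-search estimator, and any item parameters passed in are assumptions made for the example. It uses the standard two-parameter logistic form, P(correct) = 1 / (1 + exp(-a(theta - b))), with a = 1 giving the one-parameter case.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (a = discrimination, b = difficulty).

    Setting a = 1 for every item gives the one-parameter logistic case.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern given items as (a, b) pairs."""
    total = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


def mle_theta(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Maximum-likelihood ability estimate by grid search (an assumed, simple estimator).

    The pattern must contain at least one correct and one incorrect response;
    otherwise the estimate just sits on the edge of the grid.
    """
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))
```

Called with hypothetical items such as `[(0.5, -1.0), (1.0, 0.0), (2.0, 1.0)]`, the patterns `[1, 1, 0]` and `[0, 1, 1]` share a total score of 2 but receive different 2PL estimates. With all discriminations set to 1 they receive the same estimate, which is why the one-parameter model can be scored from the total score alone.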

2.
In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed for tests employing the unidimensional 3-PL model. The present article explores the issues associated with controlling exposure rates when a multidimensional item response theory (MIRT) model is utilized and exposure rates must be controlled conditional upon ability. This situation is complicated by the exponentially increasing number of possible ability values in multiple dimensions. The article introduces a new procedure, called the generalized Stocking-Lewis method, that controls the exposure rate for students of comparable ability as well as with respect to the overall population. A realistic simulation set compares the new method with three other approaches: Kullback-Leibler information with no exposure control, Kullback-Leibler information with unconditional Sympson-Hetter exposure control, and random item selection.

3.
Two new methods for item exposure control were proposed. In the Progressive method, as the test progresses, the influence of a random component on item selection is reduced and item information becomes increasingly prominent. In the Restricted Maximum Information method, no item is allowed to be exposed in more than a predetermined proportion of tests. Both methods were compared with six other item-selection methods (Maximum Information, One Parameter, McBride and Martin, Randomesque, Sympson and Hetter, and Random Item Selection) with regard to test precision and item exposure variables. Results showed that the Restricted method was useful for reducing maximum exposure rates and that the Progressive method reduced the number of unused items. Both did well regarding precision. Thus, a combined Progressive-Restricted method may be useful to control item exposure without a serious decrease in test precision.
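The idea behind the Progressive method, a random component whose influence shrinks as the test advances, can be sketched as follows. The abstract does not give the weighting formula, so the linear weight `position / test_length`, the 2PL information function, and all names here are assumptions for illustration only. The Restricted Maximum Information rule is not shown.

```python
import math
import random


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def progressive_select(theta, pool, administered, position, test_length, rng):
    """Pick the next item index from pool, a list of (a, b) pairs.

    position is the number of items already given. The weight on information
    grows linearly with position (an assumed schedule), so early selections
    are mostly random and late selections are mostly information-driven.
    """
    candidates = [i for i in range(len(pool)) if i not in administered]
    info = {i: item_information(theta, *pool[i]) for i in candidates}
    max_info = max(info.values())
    w = position / test_length  # 0 at the start of the test, approaching 1 at the end

    def score(i):
        # Random component scaled to the same range as the information values.
        return (1.0 - w) * rng.uniform(0.0, max_info) + w * info[i]

    return max(candidates, key=score)
```

Passing a seeded `random.Random` instance as `rng` keeps a simulation reproducible. At `position = 0` the choice is purely random among unused items, which is what spreads exposure across the pool.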

4.
This study compared the properties of five methods of item exposure control within the purview of estimating examinees' abilities in a computerized adaptive testing (CAT) context. Each exposure control algorithm was incorporated into the item selection procedure and the adaptive testing progressed based on the CAT design established for this study. The merits and shortcomings of these strategies were considered under different item pool sizes and different desired maximum exposure rates and were evaluated in light of the observed maximum exposure rates, the test overlap rates, and the conditional standard errors of measurement. Each method had its advantages and disadvantages, but no one possessed all of the desired characteristics. There was a clear and logical trade-off between item exposure control and measurement precision. The Stocking and Lewis conditional multinomial procedure and, to a slightly lesser extent, the Davey and Parshall method seemed to be the most promising considering all of the factors that this study addressed.

5.
Educational Testing Service
A multiple-choice test item is identified as flawed if it has no single best answer. In spite of extensive quality control procedures, the administration of flawed items to test takers is inevitable. A limited set of common strategies for dealing with flawed items in conventional testing, grounded in the principle of fairness to examinees, is reexamined in the context of adaptive testing. An additional strategy, available for adaptive testing, of retesting from a pool cleansed of flawed items, is compared to the existing strategies. Retesting was found to be no practical improvement over current strategies.

6.
《教育实用测度》2013,26(4):287-304
Computerized adaptive testing, although well-grounded in psychometric theory, has had few large-scale applications in the past. This is now changing because the cost of computing has declined rapidly. As is always true at such junctures where theory is translated into practice, many practical issues arise that must now be addressed. In this article, we discuss a number of such issues and sketch out potential problems and potential solutions. Our purpose is to encourage further development of solutions to the issues presented as well as other practical issues facing measurement professionals involved with the implementation of adaptive testing.

7.
The alignment between a test and the content domain it measures represents key evidence for the validation of test score inferences. Although procedures have been developed for evaluating the content alignment of linear tests, these procedures are not readily applicable to computerized adaptive tests (CATs), which require large item pools and do not use fixed test forms. This article describes the decisions made in the development of CATs that influence and might threaten content alignment. It outlines a process for evaluating alignment that is sensitive to these threats and gives an empirical example of the process.

8.
9.
The use of computerized adaptive testing algorithms for ranking items (e.g., college preferences, career choices) involves two major challenges: unacceptably high computation times (selecting from a large item pool with many dimensions) and biased results (enhanced preferences or intensified examinee responses because of repeated statements across items). To address these issues, we introduce subpool partition strategies for item selection and within-person statement exposure control procedures. Simulations showed that the multinomial method reduces computation time while maintaining measurement precision. Both the freeze and revised Sympson-Hetter online (RSHO) methods controlled the statement exposure rate; RSHO sacrificed some measurement precision but increased pool use. Furthermore, preventing a statement's repetition on consecutive items neither hindered the effectiveness of the freeze or RSHO method nor reduced measurement precision.

10.
How does the use of computerized adaptive testing affect the performance of students from different groups? How consistent were the results of computerized adaptive and “conventional” tests? What did the students think about the test experience? What advice do the authors have for test developers and users?

11.
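Repeaters have many advantages, but when the isolation between the transmitting and receiving antennas is insufficient, self-oscillation occurs and seriously affects the network. This paper therefore implements detection of the isolation between the transmitting and receiving antennas of a CDMA repeater. When the antenna isolation falls below the level the repeater requires for normal operation, the actual isolation is measured, and an adaptive isolation detection algorithm based on the measured value is proposed to eliminate self-oscillation in the CDMA repeater and improve system performance.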

12.
Small‐ and medium‐sized enterprises (SMEs) play an important role in creating a dynamic and successful European economy. Time‐poor managers in these organisations generally have fewer opportunities for training and development than their counterparts in larger organisations. As a result, different requirements are placed on training. The aim of this study was to test the principles of action learning in a virtual environment. The action‐learning programme was based on virtual working but also involved face‐to‐face workshops, thus providing a blended approach. The project was designed to be “evaluation‐led”, with evaluation progressing alongside the project from design to finalisation. The focus of this paper is on how the evaluation‐led approach unfolded. To this end, we start by explaining our research approach, then move on to an analysis of the project, and close with a discussion of the findings and of the lessons learnt. We conclude by highlighting some further research needs.

13.
Computerized adaptive testing (CAT) offers advantages that traditional testing cannot match, and it has become the mainstream approach in large-scale examinations in modern society. This paper gives a brief introduction to CAT, covering the differences between traditional testing and CAT, the principles by which CAT works, the psychometric theory and computer algorithms behind CAT, and the advantages and cautions of CAT. Finally, the development of CAT in China is reviewed.

14.
Simulations of computerized adaptive tests (CATs) were used to evaluate results yielded by four commonly used ability estimation methods: maximum likelihood estimation (MLE) and three Bayesian approaches, namely Owen's method, expected a posteriori (EAP), and maximum a posteriori. In line with the theoretical nature of the ability estimates and previous empirical research, the results showed clear distinctions between MLE and the Bayesian methods, with MLE yielding lower bias, higher standard errors, higher root mean square errors, lower fidelity, and lower administrative efficiency. Standard errors for MLE based on test information underestimated actual standard errors, whereas standard errors for the Bayesian methods based on posterior distribution standard deviations accurately estimated actual standard errors. Among the Bayesian methods, Owen's provided the worst overall results, and EAP provided the best. Using a variable starting rule in which examinees were initially classified into three broad ability groups greatly reduced the bias for the Bayesian methods, but had little effect on the results for MLE. On the basis of these results, guidelines are offered for selecting appropriate CAT ability estimation methods in different decision contexts.
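For readers unfamiliar with EAP, the sketch below shows how an expected a posteriori estimate and its posterior standard deviation (the quantity the abstract uses as the Bayesian standard error) can be computed on a quadrature grid. The 2PL response model, the standard normal prior, the grid settings, and the function names are assumptions chosen for illustration. The study's actual simulation design is not reproduced here.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_estimate(responses, items, lo=-4.0, hi=4.0, n_points=81):
    """Return (posterior mean, posterior SD) of ability on an evenly spaced grid.

    responses is a list of 0/1 scores and items a list of (a, b) pairs.
    The prior is an unnormalized standard normal (an assumption of this sketch).
    """
    grid = [lo + (hi - lo) * k / (n_points - 1) for k in range(n_points)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # prior density up to a constant
        for u, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            w *= p if u == 1 else 1.0 - p  # multiply in the likelihood
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)
```

Unlike the maximum likelihood estimate, this estimate stays finite for all-correct or all-incorrect response patterns, because the prior pulls it toward zero. That pull is also the source of the bias the abstract attributes to the Bayesian methods.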

15.
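Modern Educational Technology (《现代教育技术》), 2016, (3): 100-106
To meet the practical need to quantify word difficulty in adaptive English vocabulary testing systems, this article proposes a concrete method for quantifying the difficulty of English words along three dimensions: word frequency, word length, and the degree of correspondence between pronunciation and spelling. It illustrates the quantification process using senior high school English vocabulary. Counting the words that fall into each difficulty subinterval shows that the resulting distribution is approximately normal.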

16.
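Fan Jun (樊军). Examinations Research (《考试研究》), 2012, (4): 61-67
The sequential probability ratio test (SPRT) mode of computerized adaptive testing is suited to small-scale settings such as classroom instruction, where ordinary teachers use network technology to assess students' language learning. Its basic principle is to estimate the probabilities that an examinee answers correctly and incorrectly over successive items and then compare them against two opposing hypotheses, "mastery" and "non-mastery", to reach a decision. It can compensate for the limited range of application of IRT-based testing modes, and it can also better help teachers assess students' language ability.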
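The decision rule described above can be sketched with Wald's sequential probability ratio test. The hypothesised success probabilities (`p_master`, `p_nonmaster`), the error rates `alpha` and `beta`, and the function name are illustrative assumptions, not values taken from the article.

```python
import math


def sprt_classify(responses, p_master=0.8, p_nonmaster=0.5, alpha=0.05, beta=0.05):
    """Classify an examinee from a sequence of 0/1 responses.

    Returns (decision, items_used), where decision is "master", "non-master",
    or "undecided" if the responses run out before a boundary is crossed.
    All default parameter values are assumptions made for this sketch.
    """
    upper = math.log((1.0 - beta) / alpha)   # cross this: accept "mastery"
    lower = math.log(beta / (1.0 - alpha))   # cross this: accept "non-mastery"
    llr = 0.0                                # running log-likelihood ratio
    for n, u in enumerate(responses, start=1):
        if u == 1:
            llr += math.log(p_master / p_nonmaster)
        else:
            llr += math.log((1.0 - p_master) / (1.0 - p_nonmaster))
        if llr >= upper:
            return "master", n
        if llr <= lower:
            return "non-master", n
    return "undecided", len(responses)
```

The test stops as soon as the evidence is strong enough, so clear masters and clear non-masters answer only a handful of items. This is what makes the mode practical for small classroom tests that lack a calibrated item bank.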

17.
What is the rationale for adapting an existing testing system instead of developing your own? What are the limitations of MicroCAT? What has to be modified in order to meet local needs and to realize the potential of adaptive testing in the context of an existing testing system?

18.
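With the implementation of the new curriculum standards, the newly compiled biology syllabus emphasizes the principle of combining theory with practice, and the textbook content is marked by three "manys": many experiments, many forms of experiment, and many experimental requirements. In teaching biology experiments, ensuring that each experiment meets the syllabus requirements calls for putting more effort into experimental teaching and reinforcing experimental rules and skills.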

19.
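To improve the genetic algorithm, a new index for evaluating the degree of premature convergence of a population is proposed, and a new adaptive adjustment strategy is derived from it. Simulation results show that the method improves substantially on both the standard genetic algorithm and the standard adaptive genetic algorithm.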

20.
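The Putonghua Proficiency Test (普通话水平测试) is a national-level test and a government undertaking. To ensure its reliability and validity, its mechanisms need to be improved in the following respects: 1. establish a complete item bank and separate teaching from testing; 2. refine the scoring criteria so that scoring rests on clear standards; 3. strengthen the training and assessment of examiners; 4. strengthen pre-test tutoring and post-test review; 5. begin testing teachers at the university, secondary, and primary levels, in preparation for the later testing of civil servants.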
