Similar Literature
 Found 20 similar documents (search time: 15 ms)
1.
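A Simulation Study of the Robustness of the Rasch Model in Computerized Adaptive Testing   Total citations: 1 (self-citations: 0; citations by others: 1)
Using simulated data, this study estimated ability in computerized adaptive testing (CAT) with two models, the Rasch model and the Birnbaum model, and examined the robustness of the Rasch model in CAT by comparing the two on root mean square error (RMSE), average deviation (AD), and the correlation between ability estimates. The results show that the Rasch model still estimates examinees' ability levels fairly accurately when item discriminations are unequal, which indicates strong robustness.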
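
A minimal Python sketch of the quantities this abstract compares. It assumes that the "Birnbaum model" is the two-parameter logistic (2PL) model and that AD is the mean signed difference between estimated and true ability; the abstract defines neither, so both are labeled assumptions, and the function names are illustrative only.

```python
import math


def p_2pl(theta, a, b):
    """Two-parameter logistic model: probability of a correct response.

    theta: examinee ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_rasch(theta, b):
    """Rasch model: the 2PL with every discrimination fixed at 1."""
    return p_2pl(theta, 1.0, b)


def rmse(estimates, truths):
    """Root mean square error between estimated and true abilities."""
    n = len(truths)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / n)


def average_deviation(estimates, truths):
    """Assumption: AD is the mean signed difference (estimate minus truth)."""
    n = len(truths)
    return sum(e - t for e, t in zip(estimates, truths)) / n
```

In a simulation of this kind, responses would be generated under the 2PL with unequal `a` values, ability would be estimated under each model, and `rmse` and `average_deviation` would be computed against the known generating abilities.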

2.
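With the development of educational testing theory and the spread of computers, computerized adaptive testing has been studied and applied ever more widely. Based on item response theory, this paper uses the three-parameter logistic model to design and analyze an adaptive testing system.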
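
For orientation, the three-parameter logistic (3PL) model named above can be sketched as follows. This is the textbook form without the optional scaling constant D = 1.7, not the system described in the paper; the function names are illustrative.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response.

    a: discrimination, b: difficulty, c: lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta.

    With c = 0 this reduces to the 2PL form a^2 * P * (1 - P).
    """
    p = p_3pl(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2
```

An adaptive system typically uses the information function to decide which item to administer next at the current ability estimate.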

3.
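Computer-based tests have become widespread, but different computer-based test formats may yield biased results when measuring the same task, leading to unfairness in educational measurement and evaluation. Drawing on item response theory, this article examines whether computerized linear testing and computerized adaptive testing are equivalent in test efficiency, in the statistical characteristics of test results, and in their effects on examinees' individual psychological traits, and reports an empirical study using the "Modern Educational Technology" course for pre-service teachers as an example. The results show that examinees' scores on the two tests are comparable; computerized adaptive testing offers higher test efficiency and reliability, but the presence or absence of immediate feedback has a larger effect on examinees' test anxiety; computerized linear testing offers more reasonable content validity, and the presence or absence of immediate feedback has a smaller effect on test anxiety. The study provides a scientific analysis of whether the choice of test format in teaching evaluation is fair and reasonable, and offers a theoretical reference for test administrators choosing a format suited to the testing scenario.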

4.
Computerized adaptive testing (CAT) has unsurpassable advantages over traditional testing. It has become the mainstream in large-scale examinations in modern society. This paper gives a brief introduction to CAT, including the differences between traditional testing and CAT, the principles of how CAT works, the psychometric theory and computer algorithms behind CAT, and the advantages and cautions of CAT. Finally, the development of CAT in China is reviewed.

5.
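Fan Jun. 《考试研究》 (Examinations Research), 2012, (4): 61-67
The sequential probability ratio test (SPRT) mode of computerized adaptive testing is a testing mode suited to ordinary teachers who use network technology to assess students' language learning outcomes in small-scale testing such as classroom teaching. Its basic principle is to estimate the probabilities of an examinee's correct and incorrect answers over consecutive items and then compare them against two mutually exclusive hypotheses, "mastery" and "non-mastery", to reach the corresponding decision. On the one hand, it can compensate for the limited range of application of IRT-based testing modes; on the other hand, it can better help teachers assess students' language ability.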
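
The mastery decision described in this abstract can be illustrated with Wald's sequential probability ratio test. The sketch below is an assumption-laden illustration, not the article's procedure: the success probabilities under the two hypotheses (`p_master`, `p_nonmaster`) and the error rates (`alpha`, `beta`) are hypothetical values chosen for the example, and every item is treated as equally difficult.

```python
import math


def sprt_mastery(responses, p_master=0.8, p_nonmaster=0.5, alpha=0.05, beta=0.05):
    """Classify an examinee with Wald's sequential probability ratio test.

    responses: iterable of booleans (True = correct), in administration order.
    Returns (decision, number_of_items_used).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept "mastery" at or above this
    lower = math.log(beta / (1.0 - alpha))   # accept "non-mastery" at or below this
    llr = 0.0                                # running log-likelihood ratio
    used = 0
    for correct in responses:
        used += 1
        if correct:
            llr += math.log(p_master / p_nonmaster)
        else:
            llr += math.log((1.0 - p_master) / (1.0 - p_nonmaster))
        if llr >= upper:
            return "mastery", used
        if llr <= lower:
            return "non-mastery", used
    return "undecided", used
```

With these example settings, testing stops as soon as the evidence crosses either boundary, which is why the method suits short classroom tests: a run of consistent answers ends the test after only a few items.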

6.
《教育实用测度》 (Applied Measurement in Education), 2013, 26(4): 381-405
In recent years, there has been a large increase in the number of university applicants requesting special accommodations for university entrance exams. The Israeli National Institute for Testing and Evaluation (NITE) administers a Psychometric Entrance Test (comparable to the Scholastic Assessment Test in the United States) to assist universities in Israel in selecting undergraduates. Because universities in Israel do not permit flagging of candidates receiving special testing accommodations, such scores are treated as identical to scores attained under regular testing conditions. The increase in the number of students receiving testing accommodations and the prohibition of flagging have brought into focus certain psychometric issues pertaining to the fairness of testing students with disabilities and the comparability of special and standard testing conditions. To address these issues, NITE has developed a computerized adaptive psychometric test for administration to examinees with disabilities. This article discusses the process of developing the computerized test and ensuring its comparability to the paper-and-pencil test. This article also presents data on the operational computerized test.

7.
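This study uses empirical methods to investigate the factors that influence the quality of test item writing. Item-writing quality depends mainly on the joint effect of teacher competence, teaching behavior, item-writing behavior, item-writing environment, and item-writing attitude; item-writing behavior is strongly and directly influenced by teaching behavior; and support from the item-writing environment strongly affects teachers' item-writing attitudes and examination results. The conclusions provide an empirical reference for improving the management of item writing.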

8.
9.
10.
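An Empirical Study of the Quality of Higher Education in Beijing   Total citations: 1 (self-citations: 0; citations by others: 1)
Based on the statistics in the Beijing Higher Education Quality Report (《北京高等教育质量报告》) and on the specific indicators currently used in China to evaluate teaching in higher education institutions, this paper studies and analyzes higher education in Beijing and its quality during the Tenth Five-Year Plan period from four aspects: faculty, funding and facilities, teaching reform, and students. The first two aspects concern the conditions needed for quality assurance, while the latter two concern quality itself more directly.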

11.
This study attempted to better define trick questions and see if students could differentiate between trick and not–trick questions. Phase 1 elicited definitions of trick questions so as to identify essential characteristics. Seven components were found. Phase 2 obtained ratings to see which components of trick questions were considered to be most crucial. The intention of the item constructor and the fact that the questions had multiple correct answers received highest ratings from students. Phase 3 presented a collection of statistics items, some of which were labeled on an a priori basis as being trick or not–trick. The analysis indicated that examinees were able to statistically differentiate between trick and not–trick items, but the difference compared to chance was small. Not–trick items were more successfully sorted than trick items, and trick items that were classified as intentional were sorted about as well as nonintentional items. Evidence seems to suggest that the concept of trickiness is not as clear as some test construction textbook authors suggest.

12.
The purpose of this study is to show the usefulness of cognitive diagnoses for remedial instruction. Cognitive diagnoses were done by an adaptive testing system using the rule-space methodology, which was developed by K. K. Tatsuoka and her associates (K. K. Tatsuoka, 1983, 1990; K. K. Tatsuoka & M. M. Tatsuoka, 1987; M. M. Tatsuoka & K. K. Tatsuoka, 1989). The results of the study strongly indicate that knowing students' knowledge states prior to remediation is very effective and that the rule-space method can effectively diagnose students' knowledge states and can point out ways for remediating their errors quickly with minimum effort. It is also found that the design of instructional units for remediation can be effectively guided by the rule-space model, because the determination of all possible knowledge states in a domain of interest, given an incidence matrix, is based on a partially ordered tree structure of knowledge states, which is equivalent to item-score patterns determined logically from the incidence matrix.

13.
The purpose of this article is to present an analytical derivation for the mathematical form of an average between-test overlap index as a function of the item exposure index, for fixed-length computerized adaptive tests (CATs). This algebraic relationship is used to investigate the simultaneous control of item exposure at both the item and test levels. The results indicate that, in fixed-length CATs, control of the average between-test overlap is achieved via the mean and variance of the item exposure rates of the items that constitute the CAT item pool. The mean of the item exposure rates is easily manipulated. Control over the variance of the item exposure rates can be achieved via the maximum item exposure rate (rmax). Therefore, item exposure control methods which implement a specification of rmax (e.g., Sympson & Hetter, 1985) provide the most direct control at both the item and test levels.
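
The link between overlap and exposure rates can be checked numerically. The sketch below is derived here from the definitions rather than taken from the article, so its notation is an assumption: N is the pool size, L the fixed test length, n the number of examinees, and r_i the exposure rate of item i. Counting the examinee pairs that share each item gives an average overlap rate of (n/(n-1)) * sum(r_i^2)/L - 1/(n-1), and for large n this approaches (N/L) * Var(r) + L/N, where L/N is the mean exposure rate.

```python
import itertools
import random


def exposure_rates(tests, pool_size):
    """Share of examinees who saw each item in the pool."""
    counts = [0] * pool_size
    for test in tests:
        for item in test:
            counts[item] += 1
    return [count / len(tests) for count in counts]


def overlap_direct(tests):
    """Average proportion of shared items over all pairs of examinees."""
    length = len(tests[0])
    pairs = list(itertools.combinations(tests, 2))
    shared = sum(len(set(a) & set(b)) for a, b in pairs)
    return shared / (len(pairs) * length)


def overlap_from_rates(rates, length, n_examinees):
    """Exact finite-sample overlap computed from exposure rates alone."""
    n = n_examinees
    sum_sq = sum(r * r for r in rates)
    return n / (n - 1) * sum_sq / length - 1.0 / (n - 1)


def overlap_large_sample(rates, length):
    """Large-sample form: (N / L) * variance of rates + mean rate (L / N)."""
    pool_size = len(rates)
    mean = sum(rates) / pool_size
    variance = sum((r - mean) ** 2 for r in rates) / pool_size
    return pool_size / length * variance + length / pool_size


random.seed(0)
# 50 simulated examinees, each given 10 distinct items from a 30-item pool.
tests = [random.sample(range(30), 10) for _ in range(50)]
rates = exposure_rates(tests, 30)
```

Because the mean exposure rate is fixed at L/N by the pool size and test length, the only remaining lever on overlap is the variance of the exposure rates, which is consistent with the abstract's point about capping the maximum rate.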

14.
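This article reports an empirical study of the effectiveness of large-scale computer-assisted English speaking tests. First, a comparison shows that scores from automated computer scoring and from teacher scoring correlate at 0.911, indicating that computer scoring can largely replace teacher scoring for direct oral test tasks. Second, using quantitative and qualitative analysis, the article examines the validity and reliability of large-scale computer-based speaking tests from the perspectives of test takers and teachers, and argues for the feasibility and overall effectiveness of computer-based speaking tests in universities.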

15.
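The computerized adaptive test of reading literacy in PISA 2018 used a three-stage adaptive design consisting of a core stage, stage 1, and stage 2. The item pool contained 245 items forming 45 test units, which were combined into several testlets for use in the different stages. In the path design, to avoid position effects, an alternative path (core stage → stage 2 → stage 1) was used in addition to the standard path (core stage → stage 1 → stage 2). The PISA 2018 reading literacy adaptive test established an item pool covering a fairly complete range of ability, reduced measurement error arising from differences among student groups, and further improved measurement validity and efficiency. However, computerized adaptive testing still needs further exploration in areas such as wider adoption and the fact that information from human-scored items cannot be used for item selection.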

16.
The top-down approach to designing a multistage test is relatively understudied in the literature and underused in research and practice. This study introduced a route-based top-down design approach that directly sets design parameters at the test level and utilizes an advanced automated test assembly algorithm seeking global optimality. The design process in this approach consists of five sub-processes: (1) route mapping, (2) setting objectives, (3) setting constraints, (4) routing error control, and (5) test assembly. Results from a simulation study confirmed that the assembly, measurement, and routing results of the top-down design eclipsed those of the bottom-up design. Additionally, the top-down design approach provided unique insights into design decisions that could be used to refine the test. Despite these advantages, applying both top-down and bottom-up approaches in a complementary manner is recommended in practice.

17.
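Addressing the negative emotions and psychological reactions that university students show in online College English speaking tests, this study randomly sampled 339 undergraduates who had taken such a test and investigated their psychological qualities. Using questionnaire surveys, case analysis, and comparative study, it explored the psychological factors that affect students' scores on the online speaking test and their effects, and then proposed an educational service model and measures for strengthening students' psychological readiness for the test.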

18.
Successful administration of computerized adaptive testing (CAT) programs in educational settings requires that test security and item exposure control issues be taken seriously. Developing an item selection algorithm that strikes the right balance between test precision and level of item pool utilization is the key to successful implementation and long-term quality control of CAT. This study proposed a new item selection method using the “efficiency balanced information” criterion to address issues with the maximum Fisher information method and stratification methods. According to the simulation results, the new efficiency balanced information method had desirable advantages over the other studied item selection methods in terms of improving the optimality of CAT assembly and utilizing items with low a-values while eliminating the need for item pool stratification.
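
To make the baseline concrete, here is a small sketch of the maximum Fisher information method that the abstract contrasts with its proposal. It does not implement the efficiency balanced information criterion, whose definition the abstract does not give. The two-parameter logistic model and the four-item pool are assumptions made for illustration.

```python
import math


def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_max_info(theta, pool, administered):
    """Return the index of the unused item with the most information at theta.

    pool: list of (a, b) pairs; administered: set of indices already used.
    """
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: info_2pl(theta, *pool[i]))


# Hypothetical pool of (discrimination a, difficulty b) pairs.
pool = [(0.6, 0.0), (1.8, 0.1), (1.0, -1.0), (1.7, 2.5)]
first = select_max_info(0.0, pool, set())
second = select_max_info(0.0, pool, {first})
```

In this example the low-discrimination item at index 0 is passed over twice even though its difficulty matches the ability estimate exactly, which illustrates why pure maximum-information selection tends to leave low a-value items underused.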

19.
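To understand the current state of English learning among ethnic minority preparatory students in universities and to improve the quality of their instruction, 105 ethnic minority preparatory students majoring in English at the School of Foreign Languages of a university in Liaoning Province were surveyed. The survey found that: (1) performance in major courses declines markedly from lower to higher years; (2) nearly 70% of the preparatory students are ability-goal-oriented learners; (3) there are also problems and deficiencies in learning self-efficacy expectations, learning strategies, and related areas. Accordingly, the following measures are proposed to improve these students' learning ability: (1) cultivate students' interest and raise self-efficacy expectations; (2) teach and train scientific, efficient learning strategies; (3) monitor classroom dynamics and consolidate the foundations of the major; (4) identify weaknesses in the major and break through bottlenecks.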

20.
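As a typical growth model, vertical scaling (also called vertical equating or vertical calibration) is commonly used to evaluate the development of examinees' academic achievement or ability. Using response data from Xinjiang ethnic minority students in grades 4 to 6 on the Chinese language tests of three academic achievement quality monitoring assessments from 2011 to 2013, this study collected data under a common-item design and constructed scale scores with the Thurstone method and IRT concurrent calibration. It linked scores across the three grades, measured growth in Chinese language achievement for grades 4 to 6 in Xinjiang bilingual classes, and provides a quantitative indicator that academic achievement monitoring can draw on.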
