Similar Literature
 20 similar documents found (search time: 62 ms)
1.
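The quality of the test items used to monitor the quality of mathematics education bears on the scientific rigor and authority of basic education quality monitoring. This paper evaluates and analyzes such items from both qualitative and quantitative perspectives and proposes evaluation standards and strategies, offering an important reference for examination reform and education quality monitoring in China.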

2.
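Drawing on the characteristics of chemistry items in the college entrance examination (gaokao) and on existing research, this study modifies Bao Jiansheng's item difficulty model, establishes the basic structure and rating procedure of a tool for quantifying the difficulty of gaokao chemistry items, and applies the tool to analyze the difficulty of the 2016 National New Curriculum Standard Papers I and II. The results show that the tool not only measures item difficulty but also pinpoints where the difficulty differences between items mainly lie, which offers a new approach to analyzing item difficulty in advance and to revising items.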

3.
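The introduction of academic quality requirements and standards provides an important basis for academic assessment and item writing grounded in disciplinary core competencies, and item contexts are an important vehicle for assessment. Taking the "Jiangsu Province 2021 New Gaokao Adaptive Examination biology test" as an example, this paper analyzes the types and functions of item contexts, providing ideas and direction for writing biology items for the new gaokao.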

4.
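Test items are a tool for measuring examinees' knowledge, intelligence, and skills. This paper first analyzes the factors that affect the quality of items in self-study examinations. It then proposes methods and techniques for item quality control at the main stages of the process: drafting the examination syllabus, deciding on item types, writing items, and evaluating item quality. On this basis it designs practical solutions for ensuring high item quality.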

5.
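This paper quantitatively analyzes the contexts of mathematics items in the senior high school entrance examination (zhongkao) and discusses item-writing strategies. The results indicate that quantitative analysis of contexts can effectively help teachers understand students' ability to comprehend and apply mathematics problems, and can thereby guide the formulation of item-writing strategies. The study is significant for improving the quality and effectiveness of zhongkao mathematics items.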

6.
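Standards for evaluating the quality of independent-admission (自主招生) test items are an important topic in current examination research. Based on a statistical analysis of the 2014 Category Q independent-admission mathematics paper of a certain university, this paper proposes data-based standards for evaluating item quality in independent admissions. They mainly cover item difficulty, discrimination, the discrimination index, and the difference in score rates between target groups. The analysis shows that overall indicators such as difficulty and discrimination have certain limitations when used to evaluate independent-admission items, and need to be combined with an analysis of score-rate differences that reflects how well the items separate the target groups.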
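The indicators named in this abstract can be illustrated with a minimal sketch from classical test theory. The abstract does not give its formulas, so everything here is an assumption: difficulty is taken as the mean score rate, the discrimination index as the score-rate difference between the top and bottom 27% of examinees by total score, and the target-group score-rate difference as the gap between a flagged group and everyone else. The function names `item_statistics` and `score_rate_gap` and the parameter `target_mask` are hypothetical.

```python
import numpy as np

def item_statistics(scores, max_points, group_fraction=0.27):
    """Classical item statistics for a persons-by-items score matrix.

    Assumed definitions (not taken from the cited paper):
    difficulty = mean score / maximum points per item;
    discrimination index = score rate of the top group minus that of
    the bottom group, with groups formed from total scores.
    """
    scores = np.asarray(scores, dtype=float)
    max_points = np.asarray(max_points, dtype=float)
    totals = scores.sum(axis=1)

    # Difficulty as the mean score rate of each item.
    difficulty = scores.mean(axis=0) / max_points

    # Rank examinees by total score and take the bottom and top groups.
    order = np.argsort(totals, kind="stable")
    k = max(1, int(round(group_fraction * len(totals))))
    low, high = scores[order[:k]], scores[order[-k:]]
    discrimination_index = (high.mean(axis=0) - low.mean(axis=0)) / max_points
    return difficulty, discrimination_index

def score_rate_gap(scores, max_points, target_mask):
    """Score-rate difference between a target group and all other examinees.

    target_mask is a hypothetical boolean flag per examinee, for example
    candidates the programme is meant to select.
    """
    scores = np.asarray(scores, dtype=float)
    max_points = np.asarray(max_points, dtype=float)
    target_mask = np.asarray(target_mask, dtype=bool)
    target_rate = scores[target_mask].mean(axis=0)
    other_rate = scores[~target_mask].mean(axis=0)
    return (target_rate - other_rate) / max_points
```

An item with a high overall discrimination index can still have a small gap for the group the admission test is meant to identify, which is the kind of limitation the abstract describes.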

7.
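Student academic assessment has long been a yardstick of educational quality, and assessment methods are held to strict requirements. Taking sample biology items from the scientific literacy domain as an example, this paper analyzes the design of items in the Programme for International Student Assessment (PISA), so as to provide a reference for item writing in China's biology academic proficiency tests and to improve the validity and reliability of assessment.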

8.
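Taking the physics experiment items of the 2023 Guangdong gaokao as an example, this paper explores item analysis based on "academic quality levels" and proposes corresponding teaching strategies in accordance with those levels.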

9.
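A comparative analysis shows that improving item reliability and improving teaching efficiency to build an efficient classroom both rest on some common principles. To interpret the characteristics of item reliability, improve it, and fully exploit the value of test items, and also to raise teaching efficiency and build an efficient classroom, this paper distills these common principles with reference to specific sample items. They are: take the curriculum standards as the foundation and the textbook as the working material; strengthen the basics and progress step by step; and keep pace with the times with a clear orientation. The aim is to help improve both item quality and teaching efficiency.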

10.
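Analysis of test items and scores plays a very important role in teaching management. The quantitative feedback it yields is of great benefit to teachers in refining their teaching, improving teaching quality, and improving the quality of item writing. Commonly used indicators in item analysis are reliability, validity, difficulty, and discrimination; commonly used indicators in score analysis are the mean score, pass rate, highest score, lowest score, and score distribution. This paper analyzes these indicators and gives corresponding programs that implement them, which facilitates teaching management.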
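The paper's own programs are not reproduced in this listing. As a rough illustration of the score indicators it names, the sketch below computes the mean, pass rate, highest and lowest score, and a binned distribution, plus Cronbach's alpha as one common reliability estimate. The pass mark of 60, the full mark of 100, the 10-point bins, and the function names are all assumptions made for the example.

```python
import numpy as np

def score_summary(total_scores, pass_mark=60, full_mark=100, bin_width=10):
    """Summary indicators for a list of total scores (assumed 0..full_mark)."""
    s = np.asarray(total_scores, dtype=float)
    # Bin edges 0, 10, ..., full_mark; the last bin includes full_mark itself.
    edges = np.arange(0, full_mark + bin_width, bin_width)
    counts, _ = np.histogram(s, bins=edges)
    labels = [f"{int(lo)}-{int(hi)}" for lo, hi in zip(edges[:-1], edges[1:])]
    return {
        "mean": float(s.mean()),
        "pass_rate": float((s >= pass_mark).mean()),
        "max": float(s.max()),
        "min": float(s.min()),
        "distribution": dict(zip(labels, counts.tolist())),
    }

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a persons-by-items score matrix."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    item_variance_sum = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

Validity is left out on purpose: it usually requires an external criterion or expert judgment and cannot be computed from the score matrix alone.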

11.
Differential linear drift of item location parameters over a 10-year period is demonstrated in data from the College Board Physics Achievement Test. The relative direction of drift is associated with the content of the items and reflects changing emphasis in the physics curricula of American secondary schools. No evidence of drift of discriminating power parameters was found. Statistical procedures for detecting, estimating, and accounting for item parameter drift in item pools for long-term testing programs are proposed.

12.
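Ye Meng, 《考试研究》 (Examinations Research), 2010, (2): 96-107
This paper reviews the main research findings on local independence in item response theory (IRT). On this basis, it explains the definition of the local independence assumption. It also discusses the relationship between local independence and test dimensionality; the detection and computation, causes, and control procedures of local dependence; and the impact of local dependence on measurement practice. Finally, it explores strategies for handling local item dependence within testlets.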

13.
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change, test level, test content, and item format. As a follow-up to the real data analysis, a simulation study was performed to assess the effect of item position change on equating. Results from this study indicate that item position change significantly affects change in RID. In addition, although the test construction procedures used in the investigated state seem to somewhat mitigate the impact of item position change, equating results might be impacted in testing programs where other test construction practices or equating methods are utilized.

14.
The effect of item parameters (discrimination, difficulty, and level of guessing) on the item-fit statistic was investigated using simulated dichotomous data. Nine tests were simulated using 1,000 persons, 50 items, three levels of item discrimination, three levels of item difficulty, and three levels of guessing. The item fit was estimated using two fit statistics: the likelihood ratio statistic (X²_B) and the standardized residuals (SRs). All the item parameters were simulated to be normally distributed. Results showed that the levels of item discrimination and guessing affected the item-fit values. As the level of item discrimination or guessing increased, item-fit values increased and more items misfit the model. The level of item difficulty did not affect the item-fit statistic.
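A data-generation step of the kind described here can be sketched with the three-parameter logistic (3PL) model. This is only an illustration under stated assumptions, not the study's procedure: the parameter means and standard deviations, the clipping bounds, the standard-normal abilities, and the name `simulate_3pl` are all hypothetical, and the fit statistics themselves are not computed.

```python
import numpy as np

def simulate_3pl(n_persons=1000, n_items=50, a_mean=1.0, b_mean=0.0,
                 c_mean=0.2, seed=0):
    """Simulate dichotomous responses under a 3PL model.

    The spreads (0.2, 1.0, 0.05) and the clipping bounds are illustrative
    assumptions; the abstract only says the parameters were normally
    distributed at three levels each.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 1.0, size=n_persons)               # abilities
    a = np.clip(rng.normal(a_mean, 0.2, size=n_items), 0.2, None)  # discrimination
    b = rng.normal(b_mean, 1.0, size=n_items)                   # difficulty
    c = np.clip(rng.normal(c_mean, 0.05, size=n_items), 0.0, 0.5)  # guessing

    # P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b))),
    # broadcast to a persons-by-items matrix.
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta[:, None] - b)))
    responses = (rng.random((n_persons, n_items)) < p).astype(int)
    return responses, theta, a, b, c
```

Calling `simulate_3pl` once per combination of parameter levels, for example with a different `a_mean` or `c_mean`, would give one simulated test per condition.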

15.
News Item
European Journal of Special Needs Education (《欧洲特需教育杂志》), 2013, 28(2): 228-231

16.
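Building an Examination Item Bank   Total citations: 1, self-citations: 0, citations by others: 1
Building an item bank is an important task in developing large-scale standing examinations. This paper discusses how to build a high-quality item bank, giving a detailed account of each important stage, including the organization of item writing, item writing, item review, pilot-test analysis, and entry into the bank. The approach is highly practical.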

17.
Studies that have investigated differences in examinee performance on items administered in paper-and-pencil form or on a computer screen have produced equivocal results. Certain item administration procedures were hypothesized to be among the most important variables causing differences in item performance and ultimately in test scores obtained from these different administration media. A study where these item administration procedures were made as identical as possible for each presentation medium is described. In addition, a methodology is presented for studying the difficulty and discrimination of items under each presentation medium as a post hoc procedure.

18.
Dodeen (2004) studied the correlation between the item parameters of the three-parameter logistic model and two item fit statistics, and found some linear relationships (e.g., a positive correlation between item discrimination parameters and item fit statistics) that have the potential for influencing the work of practitioners who employ item response theory. This article examines the same type of linear relationships as studied by Dodeen. However, this article adds to the literature by employing item fit statistics not considered by Dodeen, which have been recently suggested and whose Type I error rates have been demonstrated to be generally close to the nominal level. Detailed simulations show that if one uses certain of the recently suggested item fit statistics, there is no need to worry about any linear relationships between the item parameters and item fit statistics.

19.
Increasing use of item pools in large-scale educational assessments calls for an appropriate scaling procedure to achieve a common metric among field-tested items. The present study examines scaling procedures for developing a new item pool under a spiraled block linking design. Three scaling procedures are considered: (a) concurrent calibration, (b) separate calibration with one linking, and (c) separate calibration with three sequential linkings. Evaluation across varying sample sizes and item pool sizes suggests that calibrating an item pool simultaneously results in the most stable scaling. The separate calibration with linking procedures produced larger scaling errors as the number of linking steps increased. Haebara's item characteristic curve linking performed better than the test characteristic curve (TCC) linking method. The present article provides an analytic illustration that the test characteristic curve method may fail to find global solutions in polytomous items. Finally, comparison of the single- and mixed-format item pools suggests that the use of polytomous items as the anchor can improve the overall scaling accuracy of the item pools.
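The two linking criteria compared in this abstract can be written out for the simpler dichotomous 3PL case, as a sketch only. The notation is assumed rather than taken from the article: superscripts N and O mark anchor-item estimates on the new and old scales, A and B are the slope and intercept of the scale transformation, and the sums run over a set of ability points and over the anchor items j. The TCC criterion is the one commonly attributed to Stocking and Lord; the article's polytomous versions are not reproduced here.

```latex
% Requires amsmath. Assumed transformation of old-scale parameters:
%   a^{*} = a^{O} / A,   b^{*} = A\,b^{O} + B,   c^{*} = c^{O}
\[
P_j(\theta; a, b, c) = c + \frac{1 - c}{1 + \exp\{-a(\theta - b)\}}
\]
% Haebara: squared differences between item characteristic curves,
% summed item by item.
\[
H(A, B) = \sum_{i} \sum_{j}
  \Bigl[ P_j\bigl(\theta_i; a_j^{N}, b_j^{N}, c_j^{N}\bigr)
       - P_j\bigl(\theta_i; a_j^{O}/A,\; A\,b_j^{O} + B,\; c_j^{O}\bigr) \Bigr]^{2}
\]
% TCC (Stocking-Lord): squared difference between the summed curves.
\[
SL(A, B) = \sum_{i}
  \Bigl[ \sum_{j} P_j\bigl(\theta_i; a_j^{N}, b_j^{N}, c_j^{N}\bigr)
       - \sum_{j} P_j\bigl(\theta_i; a_j^{O}/A,\; A\,b_j^{O} + B,\; c_j^{O}\bigr) \Bigr]^{2}
\]
```

Both criteria are minimized over A and B. Because the TCC criterion sums the curves before squaring, differences on individual items can cancel, which gives some intuition for why the two methods can behave differently.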

20.
Computer packages that assist the test developer in writing items, generating tests, and building item banks are critically examined. There appears to be a lack of fully integrated software packages for item writing. Although there are many test generators, they do not really assist the test developer in checking the wording of items. Packages are available, however, for building quality item banks.
