20 similar documents found; search took 171 ms
1.
2.
王生军 《安徽广播电视大学学报》2004,(3):120-123
Item difficulty values computed with the Rasch model are independent of the examinee sample, which makes them one of an item's most important quantitative indices. Rasch item difficulties can be computed conveniently in an EXCEL spreadsheet. This paper presents the detailed computation steps and discusses how item difficulty values can be used to estimate examinees' ability levels.
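The abstract does not reproduce the Excel steps, but the simple log-odds method it alludes to can be sketched in Python (a minimal illustration, not the author's exact procedure; items answered correctly by everyone or by no one must be excluded first, since their log-odds are undefined):

```python
import math

def rasch_item_difficulty(responses):
    """Estimate Rasch item difficulties with the simple log-odds method:
    for each item take ln(wrong / right), then centre the values so the
    mean difficulty is zero. 'responses' is a person-by-item 0/1 matrix."""
    n_persons = len(responses)
    n_items = len(responses[0])
    raw = []
    for j in range(n_items):
        right = sum(row[j] for row in responses)
        wrong = n_persons - right
        raw.append(math.log(wrong / right))  # item logit (harder -> larger)
    mean = sum(raw) / n_items
    return [d - mean for d in raw]  # centred difficulty estimates
```

Because the difficulties are centred, they are expressed on the same logit scale as person abilities, which is what allows the ability estimation the paper goes on to discuss.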
3.
Every year's college entrance examination (gaokao) includes a few novel items of some difficulty that leave a deep impression on examinees; such new items become one of that year's tools for identifying top students, and serve as classic teaching examples in later years. In 2008, however, what held examinees back was not difficult new items but several familiar ones that raised no sense of unfamiliarity and seemed answerable as soon as they were read: many examinees either spent far more time than the items' point values warranted before laboriously reaching a correct result, or had to abandon them halfway and so wasted examination time.
4.
5.
《中国远程教育(综合版)》1986,(2)
The most important factor determining a test's reliability and validity is the quality of its items. Whether an item is well written depends chiefly on its difficulty and its discrimination. Difficulty is the degree of ease or hardness of an item, and it can be expressed quantitatively. In educational measurement theory, the difficulty of a selected-response item is defined as the percentage of examinees who answer it correctly, and the difficulty of a constructed-response item as the ratio of the mean score of all examinees on the item (X̄) to the item's full score (X_full), that is, P = X̄ / X_full.
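The two definitions above translate directly into code (a minimal sketch; the function names are illustrative):

```python
def difficulty_mc(correct_flags):
    """Difficulty (P value) of a selected-response item:
    the proportion of examinees answering correctly (1 = correct)."""
    return sum(correct_flags) / len(correct_flags)

def difficulty_cr(scores, full_score):
    """Difficulty of a constructed-response item:
    mean score over all examinees divided by the full score, P = mean / full."""
    return sum(scores) / len(scores) / full_score
```

Note that under this convention a *higher* P value means an *easier* item, since more examinees score well on it.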
6.
Compared with traditional measurement models, the Rasch model offers distinct advantages in test-paper quality analysis thanks to its objectivity and its interval-level (equidistant) scale. Taking the quality analysis of a Grade 6 technology-and-engineering literacy assessment for primary-school science in Nanjing as an example, this paper illustrates the application of the Rasch model to test quality analysis: overall paper quality checks, unidimensionality checks, the match between paper difficulty and student ability, item-level quality analysis, and checks of item fit and measurement error. It finds the assessment's reliability and validity to be high and its item discrimination reasonable, with the great majority of items meeting measurement expectations. In practice, analysts should choose Rasch analysis software and the corresponding model functions according to their circumstances, and once the Rasch model has flagged problem items, should interpret and handle them in light of the actual situation.
7.
8.
For the 2005 college entrance examination biology paper, the number of items, the item types, and their proportions matched previous papers set by the Ministry of Education; item difficulty was moderate and comparable to routine practice and mock examinations, with no extremely difficult items to leave examinees at a loss, achieving a smooth transition. Judged by design and quality, most items were well crafted, led by ability testing, examining examinees' mastery of basic biological knowledge and skills and their ability to apply what they had learned to analyse and solve practical problems. The items also highlighted local characteristics, reflecting the selective nature of the gaokao while seeking innovation within stability, and offering useful guidance for new-curriculum reform in secondary schools. This paper analyses the 2005 Guangdong biology paper in detail and draws some lessons, in the hope of helping with preparation for the 2006 examination.
9.
《历史教学(高校版)》1996,(9)
Document-based questions focus on testing reading comprehension (Sui Qingjun, Tianjin Hexi District Teaching Research Office). Compared with the 1995 items, the 1996 history examination declined somewhat in difficulty, and its document-based questions emphasised testing examinees' reading comprehension, reflecting the item writers' intent: to rely on basic textbook knowledge while selecting materials examinees could read, creating new contexts and posing new questions. A feature of this set of questions is that all concerned basic knowledge of major historical events in the textbook, such as the Taiping Heavenly Kingdom's foreign policy, the Great Production Movement, and the post-war…
10.
The 2000 college entrance examination tested three experimental items worth 20 points in total. Sampling analysis at the Hubei marking centre showed that the experimental items placed high demands on examinees' ability: the provincial mean on them was only 5.3 points, a difficulty value of 0.27. In particular, item 16, an electricity experiment worth 8 points, had a provincial mean of just 0.54 points and a difficulty value of 0.07, the hardest in the history of the examination. Examinees' correct and incorrect solutions to this item are reviewed below.
11.
Changes to the design and development of our educational assessments are resulting in the unprecedented demand for a large and continuous supply of content‐specific test items. One way to address this growing demand is with automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer technology. The purpose of this module is to describe and illustrate a template‐based method for generating test items. We outline a three‐step approach where test development specialists first create an item model. An item model is like a mould or rendering that highlights the features in an assessment task that must be manipulated to produce new items. Next, the content used for item generation is identified and structured. Finally, features in the item model are systematically manipulated with computer‐based algorithms to generate new items. Using this template‐based approach, hundreds or even thousands of new items can be generated with a single item model.
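The three-step template approach can be illustrated with a toy item model (the stem, the placeholders, and their value ranges here are hypothetical examples, not taken from the module):

```python
from itertools import product

def generate_items(template, variables):
    """Template-based AIG sketch: 'template' is an item model whose
    manipulable features appear as placeholders; 'variables' maps each
    placeholder to its allowed values. Every combination of values is
    substituted into the model, yielding one generated item each."""
    names = list(variables)
    items = []
    for combo in product(*(variables[n] for n in names)):
        values = dict(zip(names, combo))
        items.append(template.format(**values))
    return items

# A hypothetical item model with two manipulable features.
stem = "A train travels {distance} km in {hours} hours. What is its average speed?"
items = generate_items(stem, {"distance": [120, 180, 240], "hours": [2, 3]})
```

With 3 values for one feature and 2 for the other, the single model yields 6 items; adding features or values multiplies the count, which is how one model can generate hundreds or thousands of items.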
12.
Students rated the quality of the items on a classroom test that had been taken previously. On the same test, psychometric item indices were calculated. The results showed that the student ratings were related to the item difficulty, but not to the item-test correlation. In addition, the better-achieving students tended to rate the items as less ambiguous. Finally, the ambiguity ratings were more highly related to the item-test correlations for the better achieving students. These findings support opinions held by many instructors of students' judgments of item quality.
13.
Testing organizations need large numbers of high‐quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time‐consuming and expensive because each item is written, edited, and reviewed by a subject‐matter expert. One promising approach that may address this challenge is automatic item generation. Automatic item generation combines cognitive and psychometric modeling practices to guide the production of items that are generated with the aid of computer technology. The purpose of this study is to describe and illustrate a process that can be used to review and evaluate the quality of the generated items by focusing on the content and logic specified within the item generation procedure. We illustrate our process using an item development example from mathematics drawn from the Common Core State Standards and one from surgical education drawn from the health sciences domain.
14.
GUAN Dandan 《中国考试》2008,(7)
Subjective and objective items in fact form a continuum, along which "objectifying subjective items" and "subjectifying objective items" approach each other indefinitely; the latter offers useful lessons for educational testing. Taking papers from China's college entrance examination and postgraduate admission examination as examples, this article discusses how to set the proportion of subjective to objective items. The organic combination of the two reflects a convergence of testing philosophies across countries. The design of item types depends not only on the measurement objectives but also on the characteristics of the subject, and develops as understanding deepens.
15.
Mark J. Gierl Dianne Henderson Michael Jodoin Don Klinger 《Journal of Experimental Education》2013,81(3):261-279
In test development, item response theory (IRT) is a method to determine the amount of information that each item (i.e., item information function) and combination of items (i.e., test information function) provide in the estimation of an examinee's ability. Studies investigating the effects of item parameter estimation errors over a range of ability have demonstrated an overestimation of information when the most discriminating items are selected (i.e., item selection based on maximum information). In the present study, the authors examined the influence of item parameter estimation errors across 3 item selection methods—maximum no target, maximum target, and theta maximum—using the 2- and 3-parameter logistic IRT models. Tests created with the maximum no target and maximum target item selection procedures consistently overestimated the test information function. Conversely, tests created using the theta maximum item selection procedure yielded more consistent estimates of the test information function and, at times, underestimated the test information function. Implications for test development are discussed.
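The item information functions referred to above have standard closed forms. A sketch for the 3PL model follows (it reduces to the 2PL case when the guessing parameter c is 0); the parameterisation is the usual logistic one, not necessarily the exact scaling used in the study:

```python
import math

def info_3pl(theta, a, b, c=0.0):
    """Item information function for the 3PL IRT model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    I(theta) = a^2 * ((P - c) / (1 - c))^2 * (Q / P), with Q = 1 - P.
    With c = 0 this simplifies to the 2PL form I = a^2 * P * Q."""
    p = c + (1 - c) / (1 + math.exp(-a * (theta - b)))
    q = 1 - p
    return a * a * ((p - c) / (1 - c)) ** 2 * (q / p)
```

Because information is proportional to a squared, selecting items on maximum information favours the most discriminating items, which is exactly why estimation error in a inflates the test information function, the phenomenon the study examines.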
16.
We propose a structural equation model, which reduces to a multidimensional latent class item response theory model, for the analysis of binary item responses with nonignorable missingness. The missingness mechanism is driven by 2 sets of latent variables: one describing the propensity to respond and the other referred to the abilities measured by the test items. These latent variables are assumed to have a discrete distribution, so as to reduce the number of parametric assumptions regarding the latent structure of the model. Individual covariates can also be included through a multinomial logistic parameterization for the distribution of the latent variables. Given the discrete nature of this distribution, the proposed model is efficiently estimated by the expectation–maximization algorithm. A simulation study is performed to evaluate the finite-sample properties of the parameter estimates. Moreover, an application is illustrated with data coming from a student entry test for the admission to some university courses.
17.
Nonparametric item response theory models include the monotone homogeneity model and the double monotonicity model. Applying the monotone homogeneity model to the results of an English listening test, 11 of the 16 listening items were found, by sequential item selection, to satisfy the requirements and to form a unidimensional scale. Ranking examinees by their total score on these 11 items is equivalent to ranking them by the latent trait. A differential item functioning study of the 11-item scale using the double monotonicity model found that 5 items ranked differently in the female subgroup than in the male subgroup and in the whole group, with females showing a markedly higher probability of a correct response than males. This difference is at least partly attributable to a difference in listening ability between the two subgroups.
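Item selection under the monotone homogeneity model typically relies on Loevinger's H scalability coefficient. A minimal implementation for binary items follows (an illustrative sketch, not the authors' software; the common rule of thumb retains item sets with H of at least 0.3):

```python
def loevinger_h(responses):
    """Loevinger's H for a set of binary items (Mokken scaling sketch):
    the sum of inter-item covariances divided by the sum of their maxima
    under a perfect Guttman pattern given the item marginals. H = 1 for
    perfectly Guttman-ordered data; higher H means a more scalable set."""
    n = len(responses)
    k = len(responses[0])
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    cov_sum = covmax_sum = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            p_both = sum(row[i] * row[j] for row in responses) / n
            cov_sum += p_both - p[i] * p[j]          # observed covariance
            lo, hi = sorted((p[i], p[j]))
            covmax_sum += lo * (1 - hi)              # Guttman maximum
    return cov_sum / covmax_sum
```

Sequential item selection of the kind the study describes repeatedly adds the candidate item that keeps the scale's H (and each item's H_i) above the chosen bound.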
18.
Theory and Techniques of Test Item Writing (II)  Total citations: 1 (self-citations: 0, citations by others: 1)
Item writing for large-scale educational examinations rests on certain theoretical assumptions from psychology. An item definition consistent with these assumptions requires three elements: a measurement objective, a stimulus context, and a question; an item missing any one of them is incomplete. Based on these assumptions and on this definition and its elements, this article discusses the basic requirements for writing objective and subjective items: for objective items, requirements on the stem, on the design of the options, and on the number of options; for subjective items, the choice of context material, the framing of the question, the allocation of points, and the construction of scoring rubrics.
19.
This article addresses the issue of how to detect item preknowledge using item response time data in two computer‐based large‐scale licensure examinations. Item preknowledge is indicated by an unexpected short response time and a correct response. Two samples were used for detecting item preknowledge for each examination. The first sample was from the early stage of the operational test and was used for item calibration. The second sample was from the late stage of the operational test, which may feature item preknowledge. The purpose of this research was to explore whether there was evidence of item preknowledge and compromised items in the second sample using the parameters estimated from the first sample. The results showed that for one nonadaptive operational examination, two items (of 111) were potentially exposed, and two candidates (of 1,172) showed some indications of preknowledge on multiple items. For another licensure examination that featured computerized adaptive testing, there was no indication of item preknowledge or compromised items. Implications for detected aberrant examinees and compromised items are discussed in the article.
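The detection logic, flagging responses that are both correct and unexpectedly fast, can be sketched as follows (treating log response times as approximately normal within an item and using a fixed z cut-off are illustrative assumptions, not the study's exact model, which calibrates on an early-stage sample):

```python
import math
import statistics

def flag_preknowledge(times, corrects, z_cut=-2.0):
    """Screen one item for possible preknowledge: standardise the log
    response times across examinees, then flag an examinee whose response
    is both correct (1) and unexpectedly fast (standardised log time
    below z_cut). Returns one boolean flag per examinee."""
    logs = [math.log(t) for t in times]
    mu = statistics.mean(logs)
    sd = statistics.stdev(logs)
    flags = []
    for log_t, right in zip(logs, corrects):
        z = (log_t - mu) / sd
        flags.append(right == 1 and z < z_cut)
    return flags
```

In the two-sample design the article describes, the mean and spread of log times would be estimated from the calibration sample and then applied to the later sample, rather than recomputed on the screened examinees themselves as in this self-contained sketch.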
20.
Difficulty is not an inherent property of an item but the outcome of interaction between examinee factors and item characteristics. Many item analysts tend to attribute an item's excessive difficulty solely to students' failure to master the relevant knowledge or skills, overlooking the characteristics of the item itself. This study analyses 60 college entrance examination English items with difficulty values below 0.6 to explore the sources of their difficulty. The results show that, beyond examinee factors, the difficulty of hard or overly hard items is also linked to item-writing technique, for example problems with the uniqueness and acceptability of the key, content beyond the syllabus, and ill-chosen testing points or scoring criteria. Testing agencies should therefore raise their item-writing standards and strengthen item quality control, ensuring that large-scale examinations select talent scientifically.