Similar Literature
Found 20 similar documents (search time: 0 ms)
1.
2.
Traditional methods estimate item difficulty a posteriori from examinees' responses, which places high demands on sample size, while estimating difficulty from the item itself lacks a unified standard. This paper proposes a method for establishing an item-difficulty baseline: the factors influencing difficulty are analyzed, a baseline is built from the item's structure, and a Bayesian method is then applied to correct the difficulty estimate a posteriori, making the estimate more generalizable.
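The abstract above describes correcting a structure-based difficulty baseline with a Bayesian posterior update but gives no formulas. A minimal sketch of that idea in Python, assuming the baseline is encoded as a Beta prior over the probability of an incorrect response; the prior-strength weight and the update rule are illustrative assumptions, not taken from the paper:

```python
# Hedged sketch: posterior update of an item's difficulty estimate.
# Assumption: the structure-based baseline difficulty is a Beta prior
# over P(incorrect response), updated with observed response counts.

def update_difficulty(baseline_difficulty, prior_strength, n_wrong, n_total):
    """Posterior mean difficulty under a Beta-Binomial model.

    baseline_difficulty: prior estimate in [0, 1] from item structure
    prior_strength: pseudo-count weight given to the baseline (assumed)
    n_wrong, n_total: observed incorrect responses out of total attempts
    """
    alpha = baseline_difficulty * prior_strength + n_wrong
    beta = (1 - baseline_difficulty) * prior_strength + (n_total - n_wrong)
    return alpha / (alpha + beta)

# With a small sample, the estimate moves only slightly from the baseline,
# which is the small-sample robustness the abstract is after.
print(update_difficulty(0.6, prior_strength=20, n_wrong=3, n_total=10))
```

As the sample grows, the data term dominates the pseudo-counts and the posterior mean converges to the empirical difficulty, so the baseline matters most exactly when data are scarce.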

3.
Applied Measurement in Education, 2013, 26(3): 199-209
An item is defined as connotatively consistent (CC) or connotatively inconsistent (CI) when its connotation agrees with or contradicts the connotation shared by the majority of items on a test. This definition is more precise than the conventional labels of negative versus positive items. The study examined the equivalence of CC and CI items using convenience samples of college students' responses to the Life Orientation Test (Scheier & Carver, 1985). Confirmatory factor analysis results showed that CC and CI items measured correlated but distinct traits. Practical and theoretical implications of these findings are discussed.

4.
Graphical Item Analysis (GIA) visually displays the relationship between the total score on a test and the response proportions of the correct and false alternatives of a multiple-choice item. The GIA method provides essential and easily interpretable information about item characteristics (difficulty, discrimination, and guessing rate). Low-quality items are easily detected with the GIA method because they show response proportions on the correct alternative that decrease as the total score increases, or response proportions on one or more false alternatives that fail to decrease as the total score increases. The GIA method has two main applications. First, researchers can use it to identify items that need to be excluded from further analysis. Second, test constructors can use it to improve the quality of an item bank. GIA enables a better understanding of test theory and test construction, especially for those without a background in psychometrics. In this sense, the GIA method might help reduce the gap between the world of psychometricians and the practical world of constructors of achievement tests.
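The tabulation behind a GIA plot can be sketched as follows: examinees are grouped into total-score bands, and within each band the proportion choosing each alternative of one item is computed. The banding scheme, function names, and data below are illustrative assumptions; the article itself presents this information graphically:

```python
# Hedged sketch of the tabulation behind a Graphical Item Analysis plot:
# group examinees into total-score bands and compute, per band, the
# proportion choosing each alternative of one multiple-choice item.
from collections import Counter

def gia_table(total_scores, chosen_alternatives, n_bands=3):
    """Return {band_index: {alternative: proportion}} for one item."""
    lo, hi = min(total_scores), max(total_scores)
    width = (hi - lo) / n_bands or 1   # guard against all-equal scores
    bands = {}
    for score, alt in zip(total_scores, chosen_alternatives):
        band = min(int((score - lo) / width), n_bands - 1)
        bands.setdefault(band, Counter())[alt] += 1
    return {b: {a: n / sum(c.values()) for a, n in c.items()}
            for b, c in sorted(bands.items())}

# Invented data: "A" is the key, "B" a distractor.
scores = [10, 12, 15, 18, 20, 22, 25, 28, 30]
alts   = ["B", "B", "A", "A", "B", "A", "A", "A", "A"]
table = gia_table(scores, alts)
```

For a well-functioning item, the proportion on the key rises across bands while each distractor's proportion falls; a distractor whose proportion rises with total score flags a low-quality item, exactly the pattern the abstract says GIA makes easy to spot.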

5.
Applied Measurement in Education, 2013, 26(1): 89-97
Research on the use of multiple-choice tests has presented conflicting evidence about the use of statistical item difficulty as a means of ordering items. An alternate method advocated by many texts is the use of cognitive difficulty. This study examined the effect of using both statistical and cognitive item difficulty in determining item order. Results indicated that those students who received items in an increasing cognitive order, no matter what the order of statistical difficulty, scored higher on hard items. Those students who received the forms with opposing cognitive and statistical difficulty orders scored the highest on medium-level items. The study concludes with a call for more research on the effects of cognitive difficulty and suggests that future studies examine subscores as well as total test results.

6.
This study investigated the perceived difficulty of each component of the Putonghua Proficiency Test. On one hand, a questionnaire survey of 2,084 students at ten normal universities nationwide compared the distribution of perceived difficulty across certification levels, across dialect regions, and between students who had and had not yet taken the test. On the other hand, sample test papers from 640 students were analyzed with percentages and repeated-measures ANOVA, using a score-loss index for each test component as the reference, to examine how strongly perceived difficulty correlated with actual test performance. The survey showed that normal-university students generally regarded either "prompted speaking" or "reading monosyllabic words" as the most difficult component and uniformly chose "reading polysyllabic words" as the easiest. The test results, however, showed that students of all levels and dialect backgrounds lost the most points on "prompted speaking," followed by "reading monosyllabic words" and then "reading polysyllabic words," with "reading a short passage aloud" showing the least score loss.

7.
This article considers potential problems that can arise in estimating a unidimensional item response theory (IRT) model when some test items are multidimensional (i.e., show a complex factorial structure). More specifically, this study examines (1) the consequences of model misfit on IRT item parameter estimates due to unintended minor item‐level multidimensionality, and (2) whether a Projection IRT model can provide a useful remedy. A real‐data example is used to illustrate the problem and also is used as a base model for a simulation study. The results suggest that ignoring item‐level multidimensionality might lead to inflated item discrimination parameter estimates when the proportion of multidimensional test items to unidimensional test items is as low as 1:5. The Projection IRT model appears to be a useful tool for updating unidimensional item parameter estimates of multidimensional test items for a purified unidimensional interpretation.

8.
This paper discusses the difficulty of test papers and test items from the perspective of educational measurement. It focuses on several common misconceptions about choosing test difficulty and about the difficulty-control requirements imposed by tests of different natures and by admissions decisions, providing a measurement-theoretic reference for test evaluation and school instruction.

9.
This study assessed the contributions of various test features (passage variables, question types, and format variables) to reading comprehension performance for successful and unsuccessful readers. Items from a typical standardized reading comprehension test were analyzed according to 20 predictor test features. A three-stage conditional regression approach assessed the predictability of these features on item-difficulty scores for the two reader groups. Two features, location of response information and stem length, accounted for a significant amount of explained variance for both groups. Possible explanatory hypotheses are considered and implications are drawn for improved test design as well as for further research concerning interactions between assessment task features and reader performance.
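The regression step the abstract describes, predicting item-difficulty scores from coded test features, can be sketched with a single illustrative predictor: stem length, one of the two features the study found significant. The data values and the use of simple one-predictor least squares are assumptions for illustration; the study itself used a three-stage conditional regression over 20 features:

```python
# Hedged sketch: regress item-difficulty scores on one coded test
# feature (stem length) by ordinary least squares. Data are invented.

def ols(x, y):
    """Slope and intercept of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

stem_length = [5, 8, 12, 15, 20]          # words in the question stem
difficulty  = [0.2, 0.3, 0.45, 0.5, 0.7]  # proportion answering incorrectly
slope, intercept = ols(stem_length, difficulty)
# A positive slope is consistent with longer stems being harder.
```

In the actual study each feature's contribution is assessed conditionally on the features entered before it, which a single-predictor fit cannot show; the sketch only illustrates the feature-to-difficulty mapping itself.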

10.
Multiple-choice cloze is one of the most common item types in English examinations in China, designed to assess students' integrated command of English, yet it has consistently proved difficult and low-scoring. This paper analyzes the sources of that difficulty from the standpoint of the complexity of writing cloze items, and offers related teaching suggestions, in the hope of helping English teachers and learners understand this item type more deeply and thereby improve the quality of English teaching and testing.

11.
Based on statistical principles and machine-learning ideas, this paper proposes a multi-level, fine-grained classification of item-difficulty coefficients for an item bank, with the coefficients adaptively adjusted according to students' correct-answer rates, and presents the corresponding algorithm and program implementation.
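The adaptive adjustment described above can be sketched as follows, assuming a stored difficulty coefficient is nudged toward the empirical difficulty (one minus the correct-answer rate) after each batch of responses. The update rule and the learning rate are illustrative assumptions, not the paper's algorithm:

```python
# Hedged sketch: adaptively adjust an item-bank difficulty coefficient
# toward the observed difficulty (1 - correct rate) after each batch.

def adjust_difficulty(current, n_correct, n_attempts, rate=0.3):
    """Move the stored coefficient part-way toward the observed value."""
    observed = 1 - n_correct / n_attempts
    return current + rate * (observed - current)

d = 0.5                                  # initial assigned coefficient
for n_correct in [8, 9, 7]:              # three batches of 10 attempts
    d = adjust_difficulty(d, n_correct, 10)
# Students mostly answer correctly, so the coefficient drifts downward
# toward the observed difficulties (0.2, 0.1, 0.3).
```

An exponential-moving-average update like this forgets old data geometrically; a Bayesian pseudo-count scheme would weight all history equally. Either reading is compatible with the abstract, which does not specify the rule.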

12.
The Effect of Item Contextualization on Item Difficulty
In recent years, contextualized items have become a clear trend in examinations of all levels and kinds at home and abroad. What, then, is item contextualization, and how does it affect item difficulty? Taking physics items as examples, this paper analyzes these questions in detail.

13.
This paper demonstrates, both theoretically and empirically, using both simulated and real test data, that sets of items can be selected that meet the unidimensionality assumption of most item response theory models even though they require more than one ability for a correct response. Sets of items that measure the same composite of abilities as defined by multidimensional item response theory are shown to meet the unidimensionality assumption. A method for identifying such item sets is also presented.

14.
Problem-solving strategy is frequently cited as mediating the effects of response format (multiple-choice, constructed response) on item difficulty, yet there are few direct investigations of examinee solution procedures. Fifty-five high school students solved parallel constructed response and multiple-choice items that differed only in the presence of response options. Student performance was videotaped to assess solution strategies. Strategies were categorized as "traditional"–those associated with constructed response problem solving (e.g., writing and solving algebraic equations)–or "nontraditional"–those associated with multiple-choice problem solving (e.g., estimating a potential solution). Surprisingly, participants sometimes adopted nontraditional strategies to solve constructed response items. Furthermore, differences in difficulty between response formats did not correspond to differences in strategy choice: some items showed a format effect on strategy but no effect on difficulty; other items showed the reverse. We interpret these results in light of the relative comprehension challenges posed by the two groups of items.

15.
Educational Assessment, 2013, 18(2): 133-147
This article presents a beginning effort to build a taxonomy for constructed-response test items. The taxonomy defines the categories for various item formats in three distinct dimensions: (a) type of reasoning competency employed, (b) nature of cognitive continuum employed, and (c) kind of response yielded. Each dimension is described, and the reasons for incorporating it into the taxonomy are explained. A theoretical rationale for the taxonomy is developed, and advantages and shortcomings of its use are noted.

16.
17.
18.
As a criterion-referenced examination, the high-school information-technology academic proficiency qualification test requires difficulty control in line with the examination's goals and requirements: item difficulty is pre-estimated accurately so that test-paper difficulty can be controlled and results match the test's objectives. The difficulty-control techniques cover two parts, item-difficulty pre-estimation and test-paper difficulty control. The paper explores item-difficulty pre-estimation through three steps: identifying the main objective factors that affect difficulty, designing a simple and practical method for computing item difficulty, and building a reference model for difficulty pre-estimation. It then uses worked examples to further explore techniques for controlling overall test-paper difficulty.

19.
Federal policy on alternate assessment based on modified academic achievement standards (AA-MAS) inspired this research. Specifically, an experimental study was conducted to determine whether tests composed of modified items would have the same level of reliability as tests composed of original items, and whether these modified items helped reduce the performance gap between AA-MAS eligible and ineligible students. Three groups of eighth-grade students (N = 755) defined by eligibility and disability status took original and modified versions of reading and mathematics tests. In a third condition, the students were provided limited reading support along with the modified items. Changes in reliability across groups and conditions for both the reading and mathematics tests were determined to be minimal. Mean item difficulties within the Rasch model were shown to decrease more for students who would be eligible for the AA-MAS than for non-eligible groups, revealing evidence of differential boost. Exploratory analyses indicated that shortening the question stem may be a highly effective modification, and that adding graphics to reading items may be a poor modification.
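The Rasch-model comparison in the abstract rests on item difficulty expressed in logits. A minimal sketch, assuming a crude single-item estimate against a reference group of average ability (theta = 0); the response counts below are invented and only illustrate how "differential boost" would appear as a larger drop in difficulty for the eligible group:

```python
# Hedged sketch: under the Rasch model, P(correct) = 1/(1 + exp(b - theta)).
# For a reference group at theta = 0, the difficulty b that reproduces a
# proportion-correct p is the log-odds of an incorrect response.
import math

def rasch_difficulty(n_correct, n_total):
    p = n_correct / n_total
    return math.log((1 - p) / p)  # b with P(correct | theta=0) = p

# Invented counts: modification lowers difficulty more for the eligible
# group (40 -> 60 correct per 100) than for others (50 -> 55 per 100).
drop_eligible = rasch_difficulty(40, 100) - rasch_difficulty(60, 100)
drop_other    = rasch_difficulty(50, 100) - rasch_difficulty(55, 100)
```

In the study, difficulties are estimated jointly across items and persons rather than item by item; the closed-form log-odds here is only the theta = 0 special case that makes the logit scale concrete.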

20.
Feng Yuan, 《考试研究》 (Examinations Research), 2013, (6): 9-16
"Text-and-graphic composite material" items in the Chinese-language papers of the senior high school entrance and college entrance examinations are a common form of non-continuous text reading item. The item-writing approach and technique for such items are relatively mature, but clear shortcomings remain. Taking non-continuous text reading items from the PISA reading assessment and from entrance-examination papers as examples, this paper analyzes their similarities and differences and discusses what PISA items suggest for the design of non-continuous text reading items in these examinations.

