Found 20 similar articles (search time: 0 ms)
Equivalent forms of a ten-item completion test were constructed. The same test items then were rewritten in matching format and in multiple-choice format, resulting in two forms (A and B) of each of three types of test. All tests were administered to 73 examinees, and parallel-forms reliability coefficients (correlation between scores on A and B) were calculated. These empirically obtained values were compared to the values of the reliability coefficient predicted from theoretically derived equations which indicate the influence of chance success due to guessing on test reliability. In accordance with theory it was found that the completion test was more reliable than the matching test and that the matching test was more reliable than the multiple-choice test. The empirically obtained reliability coefficients were very close to those predicted from the mathematically derived formulas.

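Using a combination of quantitative and qualitative methods, this paper compares banked cloze tests with multiple-choice cloze tests. The quantitative analysis shows a significant difference between test takers' scores on the two tests. The qualitative think-aloud results, however, show that in both tests the test takers relied mostly on within-sentence information to answer the items. Clause-level information was used most often, cross-sentence information at the discourse level was used less often, and information beyond the text was used least. These findings raise new considerations for item writing and teaching.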

We consider the relationship between the multiple-choice and free-response sections on the Computer Science and Chemistry tests of the College Board's Advanced Placement program. Restricted factor analysis shows that the free-response sections measure the same underlying proficiency as the multiple-choice sections for the most part. However, there is also a significant, if relatively small, amount of local dependence among the free-response items that produces a small degree of multidimensionality for each test.

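Cloze exercises are an important part of English teaching. If teachers design these exercises carefully, consolidate comprehension, emphasize the gradual development of skills, give differentiated guidance, and reinforce application, they can effectively improve students' ability to use English in an integrated way.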

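Cloze testing is grounded in Gestalt psychology. Since its introduction it has attracted considerable attention in the language testing community and has been widely used in large-scale tests. Later research, however, showed that cloze tests also have some puzzling properties. This paper uses Bachman's model to demystify cloze testing. The analysis shows that cloze test results are not necessarily a true reflection of the test taker's underlying competence; they are also affected by method factors. The analysis suggests that item writers who use the cloze format should not only define the measurement target more clearly but also consider what potential difficulties the test contains and minimize their effect on the results.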


Scoring multiple-choice questions according to the simple scoring system $S_1 = R$, where $R$ is the number of correct answers, produces an upward bias in the scores of poorer students as a result of guessing. The scoring formula conventionally used to adjust for guessing is $S_2 = R - W/(n-1)$, where $W$ is the number of wrong answers and $n$ is the number of choices per question. However, $S_2$ is based on the unrealistic assumption that on each question the student either knows the correct answer or guesses randomly. On the basis of a more realistic assumption an alternative scoring formula is derived, $S_4 = [nR + (n-1)Q - Q^2/R]/[2(n-1)]$, where $Q$ is the number of questions. Compared to $S_4$, the conventional formula ($S_2$) has a downward bias for $Q/n < R < Q$ and the simple formula ($S_1$) has a downward bias for $Q/(n-2) < R < Q$ in addition to its upward bias for $R < Q/(n-2)$.
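The three scoring rules in the abstract above can be compared numerically. The following is a minimal sketch, not the author's implementation: the function names are hypothetical, it assumes no omitted items (so the number of wrong answers is Q minus R), and it takes the squared term in the alternative formula as Q²/R, which makes the score equal Q at R = Q and 0 at R = Q/n.

```python
def simple_score(r: int) -> float:
    """S1: number of correct answers, with no adjustment for guessing."""
    return float(r)


def corrected_score(r: int, w: int, n: int) -> float:
    """S2: conventional correction for guessing, R - W/(n-1)."""
    return r - w / (n - 1)


def alternative_score(r: int, q: int, n: int) -> float:
    """S4: alternative formula, [nR + (n-1)Q - Q^2/R] / [2(n-1)].

    Undefined for R = 0 because of the Q^2/R term.
    """
    if r <= 0:
        raise ValueError("alternative_score requires at least one correct answer")
    return (n * r + (n - 1) * q - q ** 2 / r) / (2 * (n - 1))


if __name__ == "__main__":
    q, n = 40, 4  # hypothetical test: 40 questions, 4 choices each
    for r in (10, 15, 25, 40):
        w = q - r  # assumption: no omitted items
        print(r, simple_score(r), corrected_score(r, w, n), alternative_score(r, q, n))
```

Under these assumptions all three rules agree at a perfect score, the two corrected rules agree at the chance level R = Q/n, and S4 lies above S2 in between, which matches the direction of bias the abstract describes.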

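In Comprehensive English, a foundation course for English majors, cloze exercises that are carefully designed, consolidate comprehension, emphasize the gradual development of skills, and reinforce application can effectively improve students' ability to use English in an integrated way.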

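Drawing on language testing theory, this paper reports a trial of selected multiple-choice items from 《同步辅导/同步训练》 (Synchronized Tutoring/Synchronized Practice), the companion to 《大学英语自学教程》 (College English Self-Study Course), Volume 1, and examines how distractor design affects the validity of grammar and vocabulary multiple-choice items.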

This article discusses and demonstrates combining scores from multiple-choice (MC) and constructed-response (CR) items to create a common scale using item response theory methodology. Two specific issues addressed are (a) whether MC and CR items can be calibrated together and (b) whether simultaneous calibration of the two item types leads to loss of information. Procedures are discussed and empirical results are provided using a set of tests in the areas of reading, language, mathematics, and science in three grades.

Formula scoring is a procedure designed to reduce multiple-choice test score irregularities due to guessing. Typically, a formula score is obtained by subtracting a proportion of the number of wrong responses from the number correct. Examinees are instructed to omit items when their answers would be sheer guesses among all choices but otherwise to guess when unsure of an answer. Thus, formula scoring is not intended to discourage guessing when an examinee can rule out one or more of the options within a multiple-choice item. Examinees who, contrary to the instructions, do guess blindly among all choices are not penalized by formula scoring on the average; depending on luck, they may obtain better or worse scores than if they had refrained from this guessing. In contrast, examinees with partial information who refrain from answering tend to obtain lower formula scores than if they had guessed among the remaining choices. (Examinees with misinformation may be exceptions.) Formula scoring is viewed as inappropriate for most classroom testing but may be desirable for speeded tests and for difficult tests with low passing scores. Formula scores do not approximate scores from comparable fill-in-the-blank tests, nor can formula scoring preclude unrealistically high scores for examinees who are very lucky.
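A short worked expectation may make the claims about blind guessing and partial information concrete. It is a sketch under stated assumptions: the penalty per wrong answer is taken to be the conventional 1/(n-1) on an item with n options (the abstract says only "a proportion"), the symbol k for the number of options an examinee can correctly rule out is introduced here for illustration, and the keyed answer is assumed to be among the remaining options (no misinformation).

\[
E[\text{blind guess}] = \frac{1}{n}\cdot 1 \;-\; \frac{n-1}{n}\cdot\frac{1}{n-1} = 0,
\]
\[
E[\text{guess after ruling out } k \text{ options}] = \frac{1}{n-k} \;-\; \frac{n-k-1}{n-k}\cdot\frac{1}{n-1} = \frac{k}{(n-k)(n-1)} > 0, \qquad 1 \le k \le n-1.
\]

Under these assumptions a blind guess neither gains nor loses on average, while an examinee who can eliminate even one option has a positive expected gain from answering, so omitting such an item lowers the expected formula score.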

The matching cloze procedure, which does not require language production skills and which is simple enough for the classroom teacher to construct unaided, was originally developed to measure reading skills of elementary English‐second‐language pupils. The results of this pilot study with opportunity school children indicate the validity of the procedure as an evaluation technique for slow learning children.

This paper describes briefly a methodology for developing multiple-choice critical thinking tests which attempts to overcome certain problems of validity and fairness facing such tests. The paper proposes that direct evidence on test validity be gathered using verbal reports of students' thinking on trial items.

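This study examined how test anxiety affects examinees' eye-movement patterns during a test. Using the Test Anxiety Scale (TAS), 29 undergraduates from Beijing Normal University and Beijing University of Posts and Telecommunications were selected and divided into high and low test-anxiety groups according to their scores. The Beck Depression Inventory (BDI) and Raven's Standard Progressive Matrices were used to measure their depression levels and IQ, respectively. An eye tracker recorded eye-movement measures such as fixation duration and number of fixations while participants answered items from Raven's Advanced Progressive Matrices presented as multiple-choice questions. The results showed that: (1) the high test-anxiety group had significantly longer mean and total fixation durations on the item stem and the options than the low test-anxiety group, with no significant difference for the blank area; (2) the high test-anxiety group's total fixation duration on the chosen option, and its share of the total fixation duration on all options, were significantly higher than those of the low test-anxiety group, whereas the ratio of total fixation duration on the blank area to that on the other two areas was lower in the high test-anxiety group; (3) the two groups differed significantly in the number of fixations on the stem area but not in the number of fixations on the blank and option areas.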

A statistical test for the detection of answer copying on multiple-choice tests is presented. The test is based on the idea that the answers of examinees to test items may be the result of three possible processes: (1) knowing, (2) guessing, and (3) copying, but that examinees who do not have access to the answers of other examinees can arrive at their answers only through the first two processes. This assumption leads to a distribution for the number of matched incorrect alternatives between the examinee suspected of copying and the examinee believed to be the source that belongs to a family of "shifted binomials." Power functions for the tests for several sets of parameter values are analyzed. An extension of the test to include matched numbers of correct alternatives would lead to improper statistical hypotheses.

Item-response changing as a function of test anxiety was investigated. Seventy graduate students completed the Test Anxiety Scale and 73 multiple-choice items during the quarter. The data supported the hypothesis that high test-anxious students make more item-response changes than low test-anxious students. Results also suggested that both high- and low-anxious students profit to a similar extent proportionally from answer changing. It was further found that more responses were changed on difficult than on easy items for both high- and low-anxious students. Test anxiety is suggested as a factor forming test-taking style.

The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely limited. This study presents scoring properties and item analysis for 26 AUC university course examinations, administered using a commercial scratch-card response system. Here, we show that beyond the traditional pedagogical advantages of AUC, the availability of partial credit adds psychometric advantages by boosting both the mean item discrimination and overall test-score reliability, when compared to tests scored dichotomously upon initial response. Furthermore, we find a strong correlation between students’ initial-response successes and the likelihood that they would obtain partial credit when they make incorrect initial responses. Thus, partial credit is being granted based on partial knowledge that remains latent in traditional MC tests. The fact that these advantages are realized in real-life classroom tests may motivate further expansion of the use of AUC MC tests in higher education.

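After reviewing the evolution and development of the cloze item format in testing, this paper uses qualitative analysis to examine the content validity of the cloze sections of the 2007-2009 College English Test Band 4 (CET-4) in terms of word count, topic, genre, content words and function words, and integrated language ability at the sentence and discourse levels. The analysis shows that the cloze items have high content validity and meet the purpose of the test.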

The present study focused on gender differences in the tendency to omit items and to guess in multiple-choice tests. It was hypothesized that males would show greater guessing tendencies than females and that the use of formula scoring rather than the use of number of correct answers would result in a relative advantage for females. Two samples were examined: ninth graders and applicants to Israeli universities. The teenagers took a battery of five or six aptitude tests used to place them in various high schools, and the adults took a battery of five tests designed to select candidates to the various faculties of the Israeli universities. The results revealed a clear male advantage in most subtests of both batteries. Four measures of item-omission tendencies were computed for each subtest, and a consistent pattern of greater omission rates among females was revealed by all measures in most subtests of the two batteries. This pattern was observed even in the few subtests that did not show male superiority and even when permissive instructions were used. Correcting the raw scores for guessing reduced the male advantage in all cases (and in the few subtests that showed female advantage the difference increased as a result of this correction), but this effect was small. It was concluded that although gender differences in guessing tendencies are robust they account for only a small fraction of the observed gender differences in multiple-choice tests. The results were discussed, focusing on practical implications.
