20 similar documents found.
Wilcox (16) proposed a latent structure model for answer-until-correct tests that can solve various measurement problems including correcting for guessing without assuming guessing is at random. This paper proposes a closed sequential procedure for estimating true score that can be used in conjunction with an answer-until-correct test. For criterion-referenced tests where the goal is to determine whether an examinee’s true score is above or below a known constant, the accuracy of the new procedure is exactly the same as a more conventional sequential solution. The advantage of the new procedure is that it eliminates the possibility of using an inordinately large number of items when in fact a large number of items is not needed; typical sequential procedures always allow this possibility. In addition, the new procedure appears to compare favorably to traditional tests where the number of items to be administered is fixed in advance.

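Because various acts of plagiarism, and especially the squid-like tactic of clouding the waters once the offense is exposed, have severely polluted the basic ecology of academia, the issue should be clarified by examining how its historical context has shifted. Ancient society did not indiscriminately tolerate textual appropriation; on the contrary, it may have been even harsher than the modern era, the key question being whether the appropriation earned the particular individual involved fame or benefit that was both anticipated and beyond what was reasonable. In modern society, the individual patent system that encourages knowledge innovation, further mindful of the many harms caused by academic plagiarism, treats it as a distinct and undeniable offense. Therefore, although plagiarism remains a moral offense, academia must, on the principle of self-policing, ensure that plagiarists lose more than they gain and that would-be plagiarists are deterred. Otherwise, the basic rules of the entire academic community will fall into disorder, and orderly knowledge innovation cannot be expected.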

This cross-cultural study of cloze procedure and comprehension involved samples of 10- to 11-year-old schoolchildren in Canada, Japan, Sweden and the United States. The aim of the study was to explore the nature of what might be called ‘cloze comprehension’ in relation to overall or ‘global comprehension’ of a passage; in particular to establish (a) whether cloze procedure measures the same facets of comprehension regardless of what language is being read; and (b) to what extent cloze procedure, in different linguistic areas, measures ‘global comprehension’, or comprehension of the general ideas contained in a passage, as distinct from literal comprehension. The results of the study indicate that cloze procedure is a valid and reliable measure of certain aspects of reading comprehension in all the linguistic and cultural areas sampled. Furthermore, comprehension as measured by cloze procedure seems to be a necessary, albeit not sufficient, condition for overall or global understanding of the meaning of a passage. The study also shows that the ability measured by cloze procedure is more generalized (i.e., less text specific) than the ability measured by our global comprehension task. One implication of this seems to be that the higher-order skills necessary for global understanding do not always develop automatically once children have mastered the skills necessary for literal comprehension of simple texts. On the contrary, the higher-order skills may have to be taught systematically at an appropriate stage in the children's reading development.


Few reliable and valid measures of reading achievement are available to evaluate programs for elementary English-as-a-second-language (ESL) pupils. Four variations on the cloze procedure, which has been previously used with disadvantaged and ESL elementary pupils, were evaluated using randomly assigned groups of fourth and fifth grade students. Matching and multiple-choice variations were selected for comparison because they are in greater consonance with current psycholinguistic theories of the reading process than are other types of reading comprehension measures. Although the overall results were quite similar for the four cloze variations examined, the matching cloze procedure seems to be preferable for elementary ESL students since these tests produced better item characteristics and were more easily constructed.

The purpose of this study was to determine if a linear procedure, typically applied to an entire examination when equating scores and rescaling judges' standards, could be used with individual item data gathered through Angoff's (1971) standard-setting method. Specifically, experts' estimates of borderline group performance on one form of a test were transformed to be on the same scale as experts' estimates of borderline group performance on another form of the test. The transformations were based on examinees' responses to the items and on judges' estimates of borderline group performance. The transformed values were compared to the actual estimates provided by a group of judges. The equated and rescaled values were reasonably close to those actually assigned by the experts. Bias in the estimates was also relatively small. In general, the rescaling procedure was more accurate than the equating procedure, especially when the examinee sample size for equating was small.

Measurement specialists routinely assume examinee responses to test items are independent of one another. However, previous research has shown that many contemporary tests contain item dependencies and not accounting for these dependencies leads to misleading estimates of item, test, and ability parameters. The goals of the study were (a) to review methods for detecting local item dependence (LID), (b) to discuss the use of testlets to account for LID in context-dependent item sets, (c) to apply LID detection methods and testlet-based item calibrations to data from a large-scale, high-stakes admissions test, and (d) to evaluate the results with respect to test score reliability and examinee proficiency estimation. Item dependencies were found in the test and these were due to test speededness or context dependence (related to passage structure). Also, the results highlight that steps taken to correct for the presence of LID and obtain less biased reliability estimates may impact on the estimation of examinee proficiency. The practical effects of the presence of LID on passage-based tests are discussed, as are issues regarding how to calibrate context-dependent item sets using item response theory.


Textual plagiarism is a serious violation of established academic protocols, but avoiding it also requires considerable writing experience and care. Although students’ understanding of textual plagiarism and their plagiaristic behaviour in English as a Second Language (ESL) and English as a Foreign Language (EFL) contexts have been quite well researched, few studies have sought to evaluate instructional interventions specifically designed to reduce plagiarism by empowering student writers with a better knowledge of plagiarism and skills at source referencing. To the best of our knowledge, no study to date has systematically assessed students’ understanding of plagiarism and their source referencing performance in response to intervention. This classroom-based research, at a university in Beijing, aimed to discover whether a 6-hour block of instruction could facilitate better understanding of plagiarism and appropriate source referencing skills. The results showed that the intervention did generally give students a better appreciation of how textual plagiarism looks and significantly reduced blatant and subtle plagiarism in their writing. However, heavy reliance on original source language did frequently reoccur in student writing, albeit in less clear-cut ways. Some helpful lessons were drawn from this study.


This study investigated the relationship that exists between syntax and reading comprehension. To measure this relationship, data were collected from ninth grade students by administering three tests to them: a cloze test, a chunk test, and a standardized reading test. Analysis of the results indicated that adverbial clause position does not appear to affect the reading comprehension of ninth grade students.


Differences in fifth graders’ reading comprehension scores were obtained using four different tasks typically employed to measure comprehension (multiple-choice, recall, cloze, and maze) and four different reading passages that were equated according to readability formulas. Data analyses revealed significant effects for passage, task, and an interaction between task and passage. It was concluded that the choice of a particular comprehension passage and testing procedure, whether in research or practice, does not allow generalization to other operational definitions of reading comprehension. These results suggest serious limitations of most contemporary reading comprehension research and testing.


We examined change in test-taking effort over the course of a three-hour, five-test, low-stakes testing session. Latent growth modeling results indicated that change in test-taking effort was well-represented by a piecewise growth form, wherein effort increased from test 1 to test 4 and then decreased from test 4 to test 5. There was significant variability in effort for each of the five tests, which could be predicted from examinees’ conscientiousness, agreeableness, mastery approach goal orientation, and whether the examinee “skipped” or attended the initial testing session. The degree to which examinees perceived a particular test as important was related to effort for the difficult, cognitive test but not for less difficult, noncognitive tests. There was significant variability in the rates of change in effort, which could be predicted from examinees’ agreeableness. Interestingly, change in test-taking effort was not related to change in perceived test importance. Implications of these results for assessment practice and directions for future research are discussed.

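The cloze test is regarded as a quick and economical way to assess overall language proficiency and reading comprehension. This study empirically examines several variables that may affect cloze difficulty, including text type, deletion type, and response format. Ninety-eight third-year senior high school students served as participants and completed three fill-in cloze tests and three multiple-choice cloze tests. After testing, the author collected and analyzed the data to explore how these variables affect cloze test difficulty, and sought a more reasonable and scientific testing method for controlling item difficulty.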

Recently, the usage of plagiarism detection software such as Turnitin® has increased dramatically among university instructors. At the same time, academic criticism of this software’s employment has also increased. We interviewed 23 faculty members from various departments at a medium-sized, public university in the southeastern US to determine their perspectives on Turnitin® and student plagiarism. We wanted to discern if there are important disciplinary differences in how instructors define and handle plagiarism; how instructors use Turnitin®; and if instructors’ thinking aligns with ethical and political concerns commonly expressed in the academic literature. Despite varying attitudes towards Turnitin®, those interviewed did not differ significantly in their views as to what student plagiarism is or its seriousness, and typical objections to ‘policing’ plagiarism and Turnitin® had little resonance with interviewees. The majority viewed a substantial amount of plagiarism they encountered as unintentional and penalised only what they considered to be extreme versions of intentional plagiarism. However, often this contradicted the way they presented the concept of plagiarism in their syllabi and their classrooms. Surprisingly, these patterns were consistent among those who employed the software frequently and those who did not.

A simulation study was performed to determine whether a group's average percent correct in a content domain could be accurately estimated for groups taking a single test form and not the entire domain of items. Six Item Response Theory based domain score estimation methods were evaluated, under conditions of few items per content area per form taken, small domains, and small group sizes. The methods used item responses to a single form taken to estimate examinee or group ability; domain scores were then computed using the ability estimates and domain item characteristics. The IRT-based domain score estimates typically showed greater accuracy and greater consistency across forms taken than observed performance on the form taken. For the smallest group size and fewest items taken, the accuracy of most IRT-based estimates was questionable; however, a procedure that operates on an estimated distribution of group ability showed promise under most conditions.
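As a rough illustration of the last step described above (computing a domain score from an ability estimate and domain item characteristics), the following Python sketch averages model-implied probabilities of a correct response over the items in a domain. The abstract does not say which IRT model or which of the six methods was used; the three-parameter logistic (3PL) form, the function names, and the item parameter values here are all assumptions made for illustration.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under an assumed 3PL model.

    theta: ability estimate; a: discrimination; b: difficulty; c: lower asymptote.
    No scaling constant (such as 1.7) is applied; that is also an assumption.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def domain_score(theta, domain_items):
    """Expected proportion correct over every item in the domain.

    domain_items: list of (a, b, c) tuples, hypothetical item parameters
    for all items in the domain, including items the examinee never saw.
    """
    probs = [p_correct_3pl(theta, a, b, c) for a, b, c in domain_items]
    return sum(probs) / len(probs)

def group_domain_score(thetas, domain_items):
    """Group average of the examinee-level expected domain scores."""
    return sum(domain_score(t, domain_items) for t in thetas) / len(thetas)

if __name__ == "__main__":
    # Hypothetical three-item domain and three ability estimates.
    items = [(1.0, 0.0, 0.0), (1.2, 0.5, 0.2), (0.8, -0.5, 0.25)]
    print(group_domain_score([-1.0, 0.0, 1.0], items))
```

This sketch averages examinee-level estimates; a method that works from an estimated distribution of group ability, as mentioned at the end of the abstract, would instead integrate the domain score over that distribution.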

There are several types of cloze. The MC cloze is widely used in national examinations. MC cloze is similar to multiple choice, but not exactly the same. To develop an MC cloze, a suitable passage should be chosen first, then some of the words should be deleted, and finally the distractors for each item are set. To test whether the cloze is valid and reliable, the students are asked to take a pretest. The results are analyzed by GITEST. The data demonstrate that the difficulty level and the discrimination are not good enough. Some of the distractors are too tricky while others are too weak.


Understanding adult reading behavior would contribute much to the development of materials and instructional methods. This research used the cloze procedure to compare the types of errors made by skilled and unskilled adult readers with materials at varying levels of difficulty. Both groups of readers made more grammatically unacceptable responses as materials became more difficult, and the effect was stronger for unskilled readers. These findings suggest that the structure of language may have an impact on reading difficulty, and it is suggested that beginning reading materials be drawn from the learner's own language in order to simplify the process of learning to read.

Although the pseudo-random cloze procedure has been in use for some twenty-five years as a measure of readability and reading comprehension, little research has been carried out into the effect of deleting words from text more or less frequently. This paper reports on an experiment in which the deletion frequency variable was systematically studied. Every 6th, 8th, 10th and 12th word was removed from three texts of differing difficulty, and the effect studied. Significant differences among cloze tests resulted, but the differences were unpredictable. Deleting every 12th word did not necessarily result in an easier test than deleting every 6th, 8th, or 10th word. However, when only items identical to both cloze tests under consideration were compared, no significant differences were found. It appears that cloze items are, on the whole, unaffected by context greater than five words. Testers are warned that changing deletion frequency may result in a different measure of readability or comprehension.
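The every-nth-word deletion described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: words are split on whitespace, counting starts at the first word of the passage, and the function names and blank marker are hypothetical. The abstract does not specify where deletion began in the study or how punctuation was handled.

```python
def make_cloze(text, n, blank="____"):
    """Replace every n-th word (1-based count) with a blank.

    Returns the gapped passage and the list of deleted words (the answer key).
    """
    gapped, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            answers.append(word)
            gapped.append(blank)
        else:
            gapped.append(word)
    return " ".join(gapped), answers

def shared_positions(num_words, n1, n2):
    """Word positions deleted under both deletion rates.

    These are the items that can be compared directly between two cloze
    versions of the same passage.
    """
    deleted_1 = {p for p in range(1, num_words + 1) if p % n1 == 0}
    deleted_2 = {p for p in range(1, num_words + 1) if p % n2 == 0}
    return sorted(deleted_1 & deleted_2)

if __name__ == "__main__":
    passage = "one two three four five six seven eight nine ten eleven twelve"
    print(make_cloze(passage, 6))
    print(shared_positions(24, 6, 8))
```

With this counting rule, two versions of a passage share a deleted word only at positions that are multiples of both deletion rates, so comparisons restricted to shared items rest on relatively few words.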

The accuracy of CAT scores can be negatively affected by local dependence if the CAT utilizes parameters that are misspecified due to the presence of local dependence and/or fails to control for local dependence in responses during the administration stage. This article evaluates the existence and effect of local dependence in a test of Mathematics Knowledge. Diagnostic tools were first used to evaluate the existence of local dependence in items that were calibrated under a 3PL model. A simulation study was then used to evaluate the effect of local dependence on the precision of examinee CAT scores when the 3PL model was used for selection and scoring. The diagnostic evaluation showed strong evidence for local dependence. The simulation suggested that local dependence in parameters had a minimal effect on CAT score precision, while local dependence in responses had a substantial effect on score precision, depending on the degree of local dependence present.

An attempt is made to reconcile two historically important tools for the assessment of intelligence and the prediction of academic achievement with extant theories of verbal–crystallized–knowledge aspects of adult abilities. A study of 167 adults ranging in age from 18 to 69 reasserts the importance of individual differences in completion test and cloze test performance in accounting for both measures of crystallized intelligence (Gc) and four scales of knowledge (biology, U.S. history, U.S. literature, and technology). The completion tests were found to account for all of the variance in Gc and knowledge that the cloze tests accounted for, and resulted in incremental predictive validity for both domains. In addition, completion and cloze tests were found to have a suppressor effect on the relationship between Gc and Age. We note C. Spearman's [The nature of “Intelligence” and the principles of cognition. New York: MacMillan (1927).] assertion that the completion test had higher correlations with intelligence than any other measure. Our results suggest that abstract reasoning may be far less useful in predicting learning and performance than the completion test is.


The perception that Internet plagiarism by university students is on the rise has alarmed college teachers, leading to the adoption of electronic plagiarism checkers, among other responses. Although some recent studies suggest that estimates of online plagiarism may be exaggerated, cause for concern remains. This article reviews quantitative studies of student plagiarism over the past forty years, as well as academe's generally weak response. It also offers strategies for addressing cyber-plagiarism and argues that faculty should act as educators, rather than as detectives.
