1.
The purpose of this study was to investigate the connotation of performance labels used in standard setting. For example, do the performance labels basic, proficient, and advanced hold different connotations than limited knowledge, satisfactory, and distinguished? If these terms hold different connotations, such differences may play a role in the standard-setting process. A nationally representative sample of participants (n = 167) provided connotation ratings to an online instrument containing an experimental manipulation. Results suggested that the selected terms themselves do hold different connotations. After definitions were provided with the terms, the differences in the evaluative nature of the labels were mitigated. However, some differences remained; the term limited knowledge was persistently perceived as less favorable than basic and apprentice, and satisfactory was persistently perceived as less favorable than proficient.
2.
3.
《教育实用测度》2013,26(1):107-120
Environmental regulation, like educational policy, crucially depends on the establishment and enforcement of standards. One can observe numerous similarities in the interplay of social, political, and technical issues in educational and environmental standard setting. In this article, I review several major types of environmental standards (design, performance, exposure, safety, and behavioral) and discuss their points of contact with educational standards. I also highlight areas of judgment common to both standard-setting processes and describe the principal mechanisms that are used to improve the credibility of environmental standards. In conclusion, I suggest ways in which experiences gained in the environmental arena could usefully be extended to the educational arena.
4.
James S. Terwilliger 《Educational Measurement》1989,8(2):15-19
What is the process by which grades should be assigned to students? Which learning objectives should be the basis for pass/fail decisions? Which learning objectives should be the basis for assigning better grades? How might this procedure be implemented?
5.
Since 1971 there have been a number of studies in which a cut score has been set using a method proposed by Angoff (1971). In this method, each member of a panel of judges estimates for each test question the proportion correct for a specific target group of examinees. Prior and contemporary research suggests that this is a difficult task for judges. Angoff also proposed that judges simply indicate whether or not an examinee from the target group will be able to answer each question correctly (the yes/no method). We report on the results of two studies that compare a yes/no estimation with a proportion correct estimation. The two studies demonstrate that both methods produce essentially equal cut scores and that judges find the yes/no method more comfortable to use than the estimated proportion correct method.
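The two rating formats described in this abstract can be compared with a small sketch. It assumes the commonly used aggregation rule, which the abstract itself does not spell out: each judge's implied cut score is the sum of that judge's item estimates, and the panel cut score is the mean across judges. The function name `angoff_cut_score` and all ratings below are hypothetical illustrations, not data from the studies.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return a panel cut score on the raw-score scale.

    ratings[j][i] is judge j's estimate for item i: either a proportion
    correct in [0, 1] (proportion method) or 0/1 (yes/no method).
    Assumed aggregation: sum over items per judge, then mean over judges.
    """
    n_items = len(ratings[0])
    if any(len(judge) != n_items for judge in ratings):
        raise ValueError("every judge must rate the same number of items")
    judge_cuts = [sum(judge) for judge in ratings]  # one implied cut score per judge
    return mean(judge_cuts)


# Hypothetical ratings from two judges on a four-item test.
proportion_ratings = [[0.6, 0.8, 0.5, 0.7], [0.5, 0.9, 0.4, 0.6]]
yes_no_ratings = [[1, 1, 0, 1], [1, 1, 0, 0]]

print(angoff_cut_score(proportion_ratings))
print(angoff_cut_score(yes_no_ratings))
```

With these made-up ratings both formats give a cut score of about 2.5 raw-score points, which mirrors the kind of agreement the abstract reports, although real panels need not agree this closely.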
6.
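A Validation Study of Selected HSK Levels
One purpose of the Hanyu Shuiping Kaoshi (HSK, the Chinese Proficiency Test) is to define the Chinese-language ability that international students need when they enter degree programs at Chinese universities. Under the relevant regulations, HSK Level 3 and Level 6 are the minimum requirements for entering programs in science, engineering, and Western medicine and programs in the humanities, history, and traditional Chinese medicine, respectively. This study uses three methods (Angoff, borderline-group, and contrasting-groups) to validate these standards.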
7.
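Metaphor is an important rhetorical figure and also an important way in which word meanings arise. A meaning produced through metaphor is called a figurative meaning; it is the conventionalized form of a word's metaphorical use. Dictionaries inevitably have to deal with figurative meanings when they treat word senses, and how well these are handled bears on the quality of the dictionary. In its treatment of figurative meanings, 《现代汉语规范词典》 follows the principle of objectivity and defines words flexibly according to their meanings, with good results. Some problems remain, but the achievements predominate.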
8.
Michael B. Bunch 《Educational Measurement》2020,39(2):111-112
In this digital ITEMS module, Dr. Michael Bunch provides an in-depth, step-by-step look at how standard setting is done. It does not focus on any specific procedure or methodology (e.g., modified Angoff, bookmark, and body of work) but on the practical tasks that must be completed for any standard setting activity. Dr. Bunch carries the participant through every stage of the standard setting process, from developing a plan, through preparations for standard setting, conducting standard setting, and all the follow-up activities that must occur after standard setting in order to obtain the approval of cut scores and translate those cut scores into score reports. The digital module includes a 120-page manual, various ancillary files (e.g., PowerPoint slides, Excel workbooks, sample documents, and forms), links to datasets from the book Standard Setting (Cizek & Bunch, 2007), links to final reports from four recent large-scale standard setting events, quiz questions with formative feedback, and a glossary.
9.
Chad W. Buckendahl Russell W. Smith James C. Impara Barbara S. Plake 《Journal of Educational Measurement》2002,39(3):253-263
This article presents a comparison of simplified variations on two prevalent methods, Angoff and Bookmark, for setting cut scores on educational assessments. The comparison is presented through an application with a Grade 7 Mathematics Assessment in a midwestern school district. Training and operational methods and procedures for each method are described in detail along with comparative results for the application. An alternative item ordering strategy for the Bookmark method that may increase its usability is also introduced. Although the Angoff method is more widely used, the Bookmark method has some promising features, specifically in educational settings. Teachers are able to focus on the expected performance of the "barely proficient" student without the additional challenge of estimating absolute item difficulty.
10.
The purpose of this study was to determine if a linear procedure, typically applied to an entire examination when equating scores and rescaling judges' standards, could be used with individual item data gathered through Angoff's standard-setting method (1971). Specifically, experts' estimates of borderline group performance on one form of a test were transformed to be on the same scale as experts' estimates of borderline group performance on another form of the test. The transformations were based on examinees' responses to the items and on judges' estimates of borderline group performance. The transformed values were compared to the actual estimates provided by a group of judges. The equated and rescaled values were reasonably close to those actually assigned by the experts. Bias in the estimates was also relatively small. In general, the rescaling procedure was more accurate than the equating procedure, especially when the examinee sample size for equating was small.
11.
Adam E. Wyse 《Educational Measurement》2015,34(2):47-54
This article uses data from a large-scale assessment program to illustrate the potential issue of range restriction with the Bookmark method in the context of trying to set cut scores to closely align with a set of college and career readiness benchmarks. Analyses indicated that range restriction issues existed across different response probability (RP) values and item response theory (IRT) models if one were to apply the Bookmark procedure using intact test forms. Results also suggested that range restriction may still be present if one had access to additional data from an item bank. This demonstration critically highlights challenges that may exist in some practical applications of the Bookmark method due to items not being designed to cover the full range of examinee abilities.
12.
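Standard Setting: Steps, Methods, and Evaluation Criteria
Standard setting is the process of establishing performance standards: determining the cut scores that divide a test score distribution into two or more categories. Through standard setting, examinees can be classified as "pass" or "fail", or into a larger number of ordered performance categories. Standard setting is an important component of criterion-referenced testing, can provide test decision makers with evidence about test validity, and is currently a topic of considerable research interest in the measurement field. This article first reviews the origins and development of standard setting, then describes in detail its basic steps, several major standard-setting methods, and the criteria used to evaluate the standard-setting process, and finally discusses briefly the need to apply standard setting in the various examinations administered in China.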
13.
Essential for the validity of the judgments in a standard-setting study is that they follow the implicit task assumptions. In the Angoff method, judgments are assumed to be inversely related to the difficulty of the items; contrasting-groups judgments are assumed to be positively related to the ability of the students. In the present study, judgments from both procedures were modeled with a random-effects probit regression model. The Angoff judgments showed a weaker link with the position of the items on the latent scale than the contrasting-groups judgments with the position of the students. Hence, in the specific context of the study, the contrasting-groups judgments were more aligned with the underlying assumptions of the method than the Angoff judgments.
14.
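The Tianjin Elementary Information Technology Examination is an assessment system, open to the public, that tests candidates' computer application skills. As a criterion-referenced examination, it has used a score of 60 as the passing standard since it was introduced in 2004, but practice has shown that 60 cannot serve as a permanent standard for deciding whether a candidate passes. The examination is computer-based and candidates register voluntarily. Candidates vary widely in age, spanning primary school Grades 2 to 6, and students of different ages sit each level. A fixed cut score of 60 ignores the fact that the average ability of the candidates differs from one administration to the next, and also the fact that candidates in the same administration do not draw exactly the same items. As a result, only the candidates' relative ability and relative standing may be known. If candidates cannot be correctly placed in the appropriate level category, the value of a graded examination of this kind is greatly diminished. This study therefore examines how the "pass" standard for the examination should be set, using the Angoff method to set the cut score and applying it objectively to the candidate population, as a useful exploration in research on and application of improving the examination's reliability and validity.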
15.
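As the new round of curriculum reform continues to advance, item writing for the mathematics section of the gaokao (the national college entrance examination) has changed profoundly in both theory and practice. Across the gaokao papers and regional mock examinations of recent years, solid geometry has clearly provided fertile "soil" and a "testing ground" for reform and innovation in the mathematics examination, and at times has even served as a "bellwether" for gaokao reform. Using recent gaokao and regional mock examination items as examples, and drawing on the Examination Syllabus (《考试大纲》) and the teaching philosophy of the new curriculum, the author analyzes how gaokao item writing has changed, for readers' reference.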
16.
张卫红 《山东商业职业技术学院学报》2007,7(2):21-22,33
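This article introduces the models for who sets accounting standards and evaluates the government-led and private-sector-led models. It then analyzes these standard-setter models from an economic perspective. While acknowledging that China's current government-led model is reasonable, it argues that the trend should be toward greater participation by private professional bodies.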
17.
This paper reports two studies of standard setting using Angoff's method. Results of the first study suggest that specialization within broad content areas does not affect an expert's estimates of the performance of the borderline group. This is reassuring because the knowledge base of many professions is so large that no individual can be considered an expert in all aspects of it. Results of the second study support the recommendation that performance data be provided during the standard-setting process. Experts frequently use these data, but the data will not affect the standard unless the distribution of item difficulties is markedly skewed. Providing the data also increases the correspondence between p-values and estimates of borderline group performance, thereby reducing errors in pass/fail decisions. Overall, the results support recommendations often made in the standard-setting literature, but they need to be replicated with other groups of experts.
18.
柯森 《华南师范大学学报(社会科学版)》2007,(5):110-118
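Curriculum standards may appear and function either as a "family" or as a "system". A "system" is the more complete and powerful form of organization, resting on the wholeness and interrelatedness of its internal structure. Type structure and level structure are the two basic structures of a curriculum standards system, and they combine, through particular relationships or arrangements, into an overall structure. For that overall structure, including standards or standard components of each specific type and level is basic and important, but the relationships among them matter more: whether the necessary "connections" have been established, and how far those connections extend. All of this offers guidance, from a new vantage point or dimension, for understanding, evaluating, and constructing curriculum standards systems.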
19.
In test-centered standard-setting methods, borderline performance can be represented by many different profiles of strengths and weaknesses. As a result, asking panelists to estimate item or test performance for a hypothetical group study of borderline examinees, or a typical borderline examinee, may be an extremely difficult task and one that can lead to questionable results in setting cut scores. In this study, data collected from a previous standard-setting study are used to deduce panelists’ conceptions of profiles of borderline performance. These profiles are then used to predict cut scores on a test of algebra readiness. The results indicate that these profiles can predict a very wide range of cut scores both within and between panelists. Modifications are proposed to existing training procedures for test-centered methods that can account for the variation in borderline profiles.
20.
In this article we address the issue of consistency in standard setting in the context of an augmented state testing program. Information gained from the external NRT scores is used to help make an informed decision on the determination of cut scores on the state test. The consistency of cut scores on the CRT across grades is maintained by forcing a consistency model based on the NRT scores and translating that information back to the CRT scores. The inconsistency of standards and the application of this model are illustrated using data from the Maryland MSA large state testing program involving cut points for basic, proficient and advanced in mathematics and reading across years and across grades. The model is discussed in some detail and shown to be a promising approach, although not without assumptions that must be made and issues that might be raised.