Found 20 similar documents; search took 15 ms.
An approach to essay grading based on signal detection theory (SDT) is presented. SDT offers a basis for understanding rater behavior with respect to the scoring of constructed responses, in that it provides a theory of psychological processes underlying the raters' behavior. The approach also provides measures of the precision of the raters and the accuracy of classifications. An application of latent class SDT to essay grading is detailed, and similarities to and differences from item response theory (IRT) are noted. The validity and utility of classifications obtained from the SDT model and scores obtained from IRT models are compared. Validity coefficients were found to be about equal in magnitude across SDT and IRT models. Results from a simulation study of a 5-class SDT model with eight raters are also presented.

Although much attention has been given to rater effects in rater‐mediated assessment contexts, little research has examined the overall stability of leniency and severity effects over time. This study examined longitudinal scoring data collected during three consecutive administrations of a large‐scale, multi‐state summative assessment program. Multilevel models were used to assess the overall extent of rater leniency/severity during scoring and examine the extent to which leniency/severity effects were stable across the three administrations. Model results were then applied to scaled scores to estimate the impact of the stability of leniency/severity effects on students’ scores. Results showed relative scoring stability across administrations in mathematics. In English language arts, short constructed response items showed evidence of slightly increasing severity across administrations, while essays showed mixed results: evidence of both slightly increasing severity and moderately increasing leniency over time, depending on trait. However, when model results were applied to scaled scores, results revealed rater effects had minimal impact on students’ scores.

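Scoring criteria are very important in writing assessment, and the scoring method used affects raters' scoring behavior. The study shows that although the holistic and analytic methods of scoring English writing are both reliable, rater severity and examinees' writing scores differ considerably between the two. Overall, under holistic scoring, rater severity tends to be consistent and close to the ideal value; under analytic scoring, examinees' writing scores are higher, while rater severity also differs significantly across raters. Holistic scoring is therefore preferred in high-stakes examinations that decide examinees' futures.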

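This study experimentally explores the application of artificial intelligence assessment technology to quality monitoring of human online marking, along with other related applications. The experimental data comprise 841,610 scripts in total from the Chinese and English essay tasks of the 2017 Anhui Province college entrance examination (gaokao). Machine scores produced by intelligent marking, the first and second human ratings produced by regular gaokao online marking, and the reported scores were analyzed along several dimensions, including mean, standard deviation, correlation, and score agreement rate. Abnormal-response samples and large-discrepancy samples identified by intelligent marking were fed back to subject expert panels for quality-check scoring. The results show that intelligent marking basically reached a level comparable to that of human markers; it applies a uniform scoring standard throughout, is more objective and fair, and can provide effective quality monitoring for human online marking.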

In most U.S. schools, teachers are evaluated using observation of teaching practice (OTP). This study investigates rater effects on OTP ratings among 421 principals in an authentic teacher evaluation system. Many-facet Rasch analysis (MFR) using a block of shared ratings revealed that principals generally (a) differentiated between more and less effective teachers, (b) rated their teachers with leniency (i.e., overused higher rating categories), and (c) differentiated between teaching practices (e.g., Cognitive Engagement vs. Classroom Management) with minimal halo effect. Individual principals varied significantly in degree of leniency, and approximately 12% of principals exhibited severe rater bias. Implications for use of OTP ratings for evaluating teachers’ effectiveness are discussed. Strengths and limitations of MFR to analyze rater effects in OTP are also discussed.

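Against the backdrop of economic globalization and integration, industrial development is showing new characteristics. Industrial clustering, sustainable development, and the trend toward technology-driven industry form the main themes of industrial development today. Facing the historic opportunity of China's imminent emergence as the world's manufacturing center, Henan Province, located in central China, should follow these trends as it accelerates industrialization and urbanization and advances agricultural modernization: building new-type industrial clusters, advancing the ecological transformation of industry, and accelerating industrial informatization, so as to achieve the rise of the Central Plains.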

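From a measurement perspective, the strong subjectivity of gaokao essay scoring undermines the measurement of examinees' writing ability, and even of their Chinese language ability. How to reform the essay so as to further reduce scoring error and improve test fairness is a concrete task in implementing the current reform of the examination and admission system. Study 1 shows that, compared with the small rating scales used in the West, the large 60-point scale used in China's gaokao exhibits a more serious central tendency effect and more lenient scoring, and raters are less consistent in applying the scoring criteria. It is therefore recommended that the design of the gaokao essay rating scale be reformed, replacing the current large scale with a small one and reporting the essay score separately. Study 2 shows that increasing the number of essay tasks markedly improves scoring reliability; it is therefore recommended that the gaokao essay be changed from one long essay to one long and one short essay.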

With the rise of democratic institutions and the propagation of consumerism in Asia, the teacher-student relationship has undergone fundamental changes. As a response to the demand for public accountability, course evaluation has been recently adopted as routine in universities in Hong Kong, placing Hong Kong in the forefront of this trend in Asia. Few studies in the educational literature examine whether such a practice applies to cultural settings outside the West, particularly in Asia where the teacher-student relationship is often paternalistic. Using a large dataset collected in Hong Kong, this study examines how Chinese students behave in the course evaluation process. Results suggest that Chinese students, like their Western counterparts, are able to distinguish separate dimensions of teaching quality. Due to their cultural background, however, they pay more attention in their evaluations to the personal qualities of their teachers.

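Classroom interpersonal perception, as an important variable affecting students' learning interest, makes it possible to explore action paths for cultivating learning interest starting from the "classroom" field. With 1,048 students in grades 4-6 in the Chongqing area as participants, multilevel structural equation modeling was used to examine, at both the "individual" and "class" levels, the effects of four dimensions of classroom interpersonal perception (perceived teacher involvement, perceived student involvement, perceived teacher-student relationship, and perceived peer collaboration) on students' learning interest. The study found that students' perceptions of teacher involvement and of the teacher-student relationship affected only individual students' learning interest, whereas perceptions of student involvement and of peer collaboration affected not only individual students' learning interest but also that of the class as a group. On this basis, the authors propose a "teach according to the class" model for cultivating learning interest, strengthening the role of class-level factors and shifting the perspective from which the cultivation of learning interest is viewed.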

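Automated Scoring of English Essays and Its Validity, Reliability, and Practicality   Total citations: 2, self-citations: 0, citations by others: 2
This paper reviews automated essay scoring systems in China and abroad and analyzes them in terms of reliability, validity, and practicality in English essay testing. It discusses the development of automated English essay evaluation systems in China; while affirming their strengths, it identifies and analyzes their problems and shortcomings and proposes corresponding countermeasures, with a view to informing the development of such systems in China.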

When good model-data fit is observed, the Many-Facet Rasch (MFR) model acts as a linking and equating model that can be used to estimate student achievement, item difficulties, and rater severity on the same linear continuum. Given sufficient connectivity among the facets, the MFR model provides estimates of student achievement that are equated to control for differences in rater severity. Although several different linking designs are used in practice to establish connectivity, the implications of design differences have not been fully explored. Research on the impact of model-data fit on the quality of MFR model-based adjustments for rater severity is also limited. This study explores the effects of linking designs and model-data fit for raters on the interpretation of student achievement estimates within the context of performance assessments in music. Results indicate that performances cannot be effectively adjusted for rater effects when inadequate linking or model-data fit is present.

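The technical advantages of automated essay scoring systems provide a good platform for innovating English writing instruction. This study designed and put into practice a model of English writing instruction based on an automated essay scoring system, comprising a pre-writing stage, a first-draft and peer-review stage, a revision and automated-evaluation stage, and a classroom-feedback and final-draft stage. A year-long teaching experiment showed that the new model pushes students to write, maintains writing frequency, stimulates students' interest in writing, fosters autonomous writing ability, and improves their English writing proficiency.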

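In the scoring of constructed-response items, rater effects fluctuate across time, occasions, or tasks; that is, rater drift occurs. Drawing on operational data collected on site from essay scoring in a high-stakes, large-scale educational examination, this study uses traditional detection methods to detect possible centrality drift and inaccuracy drift and compares the results of different effect indicators. The results show that, on the writing task examined, raters as a whole showed no obvious rater drift, although a considerable proportion of individuals fluctuated. For centrality drift, the residual-expected correlation (残期相关) and the residual-model correlation (残模相关) gave highly consistent results. For inaccuracy drift, correlation-coefficient-type indicators were not sensitive to improvements in accuracy. Dynamic effects are not simply the sum of static effects: whether a rater drifts does not depend on that rater's static effects, and raters with higher accuracy were relatively less prone to change.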

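The prose of Ye Guo (也果) emphasizes a personal standpoint, stresses the unique significance and value of narration for prose, and draws readily on modern narratological theory and creative techniques. From a narratological perspective, the works mostly use presentational narration to create a particular context; they reject the subject-centered narration of traditional prose and, by moving and switching among multiple narrative perspectives, express the individual's rational scrutiny and genuine discoveries of the everyday world. Ye Guo's prose has forcefully challenged and stretched the narrative dimension of contemporary Yimeng literature, injecting a fresh, unprecedented avant-garde writing consciousness into Yimeng regional literature and even into Qilu literature.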

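Partner assistance for higher education institutions in the western region is an important measure for implementing the national Western Development strategy. It has promoted the regionally coordinated development of higher education in China, raised the quality of western universities, and contributed greatly to talent cultivation in the west. By analyzing the current state of partner assistance among universities in eastern, central, and western China, the paper summarizes and analyzes the notable achievements and successful experience of nine years of this work and offers countermeasures and suggestions for the problems that remain.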

The level (pupil, classroom or school) at which an educational intervention is assigned affects both the kinds of questions which can be answered in evaluation research, and the statistical methods used to answer them. This paper sets out ways of analysing different kinds of designs using multilevel models. It also considers practical issues such as the method used to allocate interventions, leakage, integrity of delivery, and cost, and how these interact with the more technical issues of model specification. These practical issues are illustrated by two recent British intervention studies. Résumé Le niveau - élève, classe ou école - auquel s'adresse une intervention éducative affecte tout à la fois le genre de questions qui peuvent trouver réponse dans la recherche évaluative et les méthodes statistiques pour y répondre. Cet article présente des voies pour analyser différents types de "design" qui utilisent des modèles à plusieurs niveaux. Il traite aussi de certains aspects pratiques tels que la méthode d'allocation des interventions, les pertes, l'intégrité des données et les coûts, ainsi que la manière dont ces différents aspects interagissent avec les questions plus techniques de spécification des modèles. Ces aspects pratiques sont illustrés à l'aide de deux études d'intervention britanniques récentes. (Traduction: Walo Hutmacher, Sociologue, Genève)

This study explored how neighborhood characteristics may relate to African American adolescents' internalizing symptoms via adolescents' social support and perceptions of neighborhood cohesion. Participants included 571 urban, African American adolescents (52% female; M age = 17.8). A multilevel path analysis testing both direct and indirect effects of neighborhood characteristics on adolescents' mental health outcomes was conducted. Higher neighborhood poverty and unemployment rates predicted greater internalizing symptoms via lower cumulative social support and perceptions of neighborhood cohesion. In contrast, higher concentrations of African American and residentially stable residents in one's neighborhood related to fewer internalizing symptoms among adolescent residents via greater cumulative social support and perceptions of neighborhood cohesion. Implications of these findings are discussed.

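Li Qingzhao was a master of ci poetry in China's Song dynasty. In her Ci Lun (《词论》) she put forward the famous thesis that "ci is a genre in its own right" (词别是一家), requiring that ci be composed strictly according to five criteria, namely conformity to musical rules, narrative elaboration, classical gravity, emotional charm, and allusion, none of which may be omitted. Her criteria for writing ci are insightful, but her criticism of her predecessors' ci is overly demanding and unfair, and in some cases even incorrect; indeed, some of her own ci do not strictly follow the standards of Ci Lun.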

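The article first reviews the concepts of reliability and validity and the methods for testing them. On this basis, data collected from computer scoring and expert human scoring were subjected to correlation analysis, reliability testing, repeated-measures analysis of variance, independent-samples t-tests, and qualitative analysis, verifying the reliability and validity of the multi-dimensional scoring system (多元评分系统) from several angles. The results show that the system has good internal consistency and good reliability, although reliability is lower when the proportion of initial scores (初评分) is high. Comparison with expert scores shows that the automated scoring system's results have weak explanatory power for writing ability in two genres, expository and practical writing.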
