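Scoring criteria are critical in writing tests, and the scoring method used affects raters' scoring behavior. The study shows that although both the holistic and the analytic method of scoring English writing are reliable, rater severity and examinees' writing scores change considerably between the two. Overall, under holistic scoring, raters' severity tends to converge and approaches the ideal value; under analytic scoring, examinees' writing scores are higher, and raters' severity also differs significantly. Holistic scoring is therefore preferred in high-stakes examinations that decide examinees' futures.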

Raters of Georgia's (USA) state-mandated college-level writing exam, which is intended to ensure a minimal university-level writing competency, are trained to grade holistically when assessing these exams. A guiding principle in holistic grading is to not focus exclusively on any one aspect of writing but rather to give equal weight to style, vocabulary, mechanics, content, and development. This study details how raters react to “errors” typical of African American English writers, of ESL writers, and of standard American English writers. Using a log-linear model to generate odds ratios for comparison of essays with these error types, results indicate linguistic discrimination against African American “errors” and a leniency for ESL errors in writing assessment.

This study investigates how experienced and inexperienced raters score essays written by ESL students on two different prompts. The quantitative analysis using multi-faceted Rasch measurement, which provides measurements of rater severity and consistency, showed that the inexperienced raters were more severe than the experienced raters on one prompt but not on the other prompt, and that differences between the two groups of raters were eliminated following rater training. The qualitative analysis, which consisted of analysis of raters' think-aloud protocols while scoring essays, provided insights into reasons for these differences. Differences were related to the ease with which the scoring rubric could be applied to the two prompts and to differences in how the two groups of raters perceived the appropriateness of the prompts.

Many states are implementing direct writing assessments to assess student achievement. While much literature has investigated minimizing raters' effects on writing scores, little attention has been given to the type of model used to prepare raters to score direct writing assessments. This study reports on an investigation that occurred in a state-mandated writing program when a scoring anomaly became apparent once assessments were put in operation. The study indicates that using a spiral model for training raters and scoring papers results in higher mean ratings than does using a sequential model for training and scoring. Findings suggest that making decisions about cut-scores based on pilot data has important implications for program implementation.

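Subjective (constructed-response) items are an important component of language testing. They can compensate for the shortcomings of standardized items, but their scoring depends on raters' subjective impressions, which leads to instability within individual raters and differences between raters. Drawing on the three major measurement theories and on computer-assisted scoring can improve the quality of subjective-item scoring and make it more accurate and valid.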

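In the conventional analysis and study of classical Chinese literary works, we usually examine the history and era in which the authors lived, their life circumstances, and the character they formed under these influences. Different people develop different creative styles under different external forces, so even when two authors write on similar subject matter, they give readers different aesthetic experiences. Taking one poem each by Jia Dao and Li Shangyin as examples, this paper uses comparative analysis to highlight the heterogeneous beauty that the same subject acquires under different poets' aesthetic perception and artistic expression.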

Using generalizability (G-) theory and rater interviews as research methods, this study examined the impact of the current scoring system of the CET-4 (College English Test Band 4, a high-stakes national standardized EFL assessment in China) writing on its score variability and reliability. One hundred and twenty CET-4 essays written by 60 non-English major undergraduate students at one Chinese university were scored holistically by 35 experienced CET-4 raters using the authentic CET-4 scoring rubric. Ten purposively selected raters were further interviewed for their views on how the current scoring system could impact its score variability and reliability. The G-theory results indicated that the current single-task and single-rater holistic scoring system would not be able to yield acceptable generalizability and dependability coefficients. The rater interview results supported the quantitative findings. Important implications for the CET-4 writing assessment policy in China are discussed.

We examined how raters and tasks influence measurement error in writing evaluation and how many raters and tasks are needed to reach desirable reliability levels of .90 and .80 for children in Grades 3 and 4. A total of 211 children (102 boys) were administered three tasks in narrative and expository genres, respectively, and their written compositions were evaluated using methods widely used for developing writers: holistic scoring, productivity, and curriculum-based writing scores. Results showed that 54 and 52% of variance in narrative and expository compositions was attributable to true individual differences in writing. Students’ scores varied largely by tasks (30.44 and 28.61% of variance), but not by raters. To reach the reliability of .90, multiple tasks and raters were needed, and for the reliability of .80, a single rater and multiple tasks were needed. These findings offer important implications about reliably evaluating children’s writing skills, given that writing is typically evaluated by a single task and a single rater in classrooms and even in some state accountability systems.
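
The question of how many tasks and raters are needed for a target reliability is the kind of question a generalizability-theory decision study answers. The sketch below is only an illustration under stated assumptions: it assumes a fully crossed person x task x rater design and uses the standard relative G coefficient; the function names and the variance components are hypothetical. Only the 0.54 person share echoes the abstract, and the other three values are made up for the example, so the numbers it produces are not the study's results.

```python
def g_coefficient(var_p, var_pt, var_pr, var_res, n_tasks, n_raters):
    """Relative G coefficient for a fully crossed person x task x rater design."""
    # Relative error variance: each interaction component is divided by the
    # number of conditions it is averaged over in the decision study.
    rel_error = (var_pt / n_tasks
                 + var_pr / n_raters
                 + var_res / (n_tasks * n_raters))
    return var_p / (var_p + rel_error)


def min_tasks(target, n_raters, components, max_tasks=50):
    """Smallest number of tasks whose G coefficient reaches `target`, else None."""
    for n_tasks in range(1, max_tasks + 1):
        if g_coefficient(*components, n_tasks, n_raters) >= target:
            return n_tasks
    return None


# HYPOTHETICAL variance components, as proportions of total variance, in the
# order: person, person x task, person x rater, residual. Illustrative only.
COMPONENTS = (0.54, 0.30, 0.01, 0.15)

if __name__ == "__main__":
    for raters in (1, 2):
        print("raters:", raters,
              "tasks for .80:", min_tasks(0.80, raters, COMPONENTS))
```

With these made-up components, adding tasks raises the coefficient much faster than adding raters, which is the qualitative pattern the abstract describes when task-related variance is large and rater-related variance is small.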

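For teaching writing in English as a second language (ESL), a "process-centered" approach may be more effective than a "product-centered" one. As a method of writing instruction, the "process approach" focuses on the process of writing; in practice it therefore pays particular attention to the different stages of writing and provides a wide variety of practice activities for each stage, so that students produce more meaningful work. However, we should not reduce the "process approach" to a "recipe" of prescribed techniques and routines. Instead, we should create an effective environment for learning to write, in which students not only feel relaxed and happy about writing but can also explore independently and cultivate individualized ways of writing.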

The purpose of this study was to examine the quality assurance issues of a national English writing assessment in Chinese higher education. Specifically, using generalizability theory and rater interviews, this study examined how the current scoring policy of the TEM-4 (Test for English Majors – Band 4, a high-stakes national standardized EFL assessment in China) writing could impact its score variability and reliability. Eighteen argumentative essays written by nine English major undergraduate students were selected as the writing samples. Ten TEM-4 raters were first invited to use the authentic TEM-4 writing scoring rubric to score these essays holistically and analytically (with time intervals in between). They were then interviewed for their views on how the current scoring policy of the TEM-4 writing assessment could affect its overall quality. The quantitative generalizability theory results of this study suggested that the current scoring policy would not yield acceptable reliability coefficients. The qualitative results supported the generalizability theory findings. Policy implications for quality improvement of the TEM-4 writing assessment in China are discussed.


The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are used to calibrate and test AES systems; (b) which human raters provided the scores on these essays; and (c) given that multiple human raters are generally used for this purpose, which human scores should ultimately be used when there are score disagreements? This article provides commentary on the first two questions and an empirical investigation into the third question. The authors suggest that addressing these three questions strengthens the scoring component of the validity argument for any assessment that includes AES scoring.

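The Huainanzi holds that the True Person (zhenren), the Perfected Person (zhiren), and the Sage (shengren) are the ideal personalities of the early, middle, and late ages, respectively. These ideal personalities are products of different eras and each has its own characteristics, but they share common ground: all purify the mind and curb desire, practice quietude and non-action, and attach importance to nurturing the mind and one's nature. The personality traits and spiritual pursuits of the True Person, the Perfected Person, and especially the Sage are of great significance for clean governance among today's officials and for building a harmonious society.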

The purpose of the study is to investigate differences in rating behavior between Korean and native English speaking (NES) raters. Five Korean English teachers and five NES teachers graded 420 essays written by Korean college freshmen and completed survey questionnaires. The grading data were analyzed with the FACETS program. The results revealed Korean raters’ inferiority in measuring linguistic components. Furthermore, the Korean raters were more severe in scoring grammar, sentence structure, and organization, whereas the NES raters were stricter toward content and overall scores. In addition, analysis of the raters’ survey responses showed that the Korean (NNS) raters were divided between content and grammar as the most difficult feature to grade, while all NES raters considered content the most difficult. Based on these research findings, future research suggestions and implications are discussed.

The decision-making behaviors of 8 raters when scoring 39 persuasive and 39 narrative essays written by second language learners were examined, first using Rasch analysis and then through think-aloud protocols. Results based on Rasch analysis and think-aloud protocols recorded by raters as they were scoring holistically and analytically suggested that rater background may have contributed to rater expectations that might explain individual differences in the application of the performance criteria of the rubrics when rating essays. The results further suggested that rater ego engagement with the text and/or author may have helped mitigate rater severity and that self-monitoring behaviors by raters may have had a similar mitigating effect.

This study examined the nature and frequency of error in high school native English speaker (L1) and English learner (L2) writing. Four main research questions were addressed: Are there significant differences in students’ error rates in English language arts (ELA) and social studies? Do the most common errors made by students differ in ELA and social studies? Are there significant differences in the error rates between L1 and L2 students in ELA? Do L1 and L2 students differ in how frequently they make the most common errors in ELA? Written work of 10th and 12th grade students in five states was collected. The sample included 178 essays (120 in ELA and 58 in social studies) from 67 students (33 10th graders and 34 12th graders; 49 native English speaking students and 18 English learners). Results indicate that there were significant differences in the frequencies of errors between ELA and social studies, with higher error rates in social studies. In addition, L2 writers had significantly higher error rates than L1 writers in ELA. Aside from a few types of errors (spelling, capitalization, and some punctuation errors), most types of errors appear relatively infrequently in school-sponsored writing. Moreover, the eight most common errors accounted for a little more than half of all errors, and these did not differ significantly between ELA and social studies writing or between L1 and L2 writers.

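Since the start of the new century, few prose works have attracted attention, and prose writing has gradually become bland. The main reasons are that literary theorists have become confused in their understanding of prose, its development has lost direction, and most prose writers do not dare to engage with social life and do not know why they write prose. Only when these problems are resolved can prose writing in the new century flourish again.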

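With the arrival of the knowledge-economy era, changes in the concept of prose, the opening up of prose as an art, and the infiltration of all kinds of contemporary artistic information, the aesthetic style of prose has shifted from a single model to a plurality of styles. In the shift from writers' individualized writing to socialized writing, writers' awareness of themselves as aesthetic subjects has grown stronger; different writers pursue and explore different aesthetic arts, producing a rich and colorful range of prose aesthetic styles, which show mainly in the style of language. How can the language of prose embody its own distinctive stylistic beauty? Three things are needed: first, improve one's command of language and strive for individualized language; second, stay close to life and to the writer's true character and individuality, and strive for natural beauty in language; third, avoid the familiar in favor of the fresh, and strive for defamiliarized language.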

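Prose is the most open of literary forms. It emphasizes the direct expression of individual life experience and, in a short and pithy form, conveys a breadth of feeling that can move people. Good prose has two defining spiritual qualities. First, the writer must ultimately form a style of their own that is clearly distinguishable from others', such as Shi Tiesheng's struggle against life and fate, or Li Zehou's style of implicit critique of history. Second, the writer must attend to society from a groundbreaking perspective, such as Zhou Guoping's expression of the "self" and Jia Pingwa's attention to the urban-rural divide and to conflicts among social and ethical roles. These two qualities are also two methods of writing; only by drawing on their own life experience, applying both flexibly, and sustaining them consistently can prose writers produce fine prose.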
