Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
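Automated Scoring of Chinese Essays Using Latent Semantic Analysis   Total citations: 3 (self-citations: 0; citations by others: 3)
This study is the first to apply latent semantic analysis (LSA) to the automated computer scoring of Chinese essays. First, 202 senior high school essays were scored by human raters; the correlation between the two raters' scores was 0.62. LSA was then used to evaluate the essays, yielding a content score whose correlation with the human-rated content score reached 0.47. The total score was then regressed on this content score: the regression equation had a coefficient of determination of 0.3, and the correlation between the predicted and human-rated total scores reached 0.55. The study shows that LSA plays an important role in the automated scoring of Chinese essays. Further research needs to identify more indicators and combine them with other methods to improve scoring performance.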
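The two regression figures reported above are consistent with each other: in a regression with a single predictor and a positive slope, the correlation between the fitted values and the observed values equals the square root of the coefficient of determination, and sqrt(0.3) is about 0.55. The sketch below illustrates that identity. The content and total scores are invented for illustration only; they are not the study's data, and the helper names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical LSA content scores and human total scores (not from the study).
content = [0.21, 0.35, 0.40, 0.52, 0.58, 0.63, 0.70, 0.81]
total = [62, 70, 58, 75, 66, 80, 72, 85]

slope, intercept = fit_line(content, total)
fitted = [slope * c + intercept for c in content]

# Coefficient of determination: 1 - residual sum of squares / total sum of squares.
mean_total = sum(total) / len(total)
ss_res = sum((t - f) ** 2 for t, f in zip(total, fitted))
ss_tot = sum((t - mean_total) ** 2 for t in total)
r_squared = 1 - ss_res / ss_tot

# Correlation between regression-predicted and observed totals.
r_fit = pearson(fitted, total)

print(f"R^2 = {r_squared:.3f}, sqrt(R^2) = {math.sqrt(r_squared):.3f}, r(fitted, total) = {r_fit:.3f}")
```

On any data set with a positive slope, the last two printed values coincide, which is why a coefficient of determination of 0.3 goes together with a correlation of roughly 0.55.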

2.
3.
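College English writing instruction in China is gradually introducing online automated writing evaluation systems, represented by 句酷批改网, into teaching. Because such systems place high demands on hardware, software, and teachers' technical skills, and because automated (non-human) evaluation still has some shortcomings, some institutions have many reservations about implementing the reform. This paper sets out the ways in which online automated writing evaluation systems effectively support college English writing instruction and analyzes their broader educational and social significance, in the hope of encouraging more teachers to take part in this teaching reform.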

4.
5.
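The technical strengths of automated essay scoring systems provide a good platform for innovating English writing instruction. This study designed and put into practice a teaching model for English writing based on an automated essay scoring system, comprising four stages: pre-writing; first draft and peer review; revision and automated evaluation; and in-class commentary and final draft. A one-year writing teaching experiment showed that the new model prompts students to write, maintains writing frequency, stimulates students' interest in writing, fosters their ability to write independently, and improves their English writing proficiency.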

6.
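Chen Yun (陈芸). 《鸡西大学学报》, 2012, (10): 102-104
With the development of modern educational technology, automated essay scoring systems are increasingly used in English writing instruction. To assess their effect, a 14-week comparative teaching experiment was carried out in two classes of non-English majors, and a questionnaire surveyed students' views of the system. The results show that a writing teaching model based on an automated essay scoring system does more to improve students' English writing ability than the traditional model. The study also found that students' writing strategies when using the system urgently need improvement and that teacher guidance should not be neglected.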

7.
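Zhao Hui (赵慧), Tang Jianmin (唐建敏). 《教育技术导刊》, 2019, 18(11): 168-171
Developing English writing ability has long been both a focus and a difficulty of college English teaching. Automated Essay Scoring (AES) technology is now widely used, but how to combine it effectively with college English writing instruction still requires further study. Accordingly, based on the current state of college English writing instruction in China and the characteristics of second language (L2) learning, and drawing on an analysis of the principles behind AES technology, this paper examines teaching models for college English writing. The results indicate that college English writing instruction in China needs to combine AES technology with the characteristics of L2 learning and to build an AES-based college English teaching model, so as to stimulate students' interest in learning and improve their English writing ability.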

8.
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K–12 large-scale assessment. In this study, human and automated scores on essays written by college students with and without learning disabilities and/or attention deficit hyperactivity disorder were compared, using a nationwide (U.S.) sample of prospective graduate students taking the revised Graduate Record Examination. The findings are that, on average, human raters and the automated scoring engine assigned similar essay scores for all groups, despite average differences among groups with respect to essay length and spelling errors.

9.
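This study uses experiments to explore the application of artificial-intelligence assessment technology to quality monitoring of human online marking, along with other related applications. The experimental data comprise 841,610 scripts from the Chinese and English essay tasks of the 2017 Anhui Province college entrance examination (gaokao). The machine scores produced by intelligent marking, the first and second human scores produced by human online marking for the gaokao, and the reported scores were analyzed along several dimensions, including mean, standard deviation, correlation, and score agreement rate. Anomalous responses and samples with large score differences identified by intelligent marking were fed back to the subject expert panel for quality-check scoring. The results show that intelligent marking broadly reached a level comparable to that of the human markers; because it always applies a uniform scoring standard, it is more objective and fair and can provide effective quality monitoring for human online marking.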
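The comparison dimensions named in this abstract (mean, standard deviation, correlation, agreement rate, and large-gap samples) can be sketched as follows. This is only an illustration under stated assumptions: the scores are invented, the agreement tolerance of 3 points and the large-gap threshold of 6 points are hypothetical values, and the abstract does not say how the study defined either quantity.

```python
import math
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def agreement_rate(a, b, tolerance):
    """Share of scripts whose two scores differ by at most `tolerance` points."""
    return sum(abs(p - q) <= tolerance for p, q in zip(a, b)) / len(a)

def large_gap_indices(a, b, threshold):
    """Indices of scripts whose two scores differ by more than `threshold` points."""
    return [i for i, (p, q) in enumerate(zip(a, b)) if abs(p - q) > threshold]

# Hypothetical machine and human scores for six scripts (not the study's data).
machine = [42, 48, 35, 50, 44, 39]
human = [44, 47, 36, 38, 45, 40]

summary = {
    "machine_mean": mean(machine),
    "human_mean": mean(human),
    "machine_sd": pstdev(machine),
    "human_sd": pstdev(human),
    "correlation": pearson(machine, human),
    "agreement_rate": agreement_rate(machine, human, tolerance=3),  # assumed tolerance
}
flagged = large_gap_indices(machine, human, threshold=6)  # assumed threshold

print(summary)
print("scripts to send for expert review:", flagged)
```

In this toy data only the fourth script (index 3) is flagged, which mirrors the workflow the abstract describes: large-gap samples are the ones routed to the expert panel.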

10.
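Automated Scoring of English Essays: A Discussion of Its Validity, Reliability, and Practicality   Total citations: 2 (self-citations: 0; citations by others: 2)
This paper reviews automated essay scoring systems in China and abroad and analyzes them in terms of reliability, validity, and practicality in English essay testing. It discusses the development of automated English essay evaluation systems in China and, while acknowledging their strengths, identifies and analyzes their problems and shortcomings and proposes corresponding remedies, with the aim of informing the development of such systems in China.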

11.
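《现代教育技术》, 2019, (2): 66-71
As the informatization of education deepens, automated essay correction systems are maturing. This article proposes an interactive teaching model in which an automated essay correction system supports a college English writing MOOC, and verifies the model's effectiveness. The study shows that the model has the following value: (1) it can raise students' writing proficiency; (2) it improves students' writing strategies and skills; (3) alongside writing proficiency, students' Chinese-to-English translation ability also develops; (4) students' interest in learning English is stimulated and they form good reading habits; (5) students pay more attention to accumulating discourse knowledge. Through this model, the article also emphasizes the important role of writing practice, highlights the interlinked teaching steps in the writing process, and clarifies the teaching priorities and procedures.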

12.
Automated computerized scoring systems (ACSSs) are being increasingly used to analyze text in many educational settings. Nevertheless, the impact of misspelled words (MSW) on scoring accuracy remains to be investigated in many domains, particularly jargon-rich disciplines such as the life sciences. Empirical studies confirm that MSW are a pervasive feature of human-generated text and that despite improvements, spell-check and auto-replace programs continue to be characterized by significant errors. Our study explored four research questions relating to MSW and text-based computer assessments: (1) Do English language learners (ELLs) produce equivalent magnitudes and types of spelling errors as non-ELLs? (2) To what degree do MSW impact concept-specific computer scoring rules? (3) What impact do MSW have on computer scoring accuracy? and (4) Are MSW more likely to impact false-positive or false-negative feedback to students? We found that although ELLs produced twice as many MSW as non-ELLs, MSW were relatively uncommon in our corpora. The MSW in the corpora were found to be important features of the computer scoring models. Although MSW did not significantly or meaningfully impact computer scoring efficacy across nine different computer scoring models, MSW had a greater impact on the scoring algorithms for naïve ideas than key concepts. Linguistic and concept redundancy in student responses explains the weak connection between MSW and scoring accuracy. Lastly, we found that MSW tend to have a greater impact on false-positive feedback. We discuss the implications of these findings for the development of next-generation science assessments.

13.
A framework for evaluation and use of automated scoring of constructed‐response tasks is provided that entails both evaluation of automated scoring and guidelines for implementation and maintenance in the context of constantly evolving technologies. Validity issues and challenges associated with automated scoring are discussed within the framework. The fit between the scoring capability and the assessment purpose, the agreement between human and automated scores, the consideration of associations with independent measures, the generalizability of automated scores as implemented in operational practice across different tasks and test forms, and the impact and consequences for the population and subgroups are proffered as integral evidence supporting use of automated scoring. Specific evaluation guidelines are provided for using automated scoring to complement human scoring for tests used for high‐stakes purposes. These guidelines are intended to be generalizable to new automated scoring systems and as existing systems change over time.

14.
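The article first reviews the concepts of reliability and validity and the methods for testing them. On this basis, the collected computer scores and expert human scores were subjected to correlation analysis, reliability testing, repeated-measures analysis of variance, independent-samples t-tests, and qualitative analysis, so as to verify the reliability and validity of the multi-dimensional scoring system from several angles. The results show that the system has good internal consistency and fairly good reliability, although reliability is lower when the proportion of initial scores is high. A comparison with expert scores shows that the automated scoring results have weak explanatory power for writing ability in two genres: expository writing and practical writing.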

15.
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation. The purpose of this study was to assess the validity, labor costs, and efficiency of comparative judgments as a potential substitute for rubric scoring. An analysis of two essay prompts revealed that comparative judgment measures were comparable to rubric scores at a level similar to that expected of two professional scorers. The comparative judgment measures correlated slightly higher than rubric scores with a multiple-choice writing test. Score reliability exceeding .80 was achieved with approximately nine judgments per response. The average judgment time was 94 seconds, which compared favorably to 119 seconds per rubric score. Practical challenges to future implementation are discussed.

16.
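Principles and Review of Several Automated English Essay Scoring Systems   Total citations: 8 (self-citations: 0; citations by others: 8)
This paper introduces the basic principles of several automated essay scoring systems currently most popular in large-scale testing and English teaching in the United States and briefly reviews them. The systems covered are Project Essay Grader (PEG), Intelligent Essay Assessor (IEA), E-rater and Criterion, IntelliMetric and MY Access!, and Bayesian Essay Test Scoring System (BETSY).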

17.
'Mental models' used by automated scoring for the simulation divisions of the computerized Architect Registration Examination are contrasted with those used by experienced human graders. Candidate solutions (N = 3613) received both automated and human holistic scores. Quantitative analyses suggest high correspondence between automated and human scores, thereby suggesting that similar mental models are implemented. Solutions with discrepancies between automated and human scores were selected for qualitative analysis. The human graders were reconvened to review the human scores and to investigate the source of score discrepancies in light of rationales provided by the automated scoring process. After review, slightly more than half of the score discrepancies were reduced or eliminated. Six sources of discrepancy between original human scores and automated scores were identified: subjective criteria; objective criteria; tolerances/weighting; details; examinee task interpretation; and unjustified. The tendency of the human graders to be compelled by automated score rationales varied by the nature of original score discrepancy. We determine that, while the automated scores are based on a mental model consistent with that of expert graders, there remain some important differences, both intentional and incidental, which distinguish between human and automated scoring. We conclude that automated scoring has the potential to enhance the validity evidence of scores in addition to improving efficiency.

18.
Research on Automated Essay Scoring has become increasingly important because it serves as a method for evaluating students’ written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments resulting in the need to evaluate large numbers of written-response assessments. The purpose of this study is to describe and evaluate three active learning methods that can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern Automated Essay Scoring system. The three active learning methods are the uncertainty-based, the topological-based, and the hybrid method. These three methods were used to select essays included in the Automated Student Assessment Prize competition that were then classified using a scoring model that was trained with the bidirectional encoder representations from a transformer language model. All three active learning methods produced strong results, with the topological-based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another.
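As a rough illustration of the uncertainty-based idea mentioned in this abstract, the sketch below ranks unlabeled essays by the entropy of a model's predicted score-class probabilities and picks the most uncertain ones for human scoring. The abstract does not say which uncertainty measure the study used, so entropy is an assumption here, and the essay identifiers and probabilities are invented.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(prob_table, k):
    """Return the k essay ids whose predicted distributions have the highest entropy."""
    ranked = sorted(prob_table, key=lambda essay_id: entropy(prob_table[essay_id]), reverse=True)
    return ranked[:k]

# Hypothetical predicted probabilities over three score classes for unlabeled essays.
predictions = {
    "essay_a": [0.90, 0.05, 0.05],  # model is confident
    "essay_b": [0.40, 0.35, 0.25],
    "essay_c": [0.34, 0.33, 0.33],  # model is nearly undecided
    "essay_d": [0.70, 0.20, 0.10],
}

# Essays to route to human raters in the next labeling round.
to_label = select_most_uncertain(predictions, k=2)
print(to_label)
```

In a full active learning loop the selected essays would be scored by humans, added to the training set, and the model retrained before the next selection round; that loop is omitted here.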

19.
Scientific argumentation is one of the core practices for teachers to implement in science classrooms. We developed a computer-based formative assessment to support students’ construction and revision of scientific arguments. The assessment is built upon automated scoring of students’ arguments and provides feedback to students and teachers. Preliminary validity evidence was collected in this study to support the use of automated scoring in this formative assessment. The results showed satisfactory psychometric properties related to this formative assessment. The automated scores showed satisfactory agreement with human scores, but small discrepancies still existed. Automated scores and feedback encouraged students to revise their answers. Students’ scientific argumentation skills improved during the revision process. These findings provided preliminary evidence to support the use of automated scoring in the formative assessment to diagnose and enhance students’ argumentation skills in the context of climate change in secondary school science classrooms.

20.
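Using the TF-IDF algorithm and the idea of cosine similarity, this paper proposes a scoring system for large batches of English essays. Starting from the goal of improving the efficiency of scoring large batches of English essays, it first describes the current state of English text processing and of automated machine scoring. It then gives a detailed account of how machine scoring is implemented. Finally, it compares the machine scores with human scores to verify the feasibility of machine scoring, evaluates its strengths and weaknesses, and looks ahead to its future development.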
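A minimal sketch of the two ingredients named in this abstract, TF-IDF weighting and cosine similarity, is given below. It is not the paper's system: the whitespace tokenizer, the smoothed IDF formula, the sample texts, and the idea of comparing each essay with a reference essay are all assumptions made for illustration, since the abstract does not describe how similarity is turned into a score.

```python
import math
from collections import Counter

def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace (an assumption)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical reference essay and two student essays.
reference = "the city should invest in public transport to reduce traffic"
on_topic = "public transport can reduce traffic in the city"
off_topic = "my favourite holiday was a trip with friends"

vectors = tfidf_vectors([reference, on_topic, off_topic])
sim_on = cosine(vectors[0], vectors[1])
sim_off = cosine(vectors[0], vectors[2])
print(f"on-topic similarity: {sim_on:.3f}, off-topic similarity: {sim_off:.3f}")
```

The on-topic essay shares weighted terms with the reference and gets a positive similarity, while the off-topic essay shares none and gets zero. A real system would still need a mapping from similarity to a score scale and validation against human scores, as the abstract notes.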
