Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows scores to be returned faster at lower cost. In the module, they discuss automated scoring from a number of perspectives. First, they discuss benefits and weaknesses of automated scoring, and what psychometricians should know about automated scoring. Next, they describe the overall process of automated scoring, moving from data collection to engine training to operational scoring. Then, they describe how automated scoring systems work, including the basic functions around score prediction as well as other flagging methods. Finally, they conclude with a discussion of the specific validity demands around automated scoring and how they align with the larger validity demands around test scores. Two data activities are provided. The first is an interactive activity that allows the user to train and evaluate a simple automated scoring engine. The second is a worked example that examines the impact of rater error on test scores. The digital module contains a link to an interactive web application as well as its R-Shiny code, diagnostic quiz questions, activities, curated resources, and a glossary.

2.
Applied Measurement in Education, 2013, 26(3): 281-299
The growing use of computers for test delivery, along with increased interest in performance assessments, has motivated test developers to develop automated systems for scoring complex constructed-response assessment formats. In this article, we add to the available information describing the performance of such automated scoring systems by reporting on generalizability analyses of expert ratings and computer-produced scores for a computer-delivered performance assessment of physicians' patient management skills. Two different automated scoring systems were examined. These automated systems produced scores that were approximately as generalizable as those produced by expert raters. Additional analyses also suggested that the traits assessed by the expert raters and the automated scoring systems were highly related (i.e., true correlations between test forms, across scoring methods, were approximately 1.0). In the appendix, we discuss methods for estimating this correlation, using ratings and scores produced by an automated system from a single test form.

3.
A framework for evaluation and use of automated scoring of constructed‐response tasks is provided that entails both evaluation of automated scoring as well as guidelines for implementation and maintenance in the context of constantly evolving technologies. Consideration of validity issues and challenges associated with automated scoring are discussed within the framework. The fit between the scoring capability and the assessment purpose, the agreement between human and automated scores, the consideration of associations with independent measures, the generalizability of automated scores as implemented in operational practice across different tasks and test forms, and the impact and consequences for the population and subgroups are proffered as integral evidence supporting use of automated scoring. Specific evaluation guidelines are provided for using automated scoring to complement human scoring for tests used for high‐stakes purposes. These guidelines are intended to be generalizable to new automated scoring systems and as existing systems change over time.

4.
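The technical strengths of automated essay scoring systems provide a good platform for innovating and reforming models of English writing instruction. This study designed and put into classroom practice an English writing teaching model based on an automated essay scoring system, comprising a pre-writing stage, a first-draft and peer-review stage, a revision and automated-scoring stage, and an in-class feedback and final-draft stage. A year-long writing teaching experiment showed that the new model pushes students to write, sustains their writing frequency, stimulates their interest in writing, fosters autonomous writing ability, and improves their English writing proficiency.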

5.
Performance assessments are typically scored by having experts rate individual performances. The cost associated with using expert raters may represent a serious limitation in many large-scale testing programs. The use of raters may also introduce an additional source of error into the assessment. These limitations have motivated development of automated scoring systems for performance assessments. Preliminary research has shown these systems to have application across a variety of tasks ranging from simple mathematics to architectural problem solving. This study extends research on automated scoring by comparing alternative automated systems for scoring a computer simulation test of physicians' patient management skills; one system uses regression-derived weights for components of the performance, the other uses complex rules to map performances into score levels. The procedures are evaluated by comparing the resulting scores to expert ratings of the same performances.

6.
Applied Measurement in Education, 2013, 26(4): 413-432
With the increasing use of automated scoring systems in high-stakes testing, it has become essential that test developers assess the validity of the inferences based on scores produced by these systems. In this article, we attempt to place the issues associated with computer-automated scoring within the context of current validity theory. Although it is assumed that the criteria appropriate for evaluating the validity of score interpretations are the same for tests using automated scoring procedures as for other assessments, different aspects of the validity argument may require emphasis as a function of the scoring procedure. We begin the article with a taxonomy of automated scoring procedures. The presentation of this taxonomy provides a framework for discussing threats to validity that may take on increased importance for specific approaches to automated scoring. We then present a general discussion of the process by which test-based inferences are validated, followed by a discussion of the special issues that must be considered when scoring is done by computer.

7.
8.
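Over the past fifty years, a number of automated scoring systems for English essays have been developed in China and abroad, and research in this area has steadily matured. In the field of translation, automated scoring research has been largely confined to the evaluation of machine translation, while research on automated scoring of human translations remains at an early stage. In recent years, automated scoring models for Chinese-to-English translation by Chinese students have been built in China, and research on automated scoring of English-to-Chinese translation has also begun. Because English-to-Chinese translations by Chinese students have their own characteristics, the scoring system differs from existing research in aspects such as variable mining and model validation.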

9.
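Corrective feedback is a prominent topic in second language acquisition research. It not only makes learners aware of the gap between their own second language proficiency and the target language, but also gives them opportunities to revise their language output and thereby improve their proficiency. This study conducted a sample survey of the attitudes of non-English majors at the authors' institution toward teachers' corrective feedback in oral English teaching, and analyzed how individual student factors influence the formation of these attitudes, in order to help teachers use corrective feedback more reasonably and effectively in oral teaching.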

10.
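Computer automated scoring (CAS), applied to scoring translation tests in foreign-language courses of the self-taught higher education examination, can effectively improve scoring efficiency and objectivity. This study ran correlation analysis and paired-samples t-tests on computer automated scores and human scores for translation test responses from 72 self-study examination candidates, and compared the diagnostic results of the two scoring methods. The study found that computer automated scores were highly correlated with human scores, and that total translation test scores did not differ significantly between the two methods; overall, the automated scoring results for this translation test were reliable. However, the two methods differed to some extent in their diagnoses of the structure of the learners' translation ability.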

11.
Applied Measurement in Education, 2013, 26(2): 151-169
The use of automated scanning of test sheets, beginning in the 1930s, led to widespread use of the multiple-choice format in standardized testing. New forms of automated scoring now hold out the possibility of making a wide range of constructed-response item formats feasible for use on a large-scale basis. We describe new developments in five domains: mathematical reasoning, algebra problem solving, computer science, architecture, and natural language. For each one, we describe the task as presented to the examinee, the methods used to score the response, and the psychometric properties of the item responses. We then highlight general challenges and issues spanning these technologies. We conclude by offering our views on the ways in which such technologies are likely to shape the future of testing.

12.
This article employs the Common European Framework Reference for Language Acquisition (CEFR) as a basis for evaluating writing in the context of machine scoring. The CEFR was designed as a framework for evaluating proficiency levels of speaking for the 49 languages comprising the European Union. The intent was to impact language instruction so that “mastery” of one language has the same meaning as it does in another. A second objective is to provide a crosswalk for what one automated writing evaluation (AWE) system does in attending to the dimensions of the framework. The CEFR Framework is divided into five traits and different proficiency levels. The question then becomes: Does the AWE system attempt to measure these dimensions of writing? And, if so, how is this operationalized? Is it measuring aspects of communication that are not specified? The goal here is to create a common vocabulary between the writing community and those interested in AWE systems as to what is actually being measured by their software, and mapping that to a developmental scale of writing performance.

13.
This article discusses a recent longitudinal study of four Vietnamese‐speaking 4‐year‐olds' acquisition of English as a second language in a bilingual preschool over 1 year. The research examines the learners' English language output in interaction between the teacher and peers and identifies the key factors which influenced their development of English. A major feature of the learners' data was their dominant use of single words rather than reliance on chunked language. These single words provided the basis for later development of more complex utterances. Commonalities and substantial differences were documented between the learners in terms of the amount of English that was produced, the learners' approaches to interaction and their development of English.

14.
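Language transfer occurs in the process of second language acquisition; Chinese dialects therefore exert a transfer effect on the learning of English pronunciation, including the influence of initials, finals, and Chinese syllable-combination rules. Taking the Jinan, Chengdu, and Pingxiang dialects as examples, this paper compares their phonological systems with that of English, explores how the transfer effect of Chinese dialects on English pronunciation manifests itself, and offers suggestions for improving English pronunciation.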

15.
Cultura y Educación, 2013, 25(2): 141-156
In this article the authors raise their concern about the way in which students with different levels of competence in the language of instruction (English in this case) acquire a second language. They also address the issue of the types of conditions which are necessary in order for dialogic interaction to be effective in the classroom when these students are acquiring a second language.

In the first section, a review of the different theories on language learning is carried out, which helps to provide a response to the questions which have been raised. In the second section, the need to create opportunities to use the second language in real and meaningful situations is also raised, based on the results of three case studies. Learning through dialogue is presented as being a much more effective approach to teaching and learning a second language than traditional approaches, which were based on the teacher merely providing instruction and subsequently developing the information.

16.
Higher Education Policy, 2001, 14(4): 293-312
This analysis, based on open interviews with full-time students in Hong Kong universities, focuses on the language of instruction. A mismatch was apparent between espoused theory or policy and theory in use, or practice, with respect to English as the medium of instruction, English as a second language and improvement of English standard at university. Student comments suggested that the discrepancies between policy and practice led to declining standards due to the lack of opportunities for practice. Bringing policy in line with practice is recommended, either by adopting first language instruction or by creating zones or occasions where it is expected that English will be used.

17.
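Taking 135 English majors at higher vocational colleges as subjects, this paper uses a questionnaire survey to investigate their use of English writing strategies. The study shows that compensation strategies and memory strategies are used most frequently; that participants of different grades and genders show no significant differences in writing strategy use; that the high-scoring group differs significantly from the medium- and low-scoring groups, although frequencies for all groups fall within the "moderate use" range; and that only metacognitive writing strategies enter the regression equation, with some predictive power for English achievement. The English writing strategies of higher vocational English majors are therefore in urgent need of improvement.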

18.
Sound symbolism is the notion that there is a subset of words in the world’s languages for which sounds and their symbols have some degree of correspondence. Two studies assessed 5th and 6th graders’ knowledge of word meanings for English sound symbolic and non-sound symbolic words. Both studies found that the meanings of sound symbolic words were guessed more often than those for non-sound symbolic words. Study 1 found this for words presented in isolation and for both native speakers of English and those learning English as a second language. Study 1 also found that there was no difference in the ability to use sound symbolic word information between these two participant groups. Study 2 found superior performance on sound symbolic words presented both in isolation and in context and found that the combination of these two types of information yielded greater word learning than either alone. We conclude that sound symbolism is a word property which influences the learning of unknown words.

19.
Early childhood classrooms characterized by a predominance of second language learners from a wide mix of language backgrounds have emerged in unprecedented numbers on the American urban scene, lending urgency to the question: What happens when such diverse language learners are increasingly each others' only available peer resource for language learning in the classroom? This article highlights peer support for “getting into English” in one such setting: an ESL kindergarten where the children came from eight different language backgrounds and the teacher was the only native English speaker. Over 6 months of participant observation using qualitative, sociolinguistic methodology, the researcher documented the range of contexts in which case study children came to use English, and the efforts they and their teacher made to understand each other and be understood. In informal contexts for peer talk, children of like and different backgrounds served as resources for each other's use of English. They helped each other begin to use English among a broadening network of peers, for an expanding variety of purposes, and, at least to some extent, with more precision. Moreover, they pushed each other to elaborate and clarify their English. A model of peer collaboration is explored to take into account how children's evolving social relationships served as an impetus for talk in English. Implications for research and practice are discussed.

20.
In this article I link 3 areas that have recently attracted (renewed) interest in second language acquisition (SLA) and applied linguistics research: (a) first language (L1) use in adult foreign language study; (b) adult second language (L2) play; and (c) adult language learner identity. In mainstream approaches to SLA and utilitarian conceptualizations of foreign language learning, L1 use typically is considered detrimental to L2 acquisition; L2 play is a superfluous activity that detracts from the serious business of language learning; analyst-sensitive examinations of learner-internal mechanisms in the process of SLA are emphasized over learner-sensitive studies of the language learner's identity in sociocultural context. Recent work, however, has brought new evaluations of these 3 areas. Antón and DiCamilla (1998) have found that L1 use may function as an advantageous metacognitive tool in L2 acquisition. Both Lantolf (1997, 2000) and Tarone (2000) have recently advanced theories of form-based L2 play in which play functions in the acquisition of L2 forms. Norton (2000) has investigated the importance of learners' social identity in L2 learning and use in the context of Canadian immigration. In this article, I suggest a new role for adult form-based L2 play in SLA theorizing. This type of play may not only aid in the acquisition of L2 forms but may also serve as a textual icon for learners' growing multicompetence (i.e., the distinct state of mind with 2 or more grammars; V. Cook, 1991, 1992). Multicompetence may have meaning for learners' sense of self and their ways of interacting with the world. Multilingual form-based play with language names (e.g., "English" and "German") and syntax in the written texts of advanced tutored learners of German are examined.
