
1.

Using summative and formative assessments to evaluate EFL teachers’ teaching performance

Wei Wei 《Assessment & Evaluation in Higher Education》2015,40(4):611-623

Using classroom observations (formative) and student course experience survey results (summative) to evaluate English lecturers’ teaching performances is not new in practice, but surprisingly only a few studies have investigated this issue in a higher education context. This study was conducted in an English department of a large university in Vietnam. The data include: (1) semi-structured interviews with all the full-time lecturers, (2) two department heads and (3) course experience surveys from English as a foreign language (EFL) students (N = 2886). Three lessons can be learned: (1) formative assessments do not seem to have an effect on promoting better teaching practices when their feedback is not helpful in improving high-stakes summative assessment results, (2) without sharing a common definition of good teaching practices among assessors, summative assessments appear to make the feedback from formative assessments less meaningful and applicable, and (3) as a result, the combination of formative and summative assessments tends to make EFL lecturers’ self-assessment practices less effective.

2.

Sean Kearney Timothy Perkins Shannon Kennedy-Clark 《Assessment & Evaluation in Higher Education》2016,41(6):840-853

The purpose of this paper is to provide a proof of concept of a collaborative peer-, self- and lecturer assessment process. The research presented here is part of an ongoing study on self- and peer assessments in higher education. The authentic assessment for sustainable learning (AASL) model is evaluated in terms of the correlations between sets of marks. The article provides an explanation of the assessment process, and analyses sets of marks as a means of justifying the validity of the process. The results suggest that students, even those with no prior experience in peer- or self-evaluation, in their first year of tertiary study, under the right conditions, are able to accurately judge their own work and make reasonably accurate judgements of the work of their peers. While previous studies have expounded the benefits of self- and peer assessments in tertiary study, undertaking a prescribed process, such as AASL, has a further implication in allowing others to replicate the process with reasonable assuredness of the validity of the process across various fields of study.

3.

Suzanne Lane 《Educational Measurement》2004,23(3):6-14

The validity of high-stakes assessments and accountability systems is discussed in relation to the requirements of No Child Left Behind (NCLB). The extent to which content standards and assessments are cognitively rich, the challenges in setting performance standards, and the impact of high-stakes assessments on instruction and student learning are addressed. The article argues for quality content standards, cognitively rich assessments, and a cohesive, balanced assessment system.

4.

Validating performance assessments: measures that may help to evaluate students’ expertise in ‘doing science’

Pitt Hild Christoph Gut Maja Brückmann 《Research in Science & Technological Education》2013,31(4):419-445


5.

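A content validity analysis of reading comprehension items in the national papers and the independently set provincial and municipal papers of the gaokao (National College Entrance Examination) English test. Total citations: 3, self-citations: 0, citations by others: 3

辜向东 王秋艳 《考试研究》2008,(3):102-114

Drawing on validity theory, and with reference to the Full-time Senior High School English Teaching Syllabus (Trial Revised Edition), the General Senior High School English Curriculum Standards (Experimental) and the gaokao English examination syllabuses for 2004-2006, this paper analyses the content validity of the reading comprehension items in the gaokao English papers from those three years. The results show that the reading passages are fairly varied in genre and topic. Expository texts predominate among the genres, and some papers also include practical and descriptive texts. Society and culture is the leading topic area, although its share is declining; popular science comes next, and its share has risen slightly. Reading load and the number of new words are broadly in line with syllabus requirements. Among the skills tested, simple inference accounts for the largest proportion. The analysis also found problems with some of the text selection and item design, and the paper accordingly makes recommendations for writing reading comprehension items and for senior high school English teaching.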

6.

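On reliability, validity and achievement testing

包威 《黑龙江教育学院学报》2010,29(8):29-30

Reliability and validity are the two quality characteristics of achievement tests, and how to handle the relationship between them is a fundamental question in testing. After introducing the definitions of reliability and validity and the relationship between them, the paper analyses reliability and validity in achievement testing and explains how to balance the two. It concludes that achievement testing is an effective means of measurement and is bound to improve teaching quality.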

7.

Mean Effects of Test Accommodations for ELLs and Non‐ELLs: A Meta‐Analysis of Experimental Studies

Maria Pennock‐Roman Charlene Rivera 《Educational Measurement》2011,30(3):10-28

The objective was to examine the impact of different types of accommodations on performance in content tests such as mathematics. The meta‐analysis included 14 U.S. studies that randomly assigned school‐aged English language learners (ELLs) to test accommodation versus control conditions or used repeated measures in counter‐balanced order. Individual effect sizes (Glass's d) were calculated for 50 groups of ELLs and 32 groups of non‐ELLs. Individual effect sizes for English language and native language accommodations were classified into groups according to type of accommodation and timing conditions. Means and standard errors were calculated for each category. The findings suggest that accommodations that require extra printed materials need generous time limits for both the accommodated and unaccommodated groups to ensure that they are effective, equivalent in scale to the original test, and therefore more valid owing to reduced construct‐irrelevant variance. Computer‐administered glossaries were effective even when time limits were restricted. Although the Plain English accommodation had very small average effect sizes, inspection of individual effect sizes suggests that it may be much more effective for ELLs at intermediate levels of English language proficiency. For Spanish‐speaking students with low proficiency in English, the Spanish test version had the highest individual effect size (+1.45).

8.

Inga Arffman 《Educational Measurement》2013,32(2):2-14

The article reviews research and findings on problems and issues faced when translating international academic achievement tests. The purpose is to draw attention to the problems, to help to develop the procedures followed when translating the tests, and to provide suggestions for further research. The problems concentrate on the following: the unique and demanding purpose of the translation task, the partly contradictory task specifications and translation instructions, the indecision as to whether to produce one or two target versions, the indecision as to whether to use one or two source versions, inadequate revision and verification, deficient translator competences, and a lack of time. To solve the problems, the article suggests the following: ensuring that the translation guidelines provide a right, unequivocal, and balanced picture of the purpose of the translation task; ensuring the equivalence of the two source versions; putting more emphasis on revision, and ensuring that the verification is sufficiently thorough; using only qualified translators, providing them with training in test translation, and including also subject matter and testing specialists in the translation teams; and allotting sufficient time to the translation work. However, the main lesson from the review is that more research in the field is badly needed.

9.

《Journal of Further & Higher Education》2012,36(1):57-65

This article examines the attempt of one college (Orpington College in Kent) to increase the participation in further education of adult students who would not normally have benefited from post-school education and/or training; effectively non-traditional learners. The college targeted a specific and defined community, the residents of which had, prior to this initiative, limited access to continuing education. This example is by no means a unique one, but it is unusual and reflects what is best termed a community-based approach to widening participation, harnessing the support of a wide range of community, statutory and voluntary groups to ensure its success.

10.

Caroline Wakefield James Adie Edd Pitt Tessa Owens 《Assessment & Evaluation in Higher Education》2014,39(2):253-262

Owing to the increasing diversity of assessments in higher education, feedback should be provided to students in a format that can assist future and alternative work. This study aimed to assess the effectiveness of the Essay Feedback Checklist on future alternative assessments. Participants were assigned to one of two groups, one of which completed the checklist prior to assessment 1 (essay) and received feedback using this method. Attainment on assessment 1 and assessment 2 (examination) were taken as pre- and post-test scores. Results revealed increased assessment scores for the checklist group, compared to those who received conventional feedback. Focus group data indicated that students particularly liked elements of the checklist as a feedback method, but potential drawbacks were also highlighted. Implications and future use of the checklist are then discussed.

11.

Edith Kealey 《Journal of Teaching in Social Work》2013,33(1):64-74

Despite a wealth of tacit knowledge in academia regarding effective teaching strategies and a rich theoretical and empirical knowledge base on student learning, social work instructors wishing to identify appropriate ways to measure teaching and learning have little evidence to guide them. This article presents a framework for assessment of student learning and evaluation of instructor teaching that distinguishes between formative methods, which support an ongoing process of improvement, and summative methods, which represent a measure of competence or mastery. While summative methods are often used to meet institutional or programmatic goals, formative methods bridge assessment and evaluation and can result in a more reflective, constructive, and productive experience for both instructors and students.

12.

‘Finally studying for myself’ – examining student agency in summative and formative self-assessment models

Juuso Henrik Nieminen Laura Tuohilampi 《Assessment & Evaluation in Higher Education》2020,45(7):1031-1045

Abstract

Promoting student agency has been seen as the primary function for new generation assessment environments. In this paper, we introduce two models of self-assessment as a way to foster students’ sense of agency. A socio-cultural framework was utilised to understand the interaction between student agency and self-assessment. Through a comparative design, we investigated whether formative self-assessment and summative self-assessment, based on self-grading, would offer students different affordances for agency. The results show that while both models offered affordances for agentic learning, future-driven agency was only presented by the students studying according to the summative model. Our results shed light on the interplay of student agency and self-assessment in higher education.

13.

Martin Bush 《Assessment & Evaluation in Higher Education》2015,40(2):218-231

The humble multiple-choice test is very widely used within education at all levels, but its susceptibility to guesswork makes it a suboptimal assessment tool. The reliability of a multiple-choice test is partly governed by the number of items it contains; however, longer tests are more time consuming to take, and for some subject areas, it can be very hard to create new test items that are sufficiently distinct from previously used items. A number of more sophisticated multiple-choice test formats have been proposed dating back at least 60 years, many of which offer significantly improved test reliability. This paper offers a new way of comparing these alternative test formats, by modelling each one in terms of the range of possible test taker responses it enables. Looking at the test formats in this way leads to the realisation that the need for guesswork is reduced when test takers are given more freedom to express their beliefs. Indeed, guesswork is eliminated entirely when test takers are able to partially order the answer options within each test item. The paper aims to strengthen the argument for using more sophisticated multiple-choice test formats, especially for high-stakes summative assessment.

14.

Anders Jonsson Lotta Leden 《International Journal of Science Education》2013,35(14):1926-1943

ABSTRACT

Tests convey messages about what to teach and how to assess. Both of these dimensions may either broaden or become more uniform and narrow as a consequence of high-stakes testing. This study aimed to investigate how Swedish science teachers were influenced by national, high-stakes testing in science, specifically focusing on instances where teachers’ pedagogical practices were broadened and/or narrowed. The research design is qualitative thematic analysis of focus group data, from group discussions with Swedish science teachers. The total sample consists of six teachers, who participated in 12 focus group discussions during three consecutive years. Findings suggest that the national tests influence teachers’ pedagogical practice by being used as a substitute for the national curriculum. Since the teachers do not want their students to fail the tests, they implement new content that is introduced by the tests and thereby broaden their existing practice. However, when this new content is not seen as a legitimate part of teachers’ established teaching traditions, the interpretation and implementation of this content may replicate the operationalisations made by the test developers, even though these operationalisations are restricted by demands for standardisation and reliable scoring. Consequently, the tests simultaneously broaden and narrow teachers’ pedagogical practices.

15.

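A study of scoring standardisation and reliability in oral testing. Total citations: 2, self-citations: 0, citations by others: 2

郭茜 邢如 沈明波 《清华大学教育研究》2003,(Z1)

Oral tests have relatively high validity but relatively low reliability. Without reliability, however, validity cannot truly be guaranteed. How to improve the reliability of oral tests is therefore a question of common concern among testing researchers. By describing the standardisation of scoring and the training of raters for the oral component of the Tsinghua University English Proficiency Test, this paper discusses how scoring can be standardised to improve the reliability of oral tests.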

16.

John Edmonstone 《Action Learning: Research and Practice》2015,12(2):131-145

The paper examines the benefits claimed for action learning at individual, organisational and inter-organisational levels. It goes on to identify both generic difficulties in evaluating development programmes and action learning specifically. The distinction between formative and summative evaluation is considered and a summative evaluation framework is outlined, based on recent reviews of evaluations of development programmes, while recognising that establishing clear causal links remains problematic.

17.

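An exploration of formative assessment in translation teaching for English majors

李平 《教育与教学研究》2012,(12):99-101

As an extremely important part of the teaching process, teaching evaluation has a decisive influence on teaching outcomes. Dynamic and timely formative assessment of translation teaching for English majors will greatly improve the current state of translation teaching. An analysis of summative and formative assessment and their theoretical foundations shows that formative assessment should be carried out in a variety of ways throughout translation teaching, so that students genuinely gain from what they learn.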

18.

Standards-based performance assessment for the evaluation of student teachers: a consequential validity study

Carmen Montecinos Sylvia Rittershaussen María Cristina Solís Inés Contreras Claudia Contreras 《Asia-Pacific Journal of Teacher Education》2010,38(4):285-300

The instrument Samples of Teaching Performance (STP) was developed to assess student teachers' capacity to plan, deliver and evaluate a unit of instruction. The current study reports consequential validity data collected from supervisors (n = 20) and student teachers (n = 62) from three elementary and five secondary teacher preparation programs in Chile that participated in the field-testing of the STP. Student teachers described how this assessment had honed their sense of professionalism and promoted learning of the skills assessed. Supervisors reported enlarging the topics discussed with student teachers and making some changes to the supervisory process. These findings are complemented by an analysis of the STP scores obtained by 24 student teachers, which showed better development of instructional skills when compared to pedagogical reasoning and reflection. These results raise questions about the structure of student teaching to support the implementation of standards-based assessments that entail tasks at different levels of cognitive complexity.

19.

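A theoretical inquiry into formative and summative evaluation. Total citations: 1, self-citations: 0, citations by others: 1

卢健 《福建教育学院学报》2011,12(5):30-33

Since the theories of formative and summative evaluation were proposed, their meanings and their relationship have been interpreted in different ways. In fact, formative and summative evaluation are two aspects of a single continuum: evaluation begins as summative, and formative evaluation is in effect summative evaluation plus a process of feedback and improvement.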

20.

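An analysis of evaluation mechanisms for spoken English teaching. Total citations: 2, self-citations: 0, citations by others: 2

孙冬梅 《潍坊教育学院学报》2011,(6):103-104

The paper sets out the necessity and importance of evaluating spoken English teaching, analyses the current state of spoken English teaching, and proposes a comprehensive evaluation system for spoken English that combines formative and summative assessment. The paper holds that this evaluation system plays a key role in spoken English teaching.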