This meta-analysis (148 studies, k = 197, N = 31,718) examined the relationship between motivation and transfer in professional training. For this purpose, motivation was conceptualized in the following nine dimensions: motivation to learn, motivation to transfer, pre- and post-training self-efficacy, mastery orientation, performance orientation, avoidance orientation, expectancy, and instrumentality. Population correlation estimates ranged between −0.11 and 0.52. Three moderator effects were estimated. First, correlations were higher when the training focused on declarative and self-regulatory, rather than on procedural, knowledge. Second, learner-centered environments tended to show greater numbers of positive correlations than did knowledge-centered environments. Third, when compared with external, supervisory, or peer assessment, self-assessment of transfer produced upwardly biased population estimates irrespective of the transfer criterion. These findings are discussed in terms of their implications for theories of training effectiveness and their significance for the practice of training evaluation.


This study draws on the authors’ first-hand experience of designing, developing and delivering (3Ds) a massive open online course (MOOC) entitled ‘Understanding Research Methods’ since 2014, largely but not exclusively for learners in the humanities and social sciences. The greatest challenge facing us was to design an assessment mechanism that was (i) rigorous yet practicable at scale, vis-à-vis over 60,000 students from highly diverse backgrounds; (ii) compatible with the pedagogical orientation of the MOOC provider; and (iii) meaningful to the nature of the course subject. Based on a network analysis of forum interactions and a qualitative analysis of a random sample of 116 research questions proposed by students, we explore how participants’ understanding of research methods developed through a series of carefully sequenced ‘e-tivities’ and ‘open peer assessments’ over the duration of the course. The aim of this study was to consider a model of ‘flipped’ assessment, drawn from elements of ‘paragogy’ and the IR Model that acknowledges and exploits peer learning opportunities that are not routinely captured by completion statistics.

The consensual assessment technique (CAT) is a measurement tool for creativity research in which appropriate experts evaluate creative products [Amabile, T. M. (1996). Creativity in context: Update to the social psychology of creativity. Boulder, CO: Westview]. However, the CAT is hampered by the time-consuming nature of the products (asking participants to write stories or draw pictures) and the ratings (getting appropriate experts). This study examined the reliability of ratings of sentence captions. Specifically, four raters evaluated 12 captions written by 81 undergraduates. The purpose of the study was to see whether the CAT could provide reliable ratings of captions across raters and across multiple captions and, if so, how many captions and how many judges would be required to generate reliable scores. Using generalizability theory, we found that captions appear to be a useful way of measuring creativity with a reasonable level of reliability in the frame of CAT.

The “remembered success effect” (Finn, 2010) refers to the finding that challenging academic tasks that start or end with extra opportunities for success are often preferred to challenging tasks that do not include these opportunities. Research on the remembered success effect has identified some memory processes that are thought to give rise to the effect. To date there has been no research on how experiences of remembered success relate to motivational constructs that may be associated with the effect. Accordingly, we examined how challenging math experiences designed to induce remembered success impacted individuals’ expectancies for success, positive task value and perceived costs, and how these motivational constructs related to two future task choices; expectancy-value theory posits that expectancies and task values are the most direct motivational predictors of choice. In two studies, participants completed two challenging math tasks under two conditions: a short task of all difficult problems and a longer, “extended” task that had the same number of difficult problems plus a set of moderately difficult problems. Results demonstrated that expectancies and subjective task value were higher, and perceived costs lower, in the “extended” condition than in the short condition. In both experiments, the between-task difference scores (i.e., extended task minus short task) for positive task values, expectancies, and perceived costs were significantly correlated with both task choices. Notably, the positive task value difference score uniquely predicted at least one of the two choices in both experiments. Costs and expectancies were less consistent unique predictors of choice: the between-task difference in perceived costs predicted one choice in Experiment 1, but neither choice in Experiment 2, and the difference in expectancies only predicted the choices in Experiment 2.

This paper examines the listening comprehension skills of visually impaired students (VIS) using computerised adaptive testing (CAT) and reader-assisted paper-pencil testing (raPPT), together with student views about both tests. An explanatory mixed-methods design was used. The sample comprised 51 VIS in 7th and 8th grades; nine of these students were interviewed to determine their views about the tests. Results indicated that scores obtained from CAT were significantly lower than scores obtained from raPPT. Additionally, a high positive correlation was found between CAT and raPPT scores, which suggests that CAT and raPPT produced similar ability estimates. Another finding is that CAT made more reliable predictions and was completed in a shorter time using fewer items. In the qualitative part, student views were gathered through interviews, and content analysis revealed three themes: technical features, test features, and psychological effects. In general, students reported positive views about CAT. VIS prefer CAT because of its listening and control options, shorter test duration, clarity of reading, fairness, and the elimination of dependency on a reader. The study provides implications for test developers and test users, who may consider CAT as a multi-accommodation for VIS on the strength of these advantages.


Despite the frequently reported association of characteristics of assessment policies with academic performance, the mechanisms through which these policies affect performance are largely unknown. Therefore, the current research investigated performance, motivation and self-regulation for two groups of students following the same statistics course, but under two assessment policies: education and child studies (ECS) students studied under an assessment policy with relatively higher stakes, a higher performance standard and a lower resit standard, compared with psychology students. Results show similar initial performance, but more use of resits and higher final performance (post-resit) under the ECS policy compared with the psychology policy. In terms of motivation and self-regulation, under the ECS policy significantly higher minimum grade goals, performance self-efficacy, task value, time and study environment management, and test anxiety were observed, but there were no significant differences in aimed grade goals, academic self-efficacy and effort regulation. The relations of motivational and self-regulatory factors with academic performance were similar under both assessment policies. Thus, educators should be keenly aware of how characteristics of assessment policies are related to students’ motivation, self-regulation and academic performance.

Various methods of achievement attribution measurement are compared with regard to the construction of the achievement event and the measurement of the attributions elicited. The method of instigation and the content of the instruments depend greatly on whether situational or dispositional (individual differences) factors are emphasized. It is suggested that natural events, particularly those with pronounced effects, generate actual affective reactions and direct consequences and are particularly useful for studies of situational factors in attributions. On the other hand, hypothetical multiple-event measures are generally employed for studies of individual differences in attributions. The present review shows that questions on specific causes are more popular than those on attribution dimensions. Researchers should be cautious, however, because the dimensional meaning of these causes may vary across different cultures, age groups, or achievement settings. Different question formats and scoring methods also are compared. It is concluded that different methods have their own strengths and weaknesses and that researchers should select the one that best serves their purpose.


A new college admission policy will be implemented in Taiwan in 2022. The purpose of this study was to understand the relationship between admission criteria and college success. Data were obtained from the Taiwan Higher Education Database; a sample of 8443 students from 156 universities was used in this study. Using structural equation modeling, this study tested a research model that included factors such as motivation, standardized test scores, high school achievements, and college success. The findings revealed that the General Scholastic Ability Test scores (in Chinese, English, Social Studies) and high school average academic grades are significantly associated with college success. A student’s motivation to complete a certain major can significantly predict the quality of student effort and influence college success. These findings highlight the importance of some admission criteria and provide practical implications for educational policy-makers, school administrators, students, and parents.

The rise of computer‐based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple‐choice items. In particular, very short response time, termed rapid guessing, has been shown to indicate disengaged test taking, regardless of whether it occurs in high‐stakes or low‐stakes testing contexts. This article examines rapid‐guessing behavior: its theoretical conceptualization and underlying assumptions, methods for identifying it, misconceptions regarding its dynamics, and the contextual requirements for its proper interpretation. It is argued that because it does not reflect what a test taker knows and can do, a rapid guess to an item represents a choice by the test taker to momentarily opt out of being measured. As a result, rapid guessing tends to negatively distort scores and thereby diminish validity. Therefore, because rapid guesses do not contribute to measurement, it makes little sense to include them in scoring.

The study investigates how higher education staff understand assessment, and the relationship between these understandings and their assessment practices. Nine individuals attended a workshop that guided them through the creation of a concept map about assessment, which was subsequently discussed in one-to-one semi-structured interviews. We found considerable variation in understanding of assessment, both between and within participants, and this appeared to be a consequence of the varied contexts within which assessment operates. Some assessment practices were highly complex, and at times closely entwined with teaching. In addition, individuals’ practices helped to illuminate variation in how underlying concepts (e.g. assessment for learning) were understood. The approach supported the construction of the participants’ understanding of assessment, and enabled the exploration of the interplay between thinking and reported practice, which were closely aligned. It also drew attention to the need to further develop methodologies which capture both the complexity of thinking about assessment and real-world assessment practices.


This article draws on three assessment paradigms – psychometrics, outcomes-based and curriculum-based assessment – to discuss paradigmatic changes in senior school assessment and achievement standard-setting in Queensland, Australia, over the last 50 years. These include radical reforms in 1970 from university-controlled examinations to school-based assessments applying normative standard-setting, to subsequent reforms in 1978 introducing competence(curriculum)-based assessment and standards. From 2019, a new reform introduces a combination of school-based and external assessment with procedures for establishing standards still in progress.

Changes to Queensland assessment and standard-setting are discussed in terms of three preconditions for paradigm change – dissatisfaction, an alternative acceptable paradigm, and majority acceptance of change. Influence of paradigmatic origins of reformers is discussed. The amalgam of curriculum-based assessment and psychometric paradigms in the new Queensland system is considered in terms of theoretical compatibility and potential impact on the new standards.

With the articulation of new ‘Holistic and Balanced Assessment’ initiatives in Singaporean schools, a new standard of conceptualising and enacting classroom assessment is expected of Singaporean teachers. This paper draws on findings from a larger study of ‘high-achieving’ Singaporean teachers’ deliberations and transactions of assessment activities. The use of case studies as a central methodology to investigate a contemporary phenomenon of education assessment extends the studies of conceptions and implementation of new classroom assessment practices in Anglophone and Western European countries. The findings from one of the ‘high-achieving’ case-study Singaporean teachers reveal that any quality assurance framework or guideline for evaluating teachers’ assessment practices needs to be sensitive to their intentions, meaning and context of teaching.

This paper reviews some of the literature on the use of groupwork as a form of assessment in tertiary institutions. It outlines the considerable advantages of groupwork but also its systemic associated problems. In discussing the problems, the paper considers issues such as “free riding” and the “sucker effect”, issues associated with ethnic mix in groups, and the social dilemma problem, in which students face conflicting demands between altruism and self-interest. The paper then outlines several models of effective groupwork and makes suggestions for implementing groupwork tasks. The paper also looks at the key assessment tasks which are commonly employed (namely, additive, conjunctive, disjunctive and discretionary tasks) and assesses which are most suited to groupwork. The paper considers the related issues of task complexity, recognition for effort, and strategies for minimising issues concerning group size. The paper also briefly considers strategies for implementing incentives for groupwork members, and outlines the issue of penalties for unproductive group members. The paper concludes by providing recommendations for how to maximise the advantages of groupwork while trying to minimise the disadvantages.

This article describes the development of a set of research-informed resources for assessing the spoken language skills (oracy) of students aged 11-12. The Cambridge Oracy Assessment Toolkit includes assessment tasks and procedures for use by teachers, together with a unique Skills Framework for identifying the range of skills involved in using talk in any specific social situation. As we explain, no comparable, 'teacher-friendly' instrument of this kind exists. Underpinning its development is the argument that teaching children how to use their first or main language effectively across a range of social contexts should be given higher priority in educational policy and school practice, and that the development of robust, practicable ways of assessing oracy will help to achieve that goal. We explain how the Toolkit has been developed and validated with children and teachers in English secondary schools, and discuss its strengths and limitations.

Metacognitive strategy knowledge, motivation, and learning strategies play an important role in self-regulated learning (SRL). However, little is known about different profiles of self-regulated learners in schools that prepare students for the university entrance certificate. The aim of this study was to examine intraindividual differences in the patterns of students' SRL. In this 2-wave longitudinal study, 897 students were involved. Latent class analyses revealed four-cluster solutions at the beginning as well as at the end of the school year. Maximal self-regulated learners with the highest levels on all cognitive, metacognitive, and motivational components of SRL reported the highest grades in the academic subject of German (first language) at both measurement points, followed by motivated and strategic learners. Students with a low level on several SRL components reported the lowest grades. Further, the results indicated changes in profiles of SRL over time.

This study focuses on understanding how participation in a data team develops data skills and data use in individual teacher educators. Five teacher educators collaborated in a data team that used data to solve the problem of student teachers dropping out during their course of study. This study aimed at understanding how teacher educators learn from their participation in the data team. We collected data through interviews, surveys, and a knowledge test, and gained insight into the development of data skills, attitude towards data use, and the teacher educators' data use in daily practice. The results show that the data team members’ data skills and attitudes towards data use changed in different ways during the data team intervention depending on their initial situation, and that overall, their data use for school development increased.

In spite of the rising tide of metrics in UK higher education, there has been scant attention paid to assessment loads, when evidence demonstrates that heavy demands lead to surface learning. Our study seeks to redress the situation by defining assessment loads and comparing them across research- and teaching-intensive universities. We clarify the concept of ‘assessment load’ in response to findings about high volumes of summative assessment on modular degrees. We define assessment load across whole undergraduate degrees, according to four measures: the volume of summative assessment; volume of formative assessment; proportion of examinations to coursework; number of different varieties of assessment. All four factors contribute to the weight of an assessment load, and influence students’ approaches to learning. Our research compares programme assessment data from 73 programmes in 14 UK universities, across two institutional categories. Research-intensives have higher summative assessment loads and a greater proportion of examinations; teaching-intensives have higher varieties of assessment. Formative assessment does not differ significantly between the two university groups. These findings pose particular challenges for students in different parts of the sector. Our study questions the wisdom that ‘more’ is always better, proposing that lighter assessment loads may make room for ‘slow’ and deep learning.

Assessment has become an important topic of debate and even reform in many Western countries. It is equally important in other regions of the world, although less subject to reform. Yet discussions of assessment across cultural boundaries are not frequent and in a globalizing world this can be problematic. The purposes of this article, therefore, are to review concepts such as ‘formative’ and ‘summative’ assessment and how they have developed over time. A focus of this review will be to identify the implications of different kinds of assessment for student learning, especially in relation to the cultural contexts in which they take place. The article will argue that different forms of assessment can be directed towards different learning purposes, especially where cultural pressures dictate certain kinds of assessment practices. Valorizing one form of assessment over another may well be counterproductive in particular cultural contexts.

This article explores the effect on assessment of ‘translating’ paper and pencil test items into their computer equivalents. Computer versions of a set of mathematics questions derived from the paper-based end of key stage 2 and 3 assessments in England were administered to age-appropriate pupil samples, and the outcomes compared. Although in most cases the change to the different medium seems to make little difference, for some items the affordances of the computer profoundly affect how the question is attempted, and therefore what is being assessed when the item is used in a test. These differences are considered in terms of validity and legitimacy, that is, whether the means used to answer a question in a particular medium are appropriate to the assessment intention. The conclusion is not only that translating paper and pencil items into the computer format sometimes undermines their validity as assessments, but also that some paper and pencil items are less valid as assessments than their computer equivalents would be.

