Similar literature
 Found 20 similar documents; search took 687 ms
1.
The hierarchical rater model (HRM) recognizes the hierarchical structure of data that arises when raters score constructed response items. In this approach, raters’ scores are not viewed as being direct indicators of examinee proficiency but rather as indicators of essay quality; the (latent categorical) quality of an examinee's essay in turn serves as an indicator of the examinee's proficiency, thus yielding a hierarchical structure. Here it is shown that a latent class model motivated by signal detection theory (SDT) is a natural candidate for the first level of the HRM, the rater model. The latent class SDT model provides measures of rater precision and various rater effects, above and beyond simply severity or leniency. The HRM‐SDT model is applied to data from a large‐scale assessment and is shown to provide a useful summary of various aspects of the raters’ performance.

2.
This article reports on the development of online assessment tools for disengaged youth in flexible learning environments. Sociocultural theories of learning and assessment and Bourdieu's sociological concepts of capital and exchange were used to design a purpose-built content management system. This design experiment engaged participants in assessment that led to the exchange of self, peer and teacher judgements for credentialing. This collaborative approach required students and teachers to adapt and amend social networking practices for students to submit and judge their own and others' work using comments, ratings, keywords and tags. Students and teachers refined their evaluative expertise across contexts, and negotiated meanings and values of digital works, which gave rise to revised versions and emergent assessment criteria. By combining social networking tools with sociological models of capital, assessment activities related to students' digital productions were understood as valuations and judgements within an emergent, negotiable social field of exchange.

3.
Many language proficiency tests include group oral assessments involving peer interaction. In such an assessment, examinees discuss a common topic with others. Human raters score each examinee's spoken performance on specially designed criteria. However, measurement models for analyzing group assessment data usually assume local person independence and thus fail to consider the impact of peer interaction on the assessment outcomes. This research advances an extended many-facet Rasch model for group assessments (MFRM-GA), accounting for local person dependence. In a series of simulations, we examined the MFRM-GA's parameter recovery and the consequences of ignoring peer interactions under the traditional modeling approach. We also used a real dataset from the English-speaking test of the Language Proficiency Assessment for Teachers (LPAT) routinely administered in Hong Kong to illustrate the efficiency of the new model. The discussion focuses on the model's usefulness for measuring oral language proficiency, practical implications, and future research perspectives.

4.
The argument of this article is that assessment in higher education in the professions can benefit from quality assessment tasks linked to professional practice. Such an assessment task would need to be authentic requiring considerable intellectual skill as well as attending to the realities of professional demands. The idea of authentic assessment is developed by using five of Boud et al.'s propositions in higher educational assessment. This idea is illustrated by the use of action research in a teaching internship, that is, data driven learning in the workplace which also serves as an assessment task in the final year of a professional Bachelor degree. Some difficulties and some illustrative, positive student reactions are presented.

5.
This study develops a framework to conceptualize the use and evolution of machine learning (ML) in science assessment. We systematically reviewed 47 studies that applied ML in science assessment and classified them into five categories: (a) constructed response, (b) essay, (c) simulation, (d) educational game, and (e) inter-discipline. We compared the ML-based and conventional science assessments and extracted 12 critical characteristics to map three variables in a three-dimensional framework: construct, functionality, and automaticity. The 12 characteristics used to construct a profile for ML-based science assessments for each article were further analyzed by a two-step cluster analysis. The clusters identified for each variable were summarized into four levels to illustrate the evolution of each. We further conducted cluster analysis to identify four classes of assessment across the three variables. Based on the analysis, we conclude that ML has transformed, but not yet redefined, conventional science assessment practice in terms of fundamental purpose, the nature of the science assessment, and the relevant assessment challenges. Along with the three-dimensional framework, we propose five anticipated trends for incorporating ML in science assessment practice for future studies: addressing developmental cognition, changing the process of educational decision making, personalized science learning, borrowing 'good' to advance 'good', and integrating knowledge from other disciplines into science assessment.

6.
As in other areas of the school curriculum, the teaching, learning and assessment of higher order thinking in statistics has become an issue for educators following the appearance of recent curriculum documents in many countries. These documents have included probability and statistics across all years of schooling and have stressed the importance of higher order thinking across all areas of the mathematics curriculum. This paper reports on a pilot project which applied the theoretical framework for cognitive development devised by Biggs and Collis to a higher order task in data handling in order to provide a model of student levels of response. The model will assist teachers, curriculum planners and other researchers interested in increasing levels of performance on more complex tasks. An interview protocol based on a set of 16 data cards was developed, trialed with Grade 6 and 9 students, and adapted for group work with two classes of Grade 6 students. The levels and types of cognitive functioning associated with the outcomes achieved by students completing the task in the two contexts will be discussed, as will the implications for classroom teaching and for further research.

7.
To improve assessments of academic achievement, test developers have been urged to use an “assessment triangle” that starts with research‐based models of cognition and learning [NRC (2001) Knowing what students know: The science and design of educational assessment. Washington, DC: National Academy Press]. This approach has been successful in designing high‐quality reading and math assessments, but less progress has been made for assessments in content‐rich sciences such as biology. To rectify this situation, we applied the “assessment triangle” to design and evaluate new items for an instrument (ACORNS, Assessing Contextual Reasoning about Natural Selection) that had been proposed to assess students' use of natural selection to explain evolutionary change. Design and scoring of items was explicitly guided by a cognitive model that reflected four psychological principles: with development of expertise, (1) core concepts facilitate long‐term recall, (2) causally‐central features become weighted more strongly in explaining phenomena, (3) normative ideas co‐exist but increasingly outcompete naive ideas in reasoning, and (4) knowledge becomes more abstract and less specific to the learning situation. We conducted an evaluation study with 320 students to examine whether scores from our new ACORNS items could detect gradations of expertise, provide insight into thinking about evolutionary change, and predict teachers' assessments of student achievement. Findings were consistent with our cognitive model, and ACORNS was revealing about undergraduates' thinking about evolutionary change. 
Results indicated that (1) causally‐central concepts of evolution by natural selection typically co‐existed and competed with the presence of naïve ideas in all students' explanations, with naïve ideas being especially prevalent in low‐performers' explanations; (2) causally‐central concepts were elicited most frequently when students were asked to explain evolution of animals and familiar plants, with influence of superficial features being strongest for low‐performers; and (3) ACORNS scores accurately predicted students' later achievement in a college‐level evolution course. Together, findings illustrate usefulness of cognitive models in designing instruments intended to capture students' developing expertise. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 744–777, 2012

8.
The National Education Monitoring Project (NEMP) is responsible in New Zealand for the national assessment of primary school children's achievement in the essential learning areas, one of which is social studies. Individual interviews are one of the approaches used to assess students' understanding. The assessors are registered teachers, selected by NEMP, who attend a training week where they learn how to conduct the standardised assessment tasks and associated interviews. This study examines the reliability of the assessment interviews used in the 2005 round of social studies monitoring, in particular the variations between teacher administrators (TAs) in their use of prompts and probes. The extent of variation observed in the use of three kinds of prompt was sufficient to raise questions about the reliability of the assessment process. A surprising outcome was the consistent failure of TAs to clarify and elucidate students' social studies understandings through the judicious use of probes. The prompt-related variations between TAs and their failure fully to ‘tap into’ understandings assessed by interview-based tasks are serious threats to the validity of claims regarding students' achievement in social studies. This is particularly concerning as NEMP data are used as the basis for identifying and reporting national patterns and trends in educational performance and making recommendations to policy-makers, curriculum planners and educators.

9.
Until recently, the classroom assessment literature has emphasized the role of teachers and tests, for example investigating teachers’ assessment practices or the quality of classroom tests and other assessments. In contrast, current understandings of teaching and learning emphasize the role of students, as well as the complex interactions between teachers, students, and contexts. We use the literature review method to give substance to a theory of classroom assessment as the co-regulation of learning by teachers, students, instructional materials, and contexts. We organize the literature using a version of Pintrich and Zusho’s theory of the phases and areas of the self-regulation of learning, expanded to include the co-regulation of learning, in order to demonstrate how classroom assessment is related to all aspects of the regulation of learning. We conclude that this is a useful expansion for the field.

10.
In this article, we report the findings of an exploratory empirical study that investigated the relationship between English Language Proficiency (ELP) and performance on the Woodcock‐Johnson Tests of Cognitive Abilities‐Third Edition (WJ III) when administered in English to bilingual students of varying levels of ELP. Sixty‐one second‐grade students, identified as Limited English Proficient, were recruited from a suburban public school district and were given the WJ III in addition to their annual state standardized assessment of ELP. The findings of this study provide evidence to support a linear, inverse relationship between ELP and performance on tests that require higher levels of English language development and mainstream cultural knowledge. The implications of the findings of the present study suggest that practitioners must consider an examinee's level of developmental language proficiency and cultural knowledge acquisition as continuous variables when determining the impact of such factors on test performance and evaluation regarding whether scores obtained from tests administered in English are indeed valid for interpretation.

11.
Formative assessment is considered to be helpful in students' learning support and teaching design. Following Aufschnaiter's and Alonzo's framework, formative assessment practices of teachers can be subdivided into three practices: eliciting evidence, interpreting evidence and responding. Since students' conceptions are judged to be important for meaningful learning across disciplines, teachers are required to assess their students' conceptions. The focus of this article lies on the discussion of learning analytics for supporting the assessment of students' conceptions in class. The existing and potential contributions of learning analytics are discussed related to the named formative assessment framework in order to enhance the teachers' options to consider individual students' conceptions. We refer to findings from biology and computer science education on existing assessment tools and identify limitations and potentials with respect to the assessment of students' conceptions.

Practitioner notes

What is already known about this topic
  • Students' conceptions are considered to be important for learning processes, but interpreting evidence for learning with respect to students' conceptions is challenging for teachers.
  • Assessment tools have been developed in different educational domains for teaching practice.
  • Techniques from artificial intelligence and machine learning have been applied for automated assessment of specific aspects of learning.
What does the paper add
  • Findings on existing assessment tools from two educational domains are summarised and limitations with respect to assessment of students' conceptions are identified.
  • Relevant data that needs to be analysed for insights into students' conceptions is identified from an educational perspective.
  • Potential contributions of learning analytics to support the challenging task to elicit students' conceptions are discussed.
Implications for practice and/or policy
  • Learning analytics can enhance the eliciting of students' conceptions.
  • Based on the analysis of existing works, further exploration and developments of analysis techniques for unstructured text and multimodal data are desirable to support the eliciting of students' conceptions.

12.
Internationally, many assessment systems rely predominantly on human raters to score examinations. Arguably, this facilitates the assessment of multiple sophisticated educational constructs, strengthening assessment validity. It can introduce subjectivity into the scoring process, however, engendering threats to accuracy. The present objectives are to examine some key qualitative data collection methods used internationally to research this potential trade‐off, and to consider some theoretical contexts within which the methods are usable. Self‐report methods such as Kelly's Repertory Grid, think aloud, stimulated recall, and the NASA task load index have yielded important insights into the competencies needed for scoring expertise, as well as the sequences of mental activity that scoring typically involves. Examples of new data and of recent studies are used to illustrate these methods’ strengths and weaknesses. This investigation has significance for assessment designers, developers and administrators. It may inform decisions on the methods’ applicability in American and other rater cognition research contexts.

13.
The purpose of this qualitative exploratory study was to identify factors that influenced prospective and experienced secondary level science teachers' reasoning as they evaluated or selected tasks to formatively assess their students' understanding of scientific concepts. The analysis of the coded written responses revealed two categories of factors that influenced the teachers' reasoning: (1) characteristics of the task and (2) characteristics of students or the curriculum. Characteristics of the task related to qualities of the task regardless of the learning environment in which it would be used, such as the level of student thinking demanded by a task. Characteristics of the students and the curriculum related to the learning environment in which an assessment task would be implemented, such as students' abilities to complete the task. Both prospective and experienced teachers' task evaluations were influenced by the same factors related to the characteristics of the task, although their interpretations of the meaning of each factor varied. In addition, experienced teachers' task evaluations were more likely than prospective teachers to be influenced by factors related to characteristics of students and the curriculum. The findings are discussed as a conceptual framework that presents the identified factors along three different dimensions: (1) the influence of task, student, and curriculum characteristics, (2) the influence of expectations for success, and (3) the influence of teaching experience. © 2008 Wiley Periodicals, Inc. J Res Sci Teach 45: 1113–1130, 2008

14.
15.
Although needs assessment methodologies have been readily available since the 1940s, considerable confusion persists about the nature and purpose for which they should be used. Many practitioners equate needs assessment with job or task analysis and thereby fail to derive much benefit from the effort they expend. Others, in trying to distinguish between internal “quasineeds” assessment and external “self-sufficiency” criteria, inadvertently pose some questions for the nature of competency. Is it a process referent, a product referent, or both? This paper examines the differences between needs assessment and task analysis and demonstrates the utility of conceptualizing competency as the synthesis of the dichotomy of process and products through a Needs Based Education and Training (NEBEAT) model of curriculum design in adult education.

16.
The purpose of this article is to question the suitability of the phonics screening check in relation to models and theories of reading development. The article questions the appropriateness of the check by drawing on theoretical frameworks which underpin typical reading development. I examine the Simple View of Reading developed by Gough and Tunmer and Ehri's model of reading development. The article argues that the assessment of children's development in reading should be underpinned and informed by a developmental framework which identifies the sequential skills in reading development.

17.
The focus of this article is recent work by the Assessment Reform Group (ARG) on the role of teachers' judgements in the summative use of assessment. A brief overview of the early work of the ARG is followed by discussion of the desirable properties of assessment for summative uses. The work of the ARG's Assessment Systems for the Future project provided evidence and arguments concerning the validity, reliability, impact and cost of tests and of summative assessment by teachers. Whilst there is ample evidence that the teachers' judgements are more valid than, and equally reliable as, tests, there is a danger of unwanted impact on teaching as long as results are used for ‘high stakes’ evaluation of teachers and schools. Implications for policy include an end to the practice of using the results of pupils' summative assessment, however they are derived, as the sole basis for target setting and school accountability.

18.
Many studies have shown that assessment in the classroom is a very complex procedure. As the first part of this paper explains, it is developed in four consecutive phases: evidence collection, interpretation of this evidence, the teacher's responses and, finally, the impact of the teacher's responses on children. Together these phases form the assessment episode framework. The second part of the paper illustrates the use of this framework in practice by exploring how Greek primary school teachers were carrying out assessments in the classroom, and considers the impact of these practices on children's learning. The paper draws upon evidence gathered by questionnaire and classroom observations of Greek primary school teachers. In order to clarify the complex issues of classroom assessment this article explores them in the light of the four developmental phases of the assessment episode. The quality of teacher comments, the assessment language used, the role of grading, and the relationship between policy and practice are also examined. The findings emphasise the importance of teachers being aware of the potential effects of assessment and the consequent need to use it properly for the benefit of teaching and learning.

19.
The paper reports the findings of evaluative research that attempted to rigorously assess the efficacy of a feed-forward, formative assessment intervention. The aim was to improve participants’ conceptions of quality, and hence improve the quality of a complex piece of summative assessment, by asking them to mark exemplars produced by former students. Feed-forward assessment has theoretical support in the literature, but empirical confirmation has been slight. Research findings were encouraging. A statistical model incorporating feed-forward was developed which accounted for a large effect in the improvement of results for the summative item. Importantly, there was improvement across student ability levels. Students, in the main, made accurate judgements about different levels of exemplar quality, although they had some difficulty discerning different pathways to high-quality products. Qualitative analysis indicated improved student conceptions of coherence and integration in the summative piece.

20.
This article presents findings of an attempt to test Creemers' model of educational effectiveness by using data derived from an evaluation study in Mathematics in which 30 schools, 56 classes and 1,051 pupils of the last year of primary school of Cyprus participated. More specifically, we examine whether the pupil, classroom and school variables show the expected effects on pupils' achievement in Mathematics. Research data concerned with pupils' achievement in Mathematics were collected by using two different forms of assessment (external assessment and teacher's assessment). Questionnaires were administered to pupils and teachers in order to collect data about most of the variables included in Creemers' model. The findings support the main assumptions of the model. The influences on pupil achievement are multilevel and the net effect of classrooms was higher than that of schools. Implications for the development of research on school effectiveness are drawn.
