Similar literature
20 similar documents found.
1.
2.
Machine learning has been frequently employed to automatically score constructed response assessments. However, there is a lack of evidence of how this predictive scoring approach might be compromised by construct-irrelevant variance (CIV), which is a threat to test validity. In this study, we evaluated machine scores and human scores with regard to potential CIV. We developed two assessment tasks targeting science teacher pedagogical content knowledge (PCK); each task contains three video-based constructed response questions. A total of 187 in-service science teachers watched the videos, each of which presented a given classroom teaching scenario, and then responded to the constructed-response items. Three human experts rated the responses, and the human-consensus scores were used to develop machine learning algorithms to predict ratings of the responses. Including the machine as another independent rater, along with the three human raters, we employed the many-facet Rasch measurement model to examine CIV due to three sources: variability of scenarios, rater severity, and rater sensitivity to the scenarios. Results indicate that variability of scenarios impacts teachers’ performance, but the impact significantly depends on the construct of interest; for each assessment task, the machine is always the most severe rater, compared to the three human raters. However, the machine is less sensitive than the human raters to the task scenarios. This means the machine scoring is more consistent and stable across scenarios within each of the two tasks.

3.
Rubrics are widely used in higher education to assess performance in project-based learning environments. To date, the sources of error that may affect their reliability have not been studied in depth. Using generalisability theory as its starting-point, this article analyses the influence of the assessors and the criteria of the rubrics on the assessment of two service-learning projects. A sample of 365 novice students studying for three different undergraduate degrees was evaluated by eight student assessors and two teachers at three stages of assessment. Depending on the type of project and the stage of assessment, between 19.27 and 39.55% of the total variance was attributed to the quality of the projects, 0–7.49% to the main effect of the raters, and 3.44–17.3% to the main effect of the criteria. The results demonstrated that acceptable levels of reliability (≥.70) were obtained with three raters and eight criteria or four raters and nine criteria in contexts of relative or absolute decisions, respectively.

4.
We present a conceptual framework that leverages synergies between classroom assessment (CA) practices and self-regulated learning (SRL) theory to support academic growth and instruction. We articulate the processes shared by CA and SRL, drawing on a model of SRL with three phases: forethought, performance, and self-reflection. We blend this SRL model with CA to create the CA:SRL framework in four stages: (1) pre-assessment, (2) the cycle of learning, doing, and assessing, (3) formal assessment, and (4) summarizing assessment evidence. We elucidate how SRL processes are involved at each stage and can be drawn on to support learning development and teacher understanding and co-regulation of learning. This framework is important in that it depicts how assessment and learning processes interact dynamically for both teachers and students in classrooms, and demonstrates that such interactions encompass the full breadth of purposes in CA, from planning through summation of evidence.

5.
This article reviews the international literature on video viewing in teacher education and professional development. Two hundred and fifty-five articles were collected, summarized and categorized using a conceptualization that includes four aspects: teachers' activity as they view a classroom video, the objectives of video viewing, the types of videos viewed, and the effects of video viewing on teacher education and professional development. The findings in each of these aspects suggested three main questions that may profitably guide future research: How can teaching teachers to identify and interpret relevant classroom events on video clips improve their capacity to perform the same activities in the classroom? How can we best articulate the diverse objectives of video viewing and the diverse types of videos in teacher education and professional development programs? How can we create a “continuum” between teacher education programs and professional development programs in such a way that video viewing becomes a routine, familiar professional practice able to produce the desired effects over the course of an entire teaching career?

6.
In the United Kingdom, the majority of national assessments involve human raters. The processes by which raters determine the scores to award are central to the assessment process and affect the extent to which valid inferences can be made from assessment outcomes. Thus, understanding rater cognition has become a growing area of research in the United Kingdom. This study investigated rater cognition in the context of the assessment of school‐based project work for high‐stakes purposes. Thirteen teachers across three subjects were asked to “think aloud” whilst scoring example projects. Teachers also completed an internal standardization exercise. Nine professional raters across the same three subjects standardized a set of project scores whilst thinking aloud. The behaviors and features attended to were coded. The data provided insights into aspects of rater cognition such as reading strategies, emotional and social influences, evaluations of features of student work (which aligned with scoring criteria), and how overall judgments are reached. The findings can be related to existing theories of judgment. Based on the evidence collected, the cognition of teacher raters did not appear to be substantially different from that of professional raters.

7.
The use of the Experiences of Teaching & Learning Questionnaire (ETLQ) for the evaluation of learning quality in higher education has been expanding during the last decade; thus, a review of the instrument’s validity evidence is warranted. The design of the study was a systematic critical literature review. We evaluated the strength of the validity evidence of 17 included studies with a quality appraisal framework reflecting current standards for educational testing. The evidence supporting the central validity assumptions of the ETLQ scales is currently weak to moderate and incomplete. Thus, caution against the uncritical use of ETLQ scores for high-stakes educational decisions is warranted. The appraisal framework used was useful for creating an overview of the evidence. However, attention to more general aspects of study quality and consensus deliberations with three to four raters were also important for sufficiently reliable appraisal of the evidence.

8.
We cite four disconnections among teacher education programmes, research on teaching, and programme assessment that contribute to a paucity of systematically collected evidence and the inability of teacher educators to fully address the “outcomes question” [Cochran-Smith, M. (2003). Assessing assessment in teacher education. Journal of Teacher Education, 54, 187–191] now central to the conduct and future of teacher education programmes. To reduce those disconnections, we present the Development, Research, and Improvement model of programme assessment [Metzler, M. W., & Tjeerdsma, B. L. (1998). PETE program assessment within a development, research, and improvement framework. Journal of Teaching in Physical Education, 17, 468–492] that has guided a comprehensive, longitudinal, and research-based assessment project at Georgia State University in the United States for 13 years. We situate this work in the framework of Self-Study of Teacher Education, now gaining attention worldwide as a legitimate approach to bridging the methodological and evidentiary gap between teacher education programmes, research on teaching, and programme assessment. Examples of data collected in the longitudinal programme are described, along with illustrations of how those data have guided decisions about our teacher education programme, and how those findings can add to the empirical knowledge in teacher education.

9.
This study was undertaken to determine the concerns of primary school teachers about the inclusion of students with disabilities in Ahmedabad, India. A total of 560 teachers, working in government‐run schools, returned the completed survey. A two‐part questionnaire was used in this study. Part 1 gathered information relating to personal and professional characteristics of the teachers. Part 2 was a 21‐item Likert scale titled Concerns about Inclusive Education – Gujarati. The major finding of the study was that the teachers in Ahmedabad were moderately concerned about including students with disabilities in their classrooms. The teachers were most concerned about lack of infrastructural resources and least concerned about lack of social acceptance of students with disabilities in inclusive education classrooms. Significant differences existed in teacher concerns based on the following background variables: gender, qualifications in special education, teaching experience and number of students with disabilities in class. A number of implications are discussed to address teacher concerns for inclusive education in India.

10.
The purpose of this study was to investigate teachers' and school psychologists' knowledge of Attention‐Deficit/Hyperactivity Disorder (ADHD). One hundred thirty‐two kindergarten through 12th‐grade general education teachers, special education teachers, and school psychologists responded to a 24‐item questionnaire concerning treatment and possible causes of ADHD. The results supported the hypothesis that school psychologists' knowledge level of ADHD would be significantly greater than the knowledge level of special and general education teachers, but did not support the hypothesis that the knowledge level of special education teachers would be significantly greater than the knowledge level of general education teachers. Increased years of professional experience was negatively associated with increased knowledge about ADHD. Implications and suggestions for future research are discussed. © 2009 Wiley Periodicals, Inc.

11.
Professional standards in teaching are developed in many education systems, with professional learning and quality assurance being the central purposes of these standards. This paper presents an initiative in developing a professional development progress map (hereafter, progress map) within a learning‐oriented field experience assessment (LOFEA) framework. The article examines the use of a progress map to support professional learning in teaching supervision in the field experience of a teacher education programme. Views of users, including 16 tertiary supervisors and 21 teacher participants of the in‐service programmes, were collected. Issues relating to supporting student teachers' professional learning with standards‐referenced assessment are discussed around four themes, namely intention, instrumentation, interpretation and implementation.

12.
The problem‐solving model (PSM) is used in the Minneapolis Public Schools to guide decisions regarding: (1) interventions in general education, (2) referral to special education, and (3) evaluation for special education eligibility for high‐incidence disability areas. District implementation was driven by four themes: the appropriateness of intelligence tests and the IQ‐achievement discrepancy for determination of eligibility, bias in assessment, allocation of school psychologist time, and linking assessment to instruction through curriculum‐based measurement. This article describes how the PSM was designed as a three‐stage process to measure response to intervention and used in the special education eligibility process. Program evaluation data collected since initial implementation in 1994 are reported in the areas of child count, achievement, referral, eligibility, and disproportion. The authors discuss the limitations of conducting PSM research in school settings and barriers to implementation of PSM, and make suggestions for enhancing treatment integrity.

13.
Successful implementation of inclusive practices depends mainly on teachers' attitudes towards children with special needs and their inclusion, and teachers' willingness to work with children with special needs in their classrooms. Experiences teacher candidates have during the pre‐service stage might influence their perceptions towards children with disabilities and their inclusion. The purpose of this study was to examine the impact of two special education courses on (1) preschool teacher candidates' general attitudes towards inclusion, (2) their willingness to work with children with significant intellectual, physical and behavioural disabilities within inclusive classroom settings and (3) their level of comfort in interacting with children with disabilities. A four‐part survey was administered to participants four times throughout the study, once before and after each course. The survey package included (1) a demographic information form, (2) the Opinions Relative to the Inclusion of Students with Disabilities Scale, (3) an adapted version of the Teachers' Willingness to Work with Children with Severe Disabilities Scale and (4) the Interaction with Children with a Disability Scale. The results showed that both special education courses positively influenced teacher candidates' attitudes, willingness and comfort levels. However, the impact of the second course, which focused on helping teacher candidates learn and apply instructional strategies to work with children with disabilities in inclusive classrooms, was much larger. Implications of the study findings in relation to future research and practice are discussed.

14.
This research was a co‐creation and co‐assessment exercise between the researchers, participating printmaking and weaving academics and their students in a Nigerian university. The poor technical resources and increasingly large student groups in the design department, which severely hamper the delivery of an effective education, were addressed. The academics were supported to learn how to create their own instructional videos for their students, demonstrating identified designer‐maker skills and how to use required equipment. These academics are now empowered and have the knowledge to produce their own instructional videos without professional assistance, irrespective of their previous experience of using video equipment and developing video content.

15.
The purpose of this study was to build a Random Forest supervised machine learning model in order to predict musical rater‐type classifications based upon a Rasch analysis of raters’ differential severity/leniency related to item use. Raw scores (N = 1,704) from 142 raters across nine high school solo and ensemble festivals (grades 9–12) were collected using a 29‐item Likert‐type rating scale embedded within five domains (tone/intonation, n = 6; balance, n = 5; interpretation, n = 6; rhythm, n = 6; and technical accuracy, n = 6). Data were analyzed using a Many Facets Rasch Partial Credit Model. An a priori k‐means cluster analysis of 29 differential rater functioning indices produced a discrete feature vector that classified raters into one of three distinct rater‐types: (a) syntactical rater‐type, (b) expressive rater‐type, or (c) mental representation rater‐type. The initial Random Forest model yielded an out‐of‐bag error rate of 5.05%, indicating that approximately 95% of the raters were correctly classified. After tuning a set of three hyperparameters (ntree, mtry, and node size), the optimized model demonstrated an improved out‐of‐bag error rate of 2.02%. Implications for improvements in assessment, research, and rater training in the field of music education are discussed.

16.
A robust body of evidence supports the finding that particular teaching and assessment strategies in the K‐12 classroom can improve student achievement. While experts have identified many effective teaching and learning practices in the assessment for learning literature, teachers’ knowledge and use of “high leverage” formative assessment (FA) practices are difficult to model in novice populations. By employing advances in construct modeling, the theoretical underpinnings of learning progressions research, and four principles of evidence‐centered design, teacher educators along with psychometricians can test hypotheses about teacher learning progressions. Utilizing an FA moves‐based framework, the article examines how beginning teachers’ posing, pausing, and probing practices align with five key strategies of FA. Examples of construct maps, instructional tasks, and turns of talk analysis using scoring guides are provided from an empirical study of novice science preservice teachers in a high‐needs school district.

17.
Co‐teaching is a popular strategy for implementing the inclusion of students with disabilities within secondary general education classrooms. However, we have little data regarding its effectiveness under routine conditions of educational practice. This study examined whether there was an “additive effect” of the special education teacher on the instructional experiences of students with disabilities as compared with the experiences of the same students taught by only the general education teacher under routine conditions. Observers used time sampling methods to document how students with disabilities spent their time in 11 middle school co‐taught classes. Statistically significant differences were found for targeted students in terms of general education teacher interaction and individual instruction. General education teachers spent significantly less time with students with disabilities when the special education teacher was present. In addition, students with disabilities received significantly more individual instruction when the special education teacher was present. However, these differences were of limited practical significance.

18.
Research indicates that instructional aspects of teacher performance are the most difficult to reach consensus on, significantly limiting teacher observation as a way to systematically improve instructional practice. Understanding the rationales that raters provide as they evaluate teacher performance with an observation protocol offers one way to better understand the training efforts required to improve rater accuracy. The purpose of this study was to examine the accuracy of raters evaluating special education teachers’ implementation of evidence-based math instruction. A mixed-methods approach was used to investigate: 1) the consistency of the raters’ application of the scoring criteria to evaluate teachers’ lessons, 2) the accuracy of raters’ scores on two lessons compared with those given by expert raters, and 3) the raters’ understanding and application of the scoring criteria through a think-aloud process. The results show that raters had difficulty understanding some of the high-inference items in the rubric and applying them accurately and consistently across the lessons. Implications for rater training are discussed.

19.
Today's teacher education programmes across the world strive to equip future teachers with the high‐quality knowledge, skills and dispositions necessary to teach students. The assessment of teacher dispositions has thus become essential to cultivate those qualities. However, the current approach to disposition assessment in the United States focuses on personal characteristics and character‐related dispositions and is frequently used as a sorting device to identify those who appear to be inadequately disposed to teaching. Expanding on earlier work by the author and colleagues, this paper examines the issue of whether more efforts should be made to incorporate elements to assess competence‐related dispositions in conjunction with the character‐related dispositions across assessment tools and, if so, how this could be accomplished. In addition, this paper will clarify some dispositional concepts and terms used interchangeably that actually differ from one another and can confuse the consistency of disposition assessment. Finally, a framework for assessing technology disposition as an example of competence‐related disposition and for broadening the focus of disposition assessment is suggested.

20.
The research reported in this study examines the very first time the participants planned for and enacted science instruction within a “best-case scenario” teacher preparation program. Evidence from this study indicates that, within this context, preservice teachers are capable of implementing several of the discursive practices of science called for in standards documents, including engaging students in science investigations and constructing evidence-based explanations. The participants designed experiences that allowed their students to interact with natural phenomena, gather evidence, and craft explanations of natural phenomena. The study contends that the participants were able to achieve such successes due to their participation in a teacher education program and field placement, which were designed using a comprehensive, conceptual framework. Video of the participants’ teaching and annotated self-analysis videos served as the primary data for this study. Implications for future research and elementary science teacher education are discussed.
