20 similar articles found (search time: 0 ms)
1.
Kristin E. Porter, Sean F. Reardon, Fatih Unlu, Howard S. Bloom, Joseph R. Cimpian 《Journal of research on educational effectiveness》2017,10(1):138-167
A valuable extension of the single-rating regression discontinuity design (RDD) is a multiple-rating RDD (MRRDD). To date, four main methods have been used to estimate average treatment effects at the multiple treatment frontiers of an MRRDD: the “surface” method, the “frontier” method, the “binding-score” method, and the “fuzzy instrumental variables” method. This article uses a series of simulations to evaluate the relative performance of each of these four methods under a variety of different data-generating models. Focusing on a two-rating RDD (2RRDD), we compare the methods in terms of their bias, precision, and mean squared error when implemented as they most likely would be in practice, that is, using optimal bandwidth selection. We also apply the lessons learned from the simulations to a real-world example that uses data from a study of an English learner reclassification policy. Overall, this article makes valuable contributions to the literature on MRRDDs in that it makes concrete recommendations for choosing among MRRDD estimation methods, for implementing any chosen method using local linear regression, and for providing accurate statistical inferences.
2.
Gerald M. Gillmore 《Research in higher education》1977,7(2):187-189
Romney (1977) presented data from which he concluded that within student ratings of college instruction, the course that an instructor teaches is as important a determiner of resulting ratings as the instructor himself. Reanalysis of his data indicates that the course effect is actually quite small, a result that is consistent with earlier studies.
3.
Student ratings and various instructional variables from a within-instructor perspective  Cited by: 1 (self-citations: 0; citations by others: 1)
The relationship between class size, instructional method, course level, reason for enrollment, and student ratings of instruction was assessed from a within-instructor perspective. Two hundred fifty-four pairs of courses taught by the same instructor were correspondingly identified and subjected to a stepwise multiple regression procedure. Only class size was found to be a significant predictor of ratings once individual differences between instructors were controlled, hence underlining the importance of (1) taking cognizance of the size of the course when using student ratings of instructors as a measure of teaching effectiveness, and (2) controlling for systematic variation due to instructor idiosyncrasies in instructional research.
4.
《Journal of research on educational effectiveness》2013,6(1):83-104
In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those with rating scores below the cutoff score are assigned to an alternate treatment condition. Many education policies, however, assign treatment status on the basis of more than one rating-score dimension. We refer to this class of RD designs as “multiple rating score regression discontinuity” (MRSRD) designs. In this paper, we discuss five different approaches to estimating treatment effects using MRSRD designs (response surface RD; frontier RD; fuzzy frontier RD; distance-based RD; and binding-score RD). We discuss differences among them in terms of their estimands, applications, statistical power, and potential extensions for studying heterogeneity of treatment effects.
5.
Thomas Perry 《School Effectiveness & School Improvement》2017,28(1):22-38
Value-added (VA) measures are currently the predominant approach used to compare the effectiveness of schools. Recent educational effectiveness research, however, has developed alternative approaches including the regression discontinuity (RD) design, which also allows estimation of absolute school effects. Initial research suggests RD is a viable approach to measuring school effectiveness. The present study builds on this pioneering work by using RD and VA designs to estimate school effects at system and school level, comparing estimates from several measurement designs. The study uses a large English dataset (N = 148,135) spanning 342 schools, 10 local authorities, 6 consecutive school year groups (UK Years 3–9) across 3 years. RD is found to be a suitable approach for system-level absolute school effect estimates. Cross-sectional and longitudinal measures are found to lead to markedly different estimates when comparing individual schools. The results also reinforce the need to treat measures based on a single cohort with extreme caution.
6.
Dr. Arie Rotem 《Research in higher education》1978,9(4):303-318
This article reports the results of an experimental study of the effects of students' evaluative feedback to university instructors. Data obtained indicated that feedback from students did not have any significant effects on the instructors' teaching performance and their perception of teaching. Some limitations of the study are discussed with suggestions for further exploration of the problem.
7.
Robert M. Pasen, Peter W. Frey, Robert J. Menges, Gustave J. Rath 《Research in higher education》1978,9(2):161-167
A manipulation of the instructions students received prior to completing the 7-item Endeavor Instructional Rating card differentially affected their ratings on two types of items. Specifically, when students were led to believe their ratings would have a strong impact on the instructor's career, they tended to be more lenient on items measuring rapport (i.e., the affective domain); this same effect was not observed for items measuring pedagogical skill (i.e., the cognitive domain). The different items on our instructional rating instrument appear to be measuring different things. One implication of this observation is that the inconsistent findings reported in past research on student ratings of instruction may be due to the differential mix of items from one instrument to another. When instructors are compared on ratings given them by students, unbiased interpretation requires that the multidimensional nature of teaching (and of the rating instrument) be considered.
8.
Factor analysis of an instructor rating form administered to three successive student and teacher populations revealed a reasonably consistent factor structure across analyses. In one of the three administrations, students were asked to sign the evaluation form; in this case, substantial changes in proportions of common variance appeared for the first two factors when comparing anonymous versus nonanonymous conditions. Results are discussed in terms of methods for use of student ratings to improve instruction.
9.
Sarah C. Wood 《Roeper Review》2013,35(3):194-204
This exploratory study considered the perceptions of parents and teachers regarding behaviors exhibited by gifted students who may have attention deficit hyperactivity disorder (ADHD) by examining their responses to the Conners 3 behavior rating scale. Statistical analysis revealed average scores in the ratings of parents and teachers in the areas of inattention, hyperactivity/impulsivity, executive functioning, and learning problems. Parent and teacher ratings of these students were not significantly correlated nor were there significant differences between parents and teachers on ratings of students. The need for further examination of the psychometric properties and appropriate use of the Conners 3 in diagnosis of twice-exceptional students, the need for normative data on gifted populations for the Conners 3, and a greater understanding of the differential display of ADHD in the gifted population were suggested.
10.
The significance of circumstances for college students' ratings of their teachers and courses  Cited by: 3 (self-citations: 5; citations by others: 3)
Dr. Kenneth A. Feldman 《Research in higher education》1979,10(2):149-172
Although firm generalizations and conclusions cannot yet be drawn from the extant research on the effects on teacher and course ratings of the circumstances surrounding these evaluations, at least some studies have shown that college students' ratings of their teachers and courses are somewhat higher when students remain anonymous rather than identifying themselves, when the purported use of the ratings is an official or administrative one for use in salary, promotion, or tenure considerations rather than otherwise, and when the instructor is present rather than absent during the rating session. (The differences between each of these contrasted circumstances are usually rather small and do not inevitably appear across studies.) Certain variations in rating format have been found to make a difference in the ratings obtained, whereas others have not. From limited evidence, the exact timing (or occasion) of evaluation appears not to be important to ratings. Variability in sampling procedures, as it affects the composition of students available to complete rating forms, may or may not turn out to be a generally important element in ratings (as directly relevant data are collected). The analysis concludes with a discussion of (1) the presumed bias in ratings produced in certain of the rating conditions and (2) the more general issue of the comparability of ratings made in different circumstances of evaluation.
11.
马晓峰 《泰州职业技术学院学报》2013,(5):19-21,38
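At present, public hospital management has gradually transitioned to a new and distinctive model of modern fine-grained management. This model emphasizes a "people-oriented" management philosophy and focuses on mobilizing the enthusiasm and creativity of all parties, as well as on talent development and team building. Practice shows that staff-wide performance appraisal can act as an extremely important internal driver of talent development in public hospitals and can produce a virtuous PDCA cycle, and that it has solidly supported the successful advancement of talent development in public hospitals.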
12.
Alan Socha 《Assessment & Evaluation in Higher Education》2013,38(1):94-113
A teacher evaluation system can be threatening to faculty, especially if used for summative decisions. Therefore, it is important to obtain valid and pertinent information. Since students are extensively exposed to course elements, students’ evaluation of instruction should be one of several components in the teacher evaluation system. Since traditional methods, such as Cronbach’s alpha and ordinary least squares regression, do not address the hierarchical data of the classroom, the current study used the statistical techniques of confirmatory factor analysis and hierarchical linear modelling in order to properly investigate the reliability and validity of the Students’ Assessment of Instruction (SAI) instrument. Use of hierarchical linear modelling to analyse teacher evaluation instruments could not be found in the literature, although it has been used in educational settings. This study will illustrate its usefulness in determining what measures are related, either as evidence of validity or as a bias, to instructional effectiveness. Student responses were also compared with faculty self-evaluations, one indicator of effective teaching, in order to determine if the SAI does measure instructional effectiveness. Overall, the SAI was found to have good reliability and validity with relatively few biases and could be used to extract five distinguishable traits of instructional effectiveness.
13.
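How to estimate causal relationships between variables from non-randomized data is a pressing methodological problem in social science research. In the 1970s, Rubin and others pointed out that causal questions are essentially counterfactual questions, argued that certain statistical methods can ensure that confounding variables are independent of group assignment, and extended this approach to the analysis of observational data. Propensity scores, instrumental variables, and regression discontinuity are three commonly used methods, with propensity scores occupying the central position. Using real data as an example, the paper builds a logistic model for computing propensity scores and reports the overall model test, the significance tests of the predictors, the multicollinearity test, the construction of matched groups, and the reporting of the analysis results.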
14.
Stephen A. Stumpf, Richard D. Freedman, Joseph C. Aguanno 《Research in higher education》1979,11(2):111-123
The relationships among several variables outside of the instructor's classroom control and student ratings of teaching effectiveness are investigated in a causal network. Student ratings are relatively independent of external variables. Students may be able to take into account more factors than generally assumed when they rate their instructors.
15.
As a part of efforts to evaluate and monitor the increasing public investment in early childhood education, teachers are being asked to assess children's school readiness. In this study, preschool teachers and kindergarten teachers rated children's skills in three areas (kindergarten readiness, academic skills, and communication skills), and these ratings were compared with direct assessments of the children's skills. Ratings by both groups of teachers tended to be more highly related to basic skills, such as counting and number naming, than to abilities such as solving applied problems and using expressive and receptive vocabulary. Preschool teachers' ratings had a lower association with children's observed skills and abilities than kindergarten teachers' ratings. Ratings of children attending Head Start were systematically inflated, but this relationship was mediated to a significant extent by the teachers' levels of education. More educated teachers rated children in a manner consistent with the children's directly assessed skills. Implications of these findings for informing future efforts to assess school readiness by using teacher ratings are discussed.
16.
徐鹰 《浙江教育学院学报》2014,(2):39-46,93
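This study uses a mixed-methods approach to analyze how CET-4 essay raters use the rating criteria. Twenty-six CET-4 essay raters scored 30 mock CET-4 essays and gave three reasons for each score, ranked by importance. The results show that: (1) although the raters differed in severity, agreement among the 26 raters was fairly good, and most raters were also internally consistent; (2) the scoring reasons given by some raters tended to become one-dimensional; (3) 71.91% of the reasons given reflected the five textual features specified in the CET-4 essay rating criteria, indicating that most raters understood and applied the criteria fairly accurately.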
17.
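Teacher evaluation research: origins, problems, and development trends  Cited by: 58 (self-citations: 0; citations by others: 58)
As the first step in building a teacher management system, teacher evaluation research has developed over a long period. At present, the main problems in this field are that different functional types of evaluation are often used interchangeably; evaluation of teaching effectiveness still dominates; the structure of evaluation content is unclear; the theoretical basis for evaluation content is insufficient; and excessive emphasis is placed on student learning outcomes. The development of teacher evaluation research can be summarized in three stages, and it shows one basic trend: greater emphasis on teachers' behavior in the process of education and teaching, and more attention to aspects such as teachers' reflection on their teaching and their initiative at work.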
18.
This study proposes and tests a multilevel structural model of school context, composition, and school leadership on school instructional practices and outcomes in elementary schools in a western state in the United States. We focus on direct and indirect relationships implied in our proposed model using an “added year of schooling” in reading and math as our primary school-level outcomes. Added-year effects, which result from a regression discontinuity design, represent a relatively new approach for describing how school factors influence outcomes. Our results suggest that, net of context and composition factors, improvement-focused school leadership directly affected subsequent school instructional practices and, in turn, instructional practices affected added-year outcomes. We discuss the findings in terms of their theoretical and practical implications for conducting further educational effectiveness research.
19.
Kenneth A. Feldman 《Research in higher education》1978,9(3):199-242
From showing in a general way that there is room for course context to influence class (average) ratings of instruction, this review proceeds to a search for specific course characteristics that are associated with these ratings. Extant research has centered around five such characteristics: class size, course level, the electivity of the course, the particular subject matter of the course, and the time of day that the course is held. Although statistically significant zero-order relationships do not appear in every piece of research located for review, such relationships are more likely to be found than not for the first four of these characteristics. The associations may not be particularly strong, but rather clear-cut patterns do emerge. Of the studies reporting an association between size of class and class ratings, most find it to be inverse, although several studies show a curvilinear (U-shaped) relationship. Teacher (and course) ratings tend to be somewhat higher for upper division courses and elective courses. Compared to other instructors, those teaching humanities, fine arts, and languages tend to receive somewhat higher ratings. The possible reasons for these relationships are many and complex. A precise understanding of the contribution of course characteristics to the ratings of teachers (and the courses themselves) is hampered by two circumstances. Studies in which relevant variables are controlled are far fewer in number than are the studies in which only the zero-order relationships between course characteristics and ratings are considered. More importantly, existing multivariate studies tend to underplay or ignore the exact place of course characteristics in a causal network of variables.
20.
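Publicness is the fundamental attribute of government employees, but from an economic perspective government employees are also self-interested, which gives them the tendency and motivation to maximize their own interests. Only by analyzing employees' self-interest, distinguishing reasonable self-interest from expansive self-interest, and constraining expansive self-interest through ethics and institutions can the government's administrative capacity truly be strengthened.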