Similar literature
1.
The effects of rating scale format (behaviorally anchored vs. Likert) and rater training on leniency and halo in student ratings of instruction were investigated. The subjects (N=269) were students enrolled in required courses at a graduate theological seminary in the Southwest United States. A repeated measures design controlling for teacher and course was used. Findings indicated: (a) training was effective in reducing leniency and halo in ratings from both instruments; (b) trained raters exhibited less leniency on two rating dimensions when using behaviorally anchored rating scales (BARS) than when using the Likert scale; and (c) trained raters exhibited less halo when using the Likert scale than when using the BARS. The findings demonstrate the importance of focusing efforts to improve quality of ratings on the students rather than on the format of the instrument. Presented at the Twenty-Eighth Annual Forum of the Association for Institutional Research, Phoenix, Ariz., May 1988.

2.
The convergent and discriminant validity of three measures of the concepts of aspiration level, ability, achievement, adjustment, and dominance was examined in the context of a multitrait-multimethod matrix. Self-reports and peer-reports on 75 Ss were employed as two measures of each trait. In addition, aspiration level was measured by the Edwards Personal Preference Schedule (EPPS) Nach scale, dominance by the EPPS (dom scale), achievement by cumulative college grade point ratio (GPR), ability by the Ohio State Psychological Examination (OSPE), and adjustment by the Bell Adjustment Inventory. Of the paper and pencil instruments, only the OSPE and EPPS (dominance scale) exhibited satisfactory convergent validity. No measure met all the requirements of discriminant validity. The desirability of establishing adequate validational evidence prior to using “trait” measures in studies relating theoretical variables was emphasized.

3.
This study was undertaken to examine the social perceptual skill deficit theory in explaining the low peer acceptance of children with learning disabilities. The quality of tests measuring social perception was also examined. Thirty 9- to 12-year-old children with learning disabilities and a matched control group were given two measures of social perception: a laboratory task and a behavior rating scale. The behavior rating scale was completed by the children's teachers. In addition, the Peer Acceptance Scale (Bruininks, Rynders, & Gross, 1974) was administered to assess peer status. Results showed that the children with learning disabilities differed significantly from their nondisabled peers on each of the three measures: the children with learning disabilities obtained lower social perception and peer acceptance scores. However, the relationships between sociometric status and social perception varied as a function of task. A small but significant correlation was found between the behavior rating scale and peer status. The laboratory task was not correlated with either the behavior rating scale or peer status. Results are discussed in terms of the psychometric properties of laboratory versus naturalistic measures of social perception and the importance of establishing the external validity of social skill measures by correlating them with outcome measures such as peer status.

4.
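An Empirical Study of Small-Class Cooperative Learning in College English Reading   Total citations: 2, self-citations: 0, citations by others: 2
The study used the “team achievement division” method, a cooperative learning strategy, over a 10-week period with 48 first-year non-English-major undergraduates. The research instruments were achievement tests, an attitude scale, a cooperative learning behavior assessment form, and interviews. The learning process comprised achievement tests before and after the study, an explanation of the cooperative learning strategy, grouping, a familiarization phase, and formal learning. After the learning period, the two sets of test scores were compared and tested for significance to assess how effectively the cooperative learning strategy improves college students' English reading ability. The results show that the cooperative learning strategy effectively improves the English reading ability of non-English-major college students and that 80% of participants viewed cooperative learning positively. Beyond clearly improving these students' English reading ability, the strategy also significantly improved their sense of cooperation and team spirit.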

5.
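A Study of Scoring Criteria for Subjective Test Items   Total citations: 1, self-citations: 0, citations by others: 1
Taking the scoring criteria for the essay question in the 2006 Shanghai college entrance examination in politics as an example, this paper examines how to judge the quality of scoring criteria for subjective items from three perspectives: whether each scoring item is relatively independent; whether a candidate's overall argumentative ability can be inferred from the results on several scoring items; and whether the grade levels within each scoring item are divided reasonably. Factor analysis shows that the four scoring items of this subjective question are unidimensional, with a single factor interpretable as the candidate's overall argumentative ability. Correlation analysis shows that the four scoring items are each relatively independent and contribute independently to inferring that ability. Analysis with the Rasch rating scale model shows that the grade levels of each scoring item are divided reasonably on the whole, although some levels provide insufficient information. On this basis, several suggestions for improving the scoring criteria are offered.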

6.
Cognitive pretesting (CP) is an interview methodology for pretesting the validity of items during the development of self-report instruments. The present research evaluates a systematic approach to the analysis of CP data. Materials and procedures were developed to rate self-report item performance with CP interview text data. Five raters were trained in the application of that system. Estimates of inter-rater reliability found acceptable to substantial levels of inter-rater agreement. Results from the present study suggest that excellent inter-rater reliability can be achieved in the evaluation of CP data. Guidelines for systematically rating the qualitative data collected using CP methods are provided. Future research should focus on empirical demonstrations of how such rating procedures can lead to improvements in self-report instruments.

7.
This paper presents a globally oriented scoring sheet, reference guide, and rating scale for facilitating clinical hypotheses from children's Kinetic School Drawings (KSDs) and further empirical evaluations of the KSD technique. The paper also provides information regarding the construction of these instruments, along with some preliminary findings in terms of the procedures' reliability and discriminant validity.

8.
Using nine years of student evaluation of teaching (SET) data from a large US research university, we examine whether changes to the SET instrument have a substantial impact on overall instructor scores. Our study exploits four distinct natural experiments that arose when the SET instrument was changed. To maximise power, we compare the same course/instructor before and after each of the four changes occurred. We find that switching from in-class, paper course evaluations to online evaluations generates an average change of −0.14 points on a five-point scale, or 0.25 standard deviations (SDs) in the overall instructor ratings. Changing labelling of the scale and the wording of the overall instructor question generates another decrease in the average rating: −0.15 of a point (0.27 SDs). In contrast, extending the evaluation period to include the final examination and offering an incentive (early grade release) for completing the evaluations do not have a statistically significant effect on the overall instructor rating. The cumulative impact of these individual changes is −0.29 points (0.52 SDs). This large decrease shows that SET scores are not comparable over time when instruments change. Therefore, administrators should measure and account for such changes when using historical benchmarks for evaluative purposes (e.g. appointments and compensation).

9.
The purpose of the study was to investigate relationships between student ratings of college teaching using four types of student rating instruments and pre- vs. post-student achievement gains in 36 sections of an undergraduate analytic geometry and calculus course. Student rating instruments used varied according to type of items (high vs. low inference) and focus (students rating their own perceived growth vs. rating the instructor). Data were collected on 799 students (66% freshmen; 16% sophomores; and 15% juniors) at the University of Florida, and relationships were analyzed using the Pearson product-moment correlation technique. Significant relationships were not found between student ratings and student achievement.

10.
The controversy over what is an appropriate early childhood curriculum has created a need for research instruments designed to measure classroom practices. This article reports on the development of a new observational measure based on the Guidelines for Developmentally Appropriate Practices of the National Association for the Education of Young Children (NAEYC). The Classroom Practices Inventory (CPI) is a 26-item rating scale tapping the curricular emphasis and emotional climate of programs for 4- and 5-year-old children. The scale demonstrated a high degree of internal consistency. Over half the measure's variance was accounted for by a factor tapping encouragement of curiosity, creativity, and provision of concrete materials. In a study of 10 preschool programs, CPI scores correlated significantly with teachers' and parents' educational attitudes. Modest relationships were found between the CPI scores of children's preschools and measures of academic skills, creativity, and anxiety. The CPI appears to be a promising measure for critically examining the concept of developmentally appropriate practices in early childhood education.

11.
Justice-related situations are a part of students' everyday life. In order to test the antecedents, correlates, and consequences of (in)justice in school, valid measures of justice are needed. To our knowledge, this is the first study to develop an observer low inference rating instrument that can be applied to measure justice in the primary classroom. In two pre-studies, justice-relevant situations in the classroom were extracted and observable indicators for these situations were developed. In the main study, this instrument was used to observe 208 primary school students with regard to their experiences of justice or injustice. In addition to this, other measures of justice were developed to examine the convergence between observer low inference ratings of classroom justice and high inference rating instruments for teachers, students, and external observers. Factor analyses and correlations between the different indices of the observer low inference rating and the high inference rating items suggested that incidents of justice and injustice in the classroom do not tend to co-occur frequently. Teachers do not appear to have a general tendency to treat a child more or less justly across a large number of situations. The findings suggest that a comprehensive assessment of classroom justice requires a multi-method approach where the justice ratings of students, teachers and external observers are all taken into consideration.

12.
13.
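Organic chemistry experiments are an indispensable part of organic chemistry teaching. With enrollment in the course growing in recent years, the scale of the experiments has expanded accordingly: each session consumes large quantities of reagents, uses a large amount of laboratory apparatus, and discharges large amounts of pollutants. To conserve resources and reduce environmental pollution, the paper explores economical and green development of the organic chemistry laboratory from several angles, including reform of the experiments, recovery of experimental products, cooperation with nearby enterprises, and laboratory management.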

14.
Individual empirical studies of motivation show little divergent validity of various factors and call for better measures, especially multidimensional instruments. The same conclusions were reached from a meta-analysis of 40 motivation studies by Uguroglu and Walberg (1979). After a study of approximately 50 instruments that measured motivation constructs of social, emotional, and physical self-concept; locus of control; and achievement motivation, among others, a 23-anchored-item questionnaire using a five-point scale was developed that included these factors. The instrument was administered in May of the first and second year to 115 students in grades three through eight. The purpose of the research was to operationalize and field test a motivation instrument using multidimensional measures, and also to consider whether any change would occur in the correlation of a multidimensional instrument to various achievement measures over unidimensional ones. Results of the study show the correlations of motivation to academic achievement, test-retest reliability, and the predictive validity of a multidimensional instrument.

15.
Behavior rating scales are popular assessment tools but more research is needed on the preschool versions of the instruments, particularly with referred samples of preschoolers. This study examined the comparability of results from parent ratings on the preschool versions of the Child Behavior Checklist (CBCL/1.5‐5, Achenbach & Rescorla, 2000) and the Clinical Assessment of Behavior‐Parent form (CAB‐P, Bracken & Keith, 2004) with 74 clinically referred preschoolers. While pairs of similarly named scales from the two instruments received significant correlations, mean scores from the CBCL/1.5‐5 were significantly higher than those from the CAB‐P. Classification consistency was a concern as well. School psychologists are urged to be cautious in their interpretation of results from preschool behavior rating scales.

16.
17.
An important purpose of student evaluation of teaching is to inform an educator’s reflection about the strengths and weaknesses of their teaching approaches. Quantitative instruments are one way of obtaining student responses. They have traditionally taken the form of surveys in which students provide their responses to various statements using item-by-item agree/disagree ratings. Previous research has identified shortcomings of such rating scales, including response bias and the associated lack of discrimination amongst the items evaluated. In this paper, best–worst scaling is proposed as a novel method for quantitative teaching evaluation. The way in which best–worst scaling can be used in this context is illustrated in three different applications. Two applications demonstrate how it can be used for evaluations in a small-size classroom environment. The third application is a broader evaluation of university courses on a larger scale. In comparison with conventional rating scales, the best–worst scaling approach enables better highlighting of the differences between evaluation items. In doing so, it can provide enhanced guidance to educators in their reflection about their teaching. Moreover, implementation and analysis of a best–worst scaling evaluation is relatively straightforward, which establishes it as a feasible method for teaching practitioners and researchers.

18.
Longitudinal studies offer unique opportunities to identify the specificity variance in the components of a psychometric scale that is administered repeatedly. This article discusses a procedure for evaluation of the relationship between true scale scores and criterion variables uncorrelated with measurement errors in longitudinally presented measures comprising unidimensional multicomponent instruments. The approach provides point and interval estimates of the true scale criterion validity with respect to a criterion that is assessed once or repeatedly, as well as a means for testing temporal stability in this validity. The outlined method is based on an application of the latent variable modeling methodology, is readily applicable with popular software, and is illustrated using empirical data.

19.
Three separate studies focusing on convergent and discriminant validity evidence for the Home and Community Social Behavior Scales are presented. The HCSBS is a 65‐item social behavior‐rating scale for use by parents and caretakers of children and youth ages 5–18. It is a parent‐rating version of the School Social Behavior Scales. Within these studies, relationships with five behavior‐rating scales were examined: the Social Skills Rating System, Conners Parent Rating Scale–Revised‐Short Form, Child Behavior Checklist, and the child and adolescent versions of the Behavior Assessment System for Children. HCSBS Scale A, Social Competence, evidenced strong positive correlations with measures of social skills and adaptability, strong negative correlations with measures of externalizing behavior problems, and modest negative correlations with measures of internalizing and atypical behavior problems. HCSBS Scale B, Antisocial Behavior, evidenced strong positive correlations with measures of externalizing behavior problems, modest positive correlations with measures of internalizing and atypical behavior problems, and strong negative correlations with measures of social skills and adaptability. These results support the HCSBS as a measure of social competence and antisocial behavior of children and youth. © 2001 John Wiley & Sons, Inc.

20.
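A Discussion on Improving the Utilization Rate and Benefit of Large-Scale Instruments and Equipment in Universities   Total citations: 15, self-citations: 5, citations by others: 15
This paper examines in depth the current state of the use and management of large-scale instruments and equipment in universities, and offers several suggestions and improvement measures for raising their utilization rate and benefit.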
