The purpose of this study is to describe a Many-Faceted Rasch (FACETS) model for the measurement of writing ability. The FACETS model is a multivariate extension of Rasch measurement models that can be used to provide a framework for calibrating both raters and writing tasks within the context of writing assessment. The use of the FACETS model for solving measurement problems encountered in the large-scale assessment of writing ability is presented here. A random sample of 1,000 students from a statewide assessment of writing ability is used to illustrate the FACETS model. The data suggest that there are significant differences in rater severity, even after extensive training. Small, but statistically significant, differences in writing-task difficulty were also found. The FACETS model offers a promising approach for addressing measurement problems encountered in the large-scale assessment of writing ability through written compositions.
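
The abstract does not state the model equation. As a hedged illustration only, a many-facet Rasch model for rated writing is commonly written in the rating-scale form below; the symbols are assumptions introduced here, not notation taken from the study.

```latex
\[
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k
\]
% P_{nijk}: probability that student n receives rating k (rather than k-1)
%           from rater j on writing task i
% \theta_n:  writing ability of student n
% \delta_i:  difficulty of writing task i
% \lambda_j: severity of rater j
% \tau_k:    difficulty of rating category k relative to category k-1
```

In this form, differences in rater severity and in task difficulty correspond to differences among the \(\lambda_j\) and \(\delta_i\) terms, all expressed on the same logit scale as student ability.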

This study investigated parent-reported receptivity towards the classroom environment and classroom outcomes. Classroom environment was based on a five-aspect model: (1) provision of information from the child; (2) beliefs about the school; (3) provision of information from teachers; (4) teachers' commitment to working with parents; and (5) confidence in communicating with teachers. Classroom outcomes were based on two aspects: (1) educational values (importance of schooling, involvement with learning, seeing a future through learning, desire to learn, and importance of learning); and (2) learning outcomes (achieving, and views of child's engagement in school work). For each aspect, items were written in an ordered-by-difficulty pattern so that, for example, Item 2 involved Item 1 and ‘more’, making it conceptually ‘harder’ to agree with Item 2 than with Item 1. There were four Likert response categories (SDA, DA, A, and SA). Using the extended logistic model of Rasch, an interval-level, unidimensional scale was created with item difficulties for classroom environment aspects and classroom outcomes calibrated on the same scale as the receptivity measures. The sample consisted of 518 parents of students from three secondary schools in Western Australia. There were 30 items. The proportion of observed variance considered true was 0.94. The items for each aspect were found to be ordered from ‘easy’ to ‘hard’ in line with the hypothesised model of receptivity, and the data fitted the measurement model well. This revised version was published online in July 2006 with corrections to the Cover Date.

This paper reports the use of an online student evaluation system, Course Experience on the Web (CEW), in a physiotherapy program to improve its Course Experience Questionnaire (CEQ) results. CEW comprises a course survey instrument modelled on the CEQ and a tailored unit survey instrument. Closure of the feedback loop is integral to the CEW system. Analysis of the data shows that the students’ evaluation in their final year of the program is closely correlated with their CEQ results. Increases in the CEQ scores from 2001–04 included an increase in the Good Teaching Scale (27.5), Generic Skills Scale (10.3) and Overall Satisfaction Index (29.3). By using CEW, academics at the School of Physiotherapy were able to determine students’ perceptions during the course and make timely changes to teaching and learning where appropriate; as a result, the CEQ scores improved markedly.

Two studies considered whether the Course Experience Questionnaire's (CEQ) question format was the most appropriate for the CEQ's purpose. In the first, comparisons were made against two alternative but minimalist variations on the standard format. None of three tests showed the standard format to be superior. In the second, students reported the thinking used in deciding their responses on a sample of specific CEQ questions. Those reports uniformly showed responses to be decided by the recall of particular, concrete and personal experiences prompted by a question, and not by the overviewing implicitly assumed by the standard format. The implications drawn are that the systematic trialing of alternative question forms could well result in improved performance of the CEQ as an instrument, and that those alternative forms should probably be constrained to those directly prompting the recall of personal experience, but in a more guided fashion than seems presently to occur.

This paper reports on the use of the Course Experience Questionnaire (CEQ) as an instrument to monitor the medical programme at the University of Sydney and in particular to measure improvements in teaching quality with the introduction of the new graduate-entry problem-based programme. In addition, it raises the more general issue of interpretation of CEQ results in courses designed around problem-based learning (PBL). Students' perceptions of teaching quality were sought using a whole class questionnaire survey and small group interviews. Students in the new programme rated their course more highly than did students in the old programme with respect to good teaching, appropriate assessment, generic skills and overall satisfaction. These improvements did not hold with respect to the clarity of goals and standards, nor for perceptions of an appropriate workload. The results are interpreted in context and it is argued that particular items in the CEQ do not reflect the educational philosophy or the instructional processes of PBL programmes.

The psychometric properties and multigroup measurement invariance of scores across subgroups, items, and persons on the Reading for Meaning items from the Georgia Criterion Referenced Competency Test (CRCT) were assessed in a sample of 778 seventh-grade students. Specifically, we sought to determine the extent to which score-based inferences on a high-stakes state assessment hold across several subgroups within the population of students. To that end, both confirmatory factor analysis (CFA) and Rasch (1980) models were used to assess measurement invariance. Results revealed a unidimensional construct with factorial-level measurement invariance across disability status (students with and without specific learning disabilities), but not across test accommodations (resource guide, read-aloud, and standard administrations). Item-level analysis using the Rasch model also revealed minimal differential item functioning across disability status, but not accommodation status.

The Course Experience Questionnaire (CEQ) is a 36-item instrument that is intended to measure six different aspects of students’ perceptions of the academic quality of their programmes. It has been widely used in Western countries, and it has also been used in non-Western countries, including China, Hong Kong, Japan and Pakistan. Nevertheless, in the latter countries, it has sometimes not been possible to identify the full range of constructs that were supposed to be measured by the original CEQ. We translated the CEQ into Bengali and administered this to 552 science students at 15 higher secondary schools in West Bengal, India. A confirmatory factor analysis found that their responses provided a poor fit to the original six-factor model of the CEQ. An exploratory factor analysis identified just four constructs, which reflected good teaching, generic skills, student support and appropriate workload. The items with salient loadings on the four factors were used to construct four scales. The students’ scores on three of the four scales showed satisfactory levels of internal consistency. A factor analysis of their scores on all four scales yielded one overarching factor that could be interpreted as a measure of perceived academic quality. A reduced version of the CEQ consisting of the 30 items that constitute these four scales can be recommended as a measure of students’ perceptions of the academic quality of programmes in West Bengal.


Multilevel Rasch models are increasingly used to estimate the relationships between test scores and student and school factors. Response data were generated to follow one-, two-, and three-parameter logistic (1PL, 2PL, 3PL) models, but the Rasch model was used to estimate the latent regression parameters. When the response functions followed 2PL or 3PL models, the proportion of variance explained in test scores by the simulated student or school predictors was estimated accurately with a Rasch model. Proportion of variance within and between schools was also estimated accurately. The regression coefficients were misestimated unless they were rescaled out of logit units. However, item-level parameters, such as DIF effects, were biased when the Rasch model was violated, similar to single-level models.
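
For readers unfamiliar with the three generating models named above, the following is a minimal sketch of their standard item response functions. It is not the simulation code from the study; the function name and the parameter values are illustrative assumptions.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    theta: person ability in logits
    b:     item difficulty in logits
    a:     item discrimination (held at 1 for the 1PL/Rasch form)
    c:     lower asymptote, or "guessing" parameter (0 for 1PL and 2PL)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Same person and item under the three models (values are illustrative only).
p_1pl = irt_probability(theta=0.5, b=0.0)                # 1PL: a = 1, c = 0
p_2pl = irt_probability(theta=0.5, b=0.0, a=1.7)         # 2PL: free discrimination
p_3pl = irt_probability(theta=0.5, b=0.0, a=1.7, c=0.2)  # 3PL: adds a guessing floor
```

Fitting a Rasch model to 2PL or 3PL data amounts to forcing `a = 1` and `c = 0` for every item, which is the misspecification the abstract examines.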

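Whether graduates of higher vocational colleges find employment smoothly depends on two decisive factors: first, the professional competence gained through their studies; second, social competences such as verbal expression, adaptability, practical ability, and communication and collaboration skills, acquired through teacher guidance and the students' own self-development. Professional competence is addressed mainly through the sound arrangement of courses within the curriculum. Social competence can be improved in two ways: by embedding social-competence training in the teaching of specialized courses, and by encouraging and promoting student participation in clubs and societies.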

Assessing and improving the quality of undergraduate teaching is an important issue in China. Using the Course Experience Questionnaire, this study examined the quality of undergraduate teaching by investigating the relationships between students’ course experience, the learning outcomes demonstrated by the students and the learning environment. Two thousand and forty-three second-year students participated in a questionnaire survey. The results indicated that different aspects of the students’ course experience variously affected learning outcomes, such as overall satisfaction with the course, academic efficacy and the development of generic skills. These results reflected the characteristics of undergraduate teaching in China, and highlighted the need to enhance student autonomy and self-study. In addition, the appropriateness of learning resources and the level of academic freedom in the institutions were found to more powerfully influence students’ course experience than the provision of supportive facilities and services. These findings have implications for improving undergraduate teaching and its quality assurance in China.

In recent years, measuring the efficiency and effectiveness of higher education has become a major issue. National governments are now demanding greater public accountability for funds invested in the sector, resulting in the emergence of various performance indicators relating to both teaching and research. The Course Experience Questionnaire (CEQ) was developed to measure the perceived quality of teaching in degree programmes. It evolved from research that identified curriculum, teaching and assessment as key determinants of students' approaches to learning and, in turn, the quality of their learning outcomes. The CEQ data are intended for use in making comparisons within fields of study over time and/or across institutions. However, no European study has reported on its suitability to evaluate teaching within an accounting programme. This paper outlines the development of the CEQ and confirms its reliability and construct validity for use in the accounting discipline in an Irish context.

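Having a teacher whose main responsibility is one specialized course also teach a related course helps to improve teaching effectiveness. This paper discusses the experience of a pharmacology teacher who also teaches physiology.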

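Application of the Rasch Model to the Quality Analysis of the Graduate Entrance Examination
The Rasch model was used to analyse the 2010 national master's entrance examination comprehensive test in the foundations of psychology. The results show that, overall, the test is of high quality: its items cover examinees at all ability levels, it discriminates well among examinees of differing ability, and it achieves its intended selection purpose. The Rasch analysis also revealed, however, that a few items did not meet their intended measurement targets and could be revised in future work. Rasch-based item analysis can provide additional measurement information for analysing examinee ability and item quality.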

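A Comparative Study of the Rasch Model and IRT in the Statistical Analysis of Student Achievement Tests
The emergence of the Rasch model and item response theory (IRT) has advanced a transformation of research methods in the social sciences. Most scholars regard the Rasch model as a special case of the three-parameter IRT model. In fact, the Rasch model differs from item response theory in that the data must conform to the model's a priori theory. This study uses Winsteps and Multilog, software packages developed on the basis of these two sets of theoretical assumptions, to analyse a student achievement test statistically. It aims to reveal the similarities and differences between the results obtained under the two models and to explore the application of Winsteps in educational statistics.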

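In subjective testing, differences among raters are an important factor affecting reliability, validity and fairness. This paper uses the many-facet Rasch model to examine how eight raters scored 60 essays in each of two genres, narrative and argumentative. The results show that raters' scoring was inconsistent across genres. At the rater level, rater severity was largely unaffected by genre, but rater reliability and internal consistency were better for argumentative essays than for narrative essays. At the rating-scale level, raters were stricter with argumentative essays than with narrative essays on the language and content criteria but more lenient on the organization criterion, and high scores were used more frequently for argumentative essays than for narrative essays. The paper also discusses the causes of this inconsistency, with a view to informing efforts to reduce rating bias.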

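To overcome the test dependence and sample dependence of classical test theory, this study applies the Rasch model to the quality analysis of a science literacy assessment for sixth-grade primary school students. It illustrates the use of the Rasch model in quality analysis through overall quality checks, unidimensionality checks, the Wright map, individual item quality analysis, and bubble charts. It also shows that the items designed for the assessment have high reliability and validity and reasonable discrimination, and that the vast majority of items met measurement expectations. Applying the Rasch model in assessment design provides reference data on measurement quality.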

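Laboratory teaching plays a very important role in the course on mechanical engineering measurement technology. In response to problems in laboratory teaching, this paper discusses the experiment content, the equipment, and the laboratory teaching mode. The discussion is of some value for increasing students' interest in the course, strengthening their understanding of the theory, and training high-level, application-oriented personnel with innovative ability.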

The purpose of the current study is to examine the performance of four information criteria (Akaike's information criterion [AIC], corrected AIC [AICC], Bayesian information criterion [BIC], sample-size adjusted BIC [SABIC]) for detecting the correct number of latent classes in the mixture Rasch model through simulations. The simulation study manipulated various class-distinction features (percentages of class-variant items, magnitudes, and patterns of item difficulty differences) and mixing proportions, assuming that a mixture Rasch model with two latent classes was the true model. Unlike previous studies that showed BIC's superiority to other indices, our findings from this study suggested that the four information criteria had differential performance depending on the percentage of class-variant items and the magnitude and pattern of item difficulty differences under a two-class structure. Furthermore, the present study revealed that AICC and SABIC generally performed as well as or better than their counterparts, AIC and BIC, respectively, for the two-class structure with a sample of 3,000.
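
The four criteria compared above have standard textbook definitions, sketched below under stated assumptions. The abstract does not give formulas, so the function name, argument names and example values are hypothetical, and the code is not taken from the study.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, AICC, BIC and SABIC for a fitted model.

    log_likelihood: maximized log-likelihood of the model
    n_params:       number of freely estimated parameters (k)
    n_obs:          sample size (n)
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICC requires n_obs > n_params + 1")
    deviance = -2.0 * log_likelihood
    aic = deviance + 2.0 * n_params
    # Small-sample correction added to AIC.
    aicc = aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = deviance + n_params * math.log(n_obs)
    # Sample-size adjusted BIC replaces n with (n + 2) / 24 in the penalty.
    sabic = deviance + n_params * math.log((n_obs + 2.0) / 24.0)
    return {"AIC": aic, "AICC": aicc, "BIC": bic, "SABIC": sabic}


# Hypothetical fit results, for illustration only.
example = information_criteria(log_likelihood=-100.0, n_params=5, n_obs=100)
```

In a class-enumeration setting, each candidate model (one class, two classes, and so on) would be fitted separately and the model with the smallest value of a given criterion would be preferred; the criteria can disagree because their penalty terms grow differently with `n_params` and `n_obs`.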

