Similar Articles
20 similar articles found (search time: 375 ms)
1.
The Multidimensional School Anger Inventory–Revised (MSAI-R) is a measurement tool to evaluate high school students' anger. Its psychometric features have been tested in the USA, Australia, Japan, Guatemala, and Italy. This study investigates the factor structure and psychometric quality of the Persian version of the MSAI-R using data from an administration of the inventory to 585 Iranian high school students. The study adopted the four-factor underlying structure of high school student anger derived through factor analysis in previous validation studies, which consists of: School Hostility, Anger Experience, Positive Coping, and Destructive Expression. Confirmatory factor analysis of this four-factor model indicated that it fit the data better than a one-factor baseline model, although the fit was not perfect. The Rasch model showed very high internal consistency among items, with no item misfitting; however, our results suggest that some items should be added to Positive Coping and Destructive Expression to represent those constructs sufficiently. This finding is in agreement with Boman, Curtis, Furlong, and Smith's Rasch analysis of the MSAI-R with an Australian sample. Overall, the results from this study support the psychometric features of the Persian MSAI-R. However, results from some test items also point to the dangers inherent in adapting the same test stimuli to widely divergent cultures.

2.
The RCMLM is a general extension model based on Rasch measurement theory. Using the RCMLM, a differential item functioning (DIF) analysis by gender was conducted on a general senior high school mathematics test. The results show that the model can perform DIF analysis on dichotomously and polytomously scored items simultaneously, avoiding the drawback of analyzing the two scoring formats separately; this preserves the integrity of the test and makes the DIF results more valid.
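A model-based DIF analysis like the one above can be cross-checked with a classical, model-free screen. The sketch below is an illustrative assumption, not the RCMLM procedure from the study: it computes the Mantel-Haenszel common odds ratio for one dichotomous item across total-score strata, where values near 1 suggest no DIF between the two groups.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for DIF screening (sketch).

    strata: list of (ref_correct, ref_wrong, focal_correct, focal_wrong)
    tuples, one per total-score stratum. A ratio near 1 suggests no DIF;
    ratios far from 1 suggest the item favors one group.
    """
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n  # ref-correct x focal-wrong
        den += b * c / n  # ref-wrong x focal-correct
    return num / den

# Balanced stratum: both groups answer correctly at the same rate -> OR = 1.
print(mh_odds_ratio([(30, 10, 15, 5)]))
```

With matched ability strata, a ratio well above 1 would flag the item as easier for the reference group.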

3.
Numerous researchers have proposed methods for evaluating the quality of rater‐mediated assessments using nonparametric methods (e.g., kappa coefficients) and parametric methods (e.g., the many‐facet Rasch model). Generally speaking, popular nonparametric methods for evaluating rating quality are not based on a particular measurement theory. On the other hand, popular parametric methods for evaluating rating quality are often based on measurement theories such as invariant measurement. However, these methods are based on assumptions and transformations that may not be appropriate for ordinal ratings. In this study, I show how researchers can use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, to evaluate rating quality within the framework of invariant measurement without the use of potentially inappropriate parametric techniques. I use an illustrative analysis of data from a rater‐mediated writing assessment to demonstrate how one can use numeric and graphical indicators from MSA to gather evidence of validity, reliability, and fairness. The results from the analyses suggest that MSA provides a useful framework within which to evaluate rater‐mediated assessments for evidence of validity, reliability, and fairness that can supplement existing popular methods for evaluating ratings.
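Mokken scale analysis rests on scalability coefficients rather than a parametric model. As a minimal illustration (pure Python, dichotomous responses only; dedicated MSA software such as the R package mokken computes far more), the overall scalability coefficient H can be sketched as one minus the ratio of observed to expected Guttman errors:

```python
from itertools import combinations

def scalability_H(data):
    """Overall Mokken scalability coefficient H for dichotomous items (sketch).

    data: list of person response vectors (0/1), one column per item.
    H = 1 - (observed Guttman errors) / (errors expected under independence);
    H = 1 for a perfect Guttman scale, lower values for noisier data.
    """
    n = len(data)
    k = len(data[0])
    p = [sum(row[i] for row in data) / n for i in range(k)]  # item popularities
    obs = exp = 0.0
    for i, j in combinations(range(k), 2):
        easy, hard = (i, j) if p[i] >= p[j] else (j, i)
        # Guttman error: pass the harder item while failing the easier one
        obs += sum(1 for row in data if row[hard] == 1 and row[easy] == 0)
        exp += n * p[hard] * (1 - p[easy])
    return 1 - obs / exp

# Perfect Guttman pattern -> H = 1.0
print(scalability_H([[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 0, 0]]))
```

In MSA practice, H >= 0.3 is a common minimum for a usable scale; the same coefficient generalizes to polytomous ratings in rater-mediated settings.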

4.
When rating scales are used in different countries, thorough investigation of the psychometric properties is needed. We examined the internal structure of the Finnish translated Behavioral and Emotional Rating Scale-2 (BERS-2) using Rasch and confirmatory factor analysis approaches with a sample of youth, parents, and teachers. The results suggested that the Finnish translated BERS-2 has acceptable measurement properties and is suitable for use in Finnish schools. Results highlighted the issue that there is a need to consider cross-cultural aspects when introducing new measures in another culture. Directions for future research are also discussed in light of present findings.

5.
In the article "Examining Rater Errors in the Assessment of Written Composition With a Many-Faceted Rasch Model" (JEM, Volume 31, Number 2, Summer 1994), the data presented in Figure 3 may be misleading. The "four clear spikes" (p. 106) that appear in Figure 3 were highlighted by the automatic scaling procedure used by the computer program that generated this histogram; as is well known, the use of different scaling units would yield histograms with different shapes (Moore & McCabe, 1993). For example, when the same data are presented as a bar chart (see Figure 1 below) rather than as a histogram, the four spikes are not evident. As graphical procedures become more readily available to measurement researchers, additional research and discussion are needed regarding standards for evaluating data displays that do not simply reproduce the actual data values.

6.
In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales. From a theoretical perspective, they discuss the historical and philosophical perspectives on measurement with a focus on Rasch's concept of specific objectivity and invariant measurement. Specifically, they introduce the origins of Rasch measurement theory, the development of model‐data fit indices, as well as commonly used Rasch measurement models. From an applied perspective, they discuss best practices in constructing, estimating, evaluating, and interpreting a Rasch scale using empirical examples. They provide an overview of a specialized Rasch software program (Winsteps) and an R program embedded within Shiny (Shiny_ERMA) for conducting the Rasch model analyses. The module is designed to be relevant for students, researchers, and data scientists in various disciplines such as psychology, sociology, education, business, health, and other social sciences. It contains audio‐narrated slides, sample data, syntax files, access to Shiny_ERMA program, diagnostic quiz questions, data‐based activities, curated resources, and a glossary.
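The dichotomous Rasch model underlying the module has a one-line core: the log-odds of a correct response is the difference between person ability theta and item difficulty b. A minimal sketch (not tied to Winsteps or Shiny_ERMA) also illustrates the specific objectivity property the authors emphasize: the comparison between two persons comes out the same whichever item is used.

```python
import math

def rasch_p(theta, b):
    """Dichotomous Rasch model: P(X=1) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(theta, b):
    """Logit of a correct response; under the Rasch model this equals theta - b."""
    p = rasch_p(theta, b)
    return math.log(p / (1.0 - p))

# Specific objectivity: the person comparison is item-free --
# log_odds(t1, b) - log_odds(t2, b) equals t1 - t2 for every item difficulty b.
for b in (-2.0, 0.0, 3.0):
    print(round(log_odds(1.5, b) - log_odds(0.5, b), 6))  # always 1.0
```

This item-free (and, symmetrically, person-free) comparison is what distinguishes the Rasch family from models with item-specific discrimination parameters.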

7.
《教育实用测度》 (Applied Measurement in Education), 2013, 26(3): 171-191
The purpose of this study is to describe a Many-Faceted Rasch (FACETS) model for the measurement of writing ability. The FACETS model is a multivariate extension of Rasch measurement models that can be used to provide a framework for calibrating both raters and writing tasks within the context of writing assessment. The use of the FACETS model for solving measurement problems encountered in the large-scale assessment of writing ability is presented here. A random sample of 1,000 students from a statewide assessment of writing ability is used to illustrate the FACETS model. The data suggest that there are significant differences in rater severity, even after extensive training. Small, but statistically significant, differences in writing-task difficulty were also found. The FACETS model offers a promising approach for addressing measurement problems encountered in the large-scale assessment of writing ability through written compositions.
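In a many-facet formulation, each rating's log-odds subtracts a rater severity term alongside person ability and task difficulty. The sketch below is a rating-scale variant with invented threshold values, not the paper's calibration; it computes the expected rating for a person-task-rater combination and shows how a severe rater depresses it.

```python
import math

def expected_rating(theta, delta, severity, thresholds):
    """Expected rating under a many-facet rating-scale Rasch model (sketch).

    theta: person ability; delta: task difficulty; severity: rater severity;
    thresholds: Rasch-Andrich thresholds tau_1..tau_m for categories 0..m.
    """
    # log psi_k = sum over j <= k of (theta - delta - severity - tau_j), psi_0 = 1
    logits = [theta - delta - severity - t for t in thresholds]
    log_psi = [0.0]
    for l in logits:
        log_psi.append(log_psi[-1] + l)
    probs = [math.exp(v) for v in log_psi]
    z = sum(probs)
    return sum(k * p for k, p in zip(range(len(probs)), probs)) / z

# Lenient vs. severe rater, same person and task, categories 0-2:
print(expected_rating(0.0, 0.0, 0.0, [-1.0, 1.0]))  # 1.0 (middle category)
print(expected_rating(0.0, 0.0, 1.0, [-1.0, 1.0]))  # lower expected rating
```

Calibrating severity on a common logit scale is what lets FACETS adjust scores for the rater a student happened to draw.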

8.
This study compares the Rasch item fit approach for detecting multidimensionality in response data with principal component analysis without rotation using simulated data. The data in this study were simulated to represent varying degrees of multidimensionality and varying proportions of items representing each dimension. Because the requirement of unidimensionality is necessary to preserve the desirable measurement properties of Rasch models, useful ways of testing this requirement must be developed. The results of the analyses indicate that both the principal component approach and the Rasch item fit approach work in a variety of multidimensional data structures. However, each technique is unable to detect multidimensionality in certain combinations of the level of correlation between the two variables and the proportion of items loading on the two factors. In cases where the intention is to create a unidimensional structure, one would expect few items to load on the second factor and the correlation between the factors to be high. The Rasch item fit approach detects dimensionality more accurately in these situations.
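The Rasch item-fit approach compared above rests on standardized residuals between observed responses and model expectations. A bare-bones sketch for one dichotomous item (real software layers significance rules and person-fit checks on top of these mean squares):

```python
import math

def item_fit(responses, thetas, b):
    """Outfit (unweighted) and infit (information-weighted) mean squares
    for one dichotomous Rasch item (sketch).

    responses[n]: person n's 0/1 answer; thetas[n]: person ability;
    b: item difficulty. Values near 1 indicate fit; large values flag
    misfit, e.g. an item drawing on a second dimension.
    """
    z2 = []
    num = den = 0.0
    for x, t in zip(responses, thetas):
        p = 1.0 / (1.0 + math.exp(-(t - b)))
        w = p * (1.0 - p)            # binomial variance (information)
        z2.append((x - p) ** 2 / w)  # squared standardized residual
        num += (x - p) ** 2
        den += w
    outfit = sum(z2) / len(z2)
    infit = num / den
    return outfit, infit

# Persons at the item's difficulty, half correct: both mean squares are 1.0.
print(item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0))
```

Outfit weights all residuals equally and so is sensitive to outlying persons far from the item; infit down-weights them, which is why the two are reported together.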

9.
The purpose of this study is to describe a Many-Faceted Rasch (FACETS) model for measuring writing ability. The FACETS model is a multivariate extension of Rasch measurement models that provides a framework for calibrating raters and writing tasks within writing assessment. This paper shows how the FACETS model can be applied to solve measurement problems encountered in large-scale writing assessment. A random sample of 1,000 students from a statewide writing examination is used to illustrate the model. The data show that rater severity differs significantly even after intensive training. The study also found small but statistically significant differences in writing-task difficulty. The FACETS model offers a promising approach to the measurement problems of large-scale assessment of writing ability through written compositions.

10.
A classroom’s social environment and student dispositions towards social interaction together exert a substantial influence on academic outcomes. The strength of this effect is highlighted by research showing the positive effect of cooperative learning on student achievement, but can also be seen in the contribution that student social dispositions, specifically the disposition toward helping others (i.e. prosociality), make to individual achievement. The current study sought to assess the psychometric properties of the original Cooperative Classroom Environment Measure (CCEM) and refine the measure to increase its validity for use in the classroom. The CCEM was developed to provide information to educators about factors in the classroom environment contributing to student prosociality. The original form was answered by 431 undergraduate students enrolled in an introductory life science class. Following data collection, both exploratory factor analysis (EFA) and Rasch analysis were used to remove problematic items and generate a refined form. The psychometric properties of this refined form were examined using Rasch and confirmatory factor analysis, and supported the presence of six (out of eight originally hypothesized) subscale constructs analogous to those influencing prosociality in other contexts. Additional evidence showed the presence of a single prominent underlying latent factor (termed Prosocial) that could account for significant variance in all subscale constructs. These findings provided preliminary evidence for the use of both the CCEM subscales and whole survey measures for investigations into optimizing classroom social environments for prosocial action.

11.
The term measurement disturbance has been used to describe systematic conditions that affect a measurement process, resulting in a compromised interpretation of person or item estimates. Measurement disturbances have been discussed in relation to systematic response patterns associated with items and persons, such as start‐up, plodding, boredom, or fatigue. An understanding of the different types of measurement disturbances can lead to a more complete understanding of persons or items in terms of the construct being measured. Although measurement disturbances have been explored in several contexts, they have not been explicitly considered in the context of performance assessments. The purpose of this study is to illustrate the use of graphical methods to explore measurement disturbances related to raters within the context of a writing assessment. Graphical displays that illustrate the alignment between expected and empirical rater response functions are considered as they relate to indicators of rating quality based on the Rasch model. Results suggest that graphical displays can be used to identify measurement disturbances for raters related to specific ranges of student achievement that suggest potential rater bias. Further, results highlight the added diagnostic value of graphical displays for detecting measurement disturbances that are not captured using Rasch model–data fit statistics.

12.
In this study, we used Rasch model analyses to examine (1) the unidimensionality of the alphabet knowledge construct and (2) the relative difficulty of different alphabet knowledge tasks (uppercase letter recognition, names, and sounds, and lowercase letter names) within a sample of preschoolers (n = 335). Rasch analysis showed that the four components of alphabet knowledge did work together as a unidimensional construct, indicating all alphabet tasks administered were measuring the same underlying skill. With regard to difficulty of tasks, letter recognition was easier than letter naming, which in turn was easier than letter sounds, and uppercase letter names were easier than lowercase letter names. Most notably, most of the alphabet tasks overlapped, and the Rasch models for the single tasks were no more reliable than the combined measure. This suggests that these alphabetic tasks do not measure distinct skills but are instead indicators of a single ability. Consequently, we support the conceptualization of alphabet knowledge as a unitary construct, and suggest that those assessing and teaching alphabet knowledge in preschool use tests and methods that combine the various alphabetic tasks rather than separating them. These combined assessments will be more likely to capture the range of abilities within a preschool sample and avoid the floor and ceiling effects that have so often complicated early literacy research.

13.
Rasch Measurement Principles and an Empirical Study of Their Application in Evaluating Gaokao Item Development
Wang Lei, 《中国考试》 (China Examinations), 2008(1): 32-39
Rasch measurement offers an objective, equal-interval scale for current educational and psychological measurement, overcoming classical test theory's dependence on the particular test instrument and on the sample. This paper introduces the principles of Rasch measurement and their concrete application to the analysis of sampled examinee data for evaluating Gaokao (national college entrance examination) item development, giving education policymakers and item writers intuitive graphical representations of the quantitative evaluation. The hope is that Rasch measurement can provide a new and valuable way of thinking about quantitative item evaluation in Gaokao sample-data analysis, and that it will be recognized and used effectively by policymakers and item writers.

14.
To overcome the test and sample dependence of classical test theory, this study applied the Rasch model to the quality analysis of a science literacy assessment for sixth-grade primary school students, illustrating its use through overall quality checks, unidimensionality checks, Wright maps, item-level analysis, and bubble charts. The items showed high reliability and validity and reasonable discrimination, and the vast majority met measurement expectations. Applying the Rasch model in assessment design provides useful reference data on measurement quality.

15.
This article compares the invariance properties of two methods of psychometric instrument calibration for the development of a measure of wealth among families of Grade 5 pupils in five provinces in Vietnam. The measure is based on self-reported lists of possessions in the home. Its stability has been measured over two time periods. The concept of fundamental measurement, and the properties of construct and measurement invariance, have been outlined. Item response modelling (IRM) and confirmatory factor modelling (CFM) as comparative methodologies, and the processes used for evaluating these, have been discussed. Each procedure was used to calibrate a 23-item instrument with data collected from a probability sample of Grade 5 pupils in a total of 60 schools. The two procedures were compared on the basis of their capacity to provide evidence of construct and measurement invariance, stability of parameter estimates, bias for or against subsamples, and the simplicity of the procedures and their interpretive powers. Both provided convincing evidence of construct invariance, but only the Rasch procedure was able to provide firm evidence of measurement invariance, parameter stability and a lack of bias across samples.

16.
The purpose of this study was to analyze and assess the Jordan National Test for Controlling the Quality of Science Instruction (NTCQSI) from the perspective provided by Rasch measurement. The test was administered to a stratified random sample that consisted of 41,556 tenth graders from all over Jordan. The test results were saved in a data bank, from which a random sample of 150 participants' records was selected. To address the purpose of this study, a series of analyses was conducted using the WINSTEPS and RUMM programs. The procedures used in this paper might be used by testing agencies worldwide to clarify or outline how Rasch measurement may be used to obtain evidence for the validity of inferences from test data.

17.
18.
The authors report on the development of a brief dyslexia screening measure based on revising the 65-item Hong Kong Behaviour Checklist of Specific Learning Difficulties in Reading and Writing. Teachers’ ratings of 1063 primary students aged 6–14 years on the behaviour checklist provided data for its psychometric evaluation using traditional measurement and Rasch measurement model analyses. Rasch scaling suggested that the revised 36-item checklist could be regarded as a unidimensional scale that assesses global dyslexic dysfunction, and receiver operating characteristics analysis suggested that a score of 18 could be an optimal cut-off score when it is used as a dyslexia screening measure. The validity of this revised checklist was supported by its substantial and significant correlations with external measures of literacy and cognitive skills. Implications of the findings for the use of adaptive testing to provide an effective procedure for screening are discussed.
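The receiver operating characteristic step can be illustrated with a toy screen: scan candidate cut-offs and keep the one that maximizes Youden's J (sensitivity + specificity - 1). The scores and labels below are invented for illustration; the study's cut-off of 18 came from its own 1063-student data.

```python
def best_cutoff(scores, labels):
    """Pick the screening cut-off maximizing Youden's J (sketch).

    scores: checklist totals; labels: 1 = confirmed case, 0 = non-case.
    A person is flagged when score >= cutoff. Returns (cutoff, J).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        j = tp / pos + tn / neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:
            best, best_j = c, j
    return best, best_j

# Perfectly separable toy data: the cut-off 20 attains J = 1.
print(best_cutoff([5, 10, 18, 20, 25, 30], [0, 0, 0, 1, 1, 1]))
```

In screening practice the cut-off is often chosen to favor sensitivity over specificity, since missed cases are costlier than false referrals; Youden's J weights the two equally.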

19.
This study examined the underlying structure of the Depression scale of the revised Minnesota Multiphasic Personality Inventory using dichotomous Rasch model and factor analysis. Rasch methodology was used to identify and restructure the Depression scale, and factor analysis was used to confirm the structure established by the Rasch model. The item calibration and factor analysis were carried out on the full sample of 2,600 normative subjects. The results revealed that the Depression scale did not consist of one homogeneous set of items, even though the scale was developed to measure one dimension of depression. Rasch analysis, as well as factor analysis, recognized two distinct content‐homogeneous subscales, here labeled mental depression and physical depression. The Rasch methodology provided a basis for a better understanding of the underlying structure and furnished a useful solution to the scale refinement.

20.
Although it has been claimed that the Rasch model leads to a higher degree of objectivity in measurement than has been previously possible, this model has had little impact on test development. Population-invariant item and ability calibrations, together with the statistical equivalency of any two item subsets, are supposedly possible if the item pool has been calibrated by the Rasch model. Initial research has been encouraging, but the implications of underlying assumptions and operational computations in the Rasch model for trait theory have not been clear from previous work. The current paper presents an analysis of the conditions under which the claims of objectivity will be substantiated, with special emphasis on the nature of equivalent forms. It is concluded that the real advantages of the Rasch model will not be apparent until the technology of trait measurement becomes more sophisticated.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号