Similar Articles
20 similar articles found.
1.
We illustrate testing measurement invariance in a second-order factor model using a quality of life dataset (n = 924). Measurement invariance was tested across 2 groups at a set of hierarchically structured levels: (a) configural invariance, (b) first-order factor loadings, (c) second-order factor loadings, (d) intercepts of measured variables, (e) intercepts of first-order factors, (f) disturbances of first-order factors, and (g) residual variances of observed variables. Given that measurement invariance at the factor loading and intercept levels was achieved, the latent factor mean difference on the higher order factor between the groups was also estimated. The analyses were performed on the mean and covariance structures within the framework of confirmatory factor analysis using the LISREL 8.51 program. Implications of second-order factor models and measurement invariance in psychological research are discussed.

2.
This study investigated the factorial invariance of scores from a 7th-grade state reading assessment across general education students and selected groups of students with disabilities. Confirmatory factor analysis was used to assess the fit of a 2-factor model to each of the 4 groups. In addition to overall fit of this model, 5 levels of constraint, including equal factor loadings, intercepts, error variances, factor variances, and factor covariances, were investigated. Invariance across the factor loadings and intercepts was supported across the groups of students with disabilities and general education students. Invariance for these groups was not supported for the error variances. For the students with mental retardation, the lack of fit of the 2-factor model and the observed score results suggested a mismatch between the difficulty level of this test and the ability level of these students. Although the results generally supported the score comparability of the reading assessment across these groups, further research is needed into the nature of the larger error variances for the students with disabilities groups and into accommodations and modifications for the students with mental retardation.

3.
The objective was to offer guidelines for applied researchers on how to weigh the consequences of errors made in evaluating measurement invariance (MI) on the assessment of factor mean differences. We conducted a simulation study to supplement the MI literature by focusing on choosing among analysis models with different numbers of between-group constraints imposed on loadings and intercepts of indicators. Data were generated with varying proportions, patterns, and magnitudes of differences in loadings and intercepts as well as factor mean differences and sample size. Based on the findings, we concluded that researchers who conduct MI analyses should recognize that relaxing as well as imposing constraints can affect Type I error rate, power, and bias of estimates in factor mean differences. In addition, fit indexes can be misleading in making decisions about constraints of loadings and intercepts. We offer suggestions for making MI decisions under uncertainty when assessing factor mean differences.

4.
This study examined the performance of the weighted root mean square residual (WRMR) through a simulation study using confirmatory factor analysis with ordinal data. Values and cut scores for the WRMR were examined, along with a comparison of its performance relative to commonly cited fit indexes. The findings showed that WRMR illustrated worse fit when sample size increased or model misspecification increased. Lower (i.e., better) values of WRMR were observed when nonnormal data were present, there were lower loadings, and when few categories were analyzed. WRMR generally illustrated expected patterns of relations to other well-known fit indexes. In general, a cutoff value of 1.0 appeared to work adequately under the tested conditions and the WRMR values of “good fit” were generally in agreement with other indexes. Users are cautioned that when the fitted model is misspecified, the index might provide misleading results in situations where extremely large sample sizes are used.
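The WRMR statistic evaluated in the study above can be sketched numerically. The code below is an illustration, not the authors' code: the function name and toy inputs are my own, and the formula follows the commonly cited definition WRMR = sqrt(Σ_r (s_r − σ̂_r)² / v_r / e), where e is the number of sample statistics and v_r estimates the asymptotic variance of the r-th statistic.

```python
import numpy as np

def wrmr(sample_stats, implied_stats, asymp_var):
    """Weighted root mean square residual (commonly cited form):
    WRMR = sqrt( sum_r (s_r - sigma_r)^2 / v_r / e ),
    where e is the number of sample statistics (means, thresholds,
    covariances) and v_r estimates the asymptotic variance of s_r."""
    s = np.asarray(sample_stats, dtype=float)
    sigma = np.asarray(implied_stats, dtype=float)
    v = np.asarray(asymp_var, dtype=float)
    e = s.size
    return float(np.sqrt(np.sum((s - sigma) ** 2 / v) / e))

# Perfectly reproduced sample statistics give WRMR = 0; residuals that
# are large relative to their sampling variability push WRMR above the
# 1.0 cutoff discussed in the study.
print(wrmr([0.5, 0.3, 1.0], [0.5, 0.3, 1.0], [0.01, 0.01, 0.02]))  # 0.0
```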

5.
We present a test for cluster bias, which can be used to detect violations of measurement invariance across clusters in 2-level data. We show how measurement invariance assumptions across clusters imply measurement invariance across levels in a 2-level factor model. Cluster bias is investigated by testing whether the within-level factor loadings are equal to the between-level factor loadings, and whether the between-level residual variances are zero. The test is illustrated with an example from school research. In a simulation study, we show that the cluster bias test has sufficient power, and the proportions of false positives are close to the chosen levels of significance.

6.
As a prerequisite for meaningful comparison of latent variables across multiple populations, measurement invariance or specifically factorial invariance has often been evaluated in social science research. Along with the changes in the model chi-square values, the comparative fit index (CFI; Bentler, 1990) is a widely used fit index for evaluating different stages of factorial invariance, including metric invariance (equal factor loadings), scalar invariance (equal intercepts), and strict invariance (equal unique factor variances). Although previous literature generally showed that the CFI performed well for single-group structural equation modeling analyses, its applicability to multiple group analyses such as factorial invariance studies has not been examined. In this study we argue that the commonly used default baseline model for the CFI might not be suitable for factorial invariance studies because (a) it is not nested within the scalar invariance model, and thus (b) the resulting CFI values might not be sensitive to the group differences in the measurement model. We therefore proposed a modified version of the CFI with an alternative (and less restrictive) baseline model that allows observed variables to be correlated. Monte Carlo simulation studies were conducted to evaluate the utility of this modified CFI across various conditions including varying degrees of noninvariance and different factorial invariance models. Results showed that the modified CFI outperformed both the conventional CFI and the ΔCFI (Cheung & Rensvold, 2002) in terms of sensitivity to small and medium noninvariance.
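The abstract's point that CFI depends on the choice of baseline model is easy to see from the formula itself. Below is a hedged numerical sketch (not the authors' modified index): the conventional CFI computed from target- and baseline-model chi-squares, with the baseline passed in explicitly so the dependence on it is visible. The hypothetical chi-square values are invented for illustration.

```python
def cfi(chi2, df, chi2_base, df_base):
    """Comparative fit index (Bentler, 1990):
    CFI = 1 - max(chi2 - df, 0) / max(chi2_base - df_base, chi2 - df, 0).
    The baseline model enters only through chi2_base and df_base, so a
    different (e.g., less restrictive) baseline yields a different CFI
    for the same target model."""
    d_target = max(chi2 - df, 0.0)
    d_base = max(chi2_base - df_base, d_target, 0.0)
    return 1.0 if d_base == 0.0 else 1.0 - d_target / d_base

# Same target model, two hypothetical baselines: the less restrictive
# baseline (smaller chi2_base) yields a lower CFI for identical misfit.
print(cfi(120.0, 50.0, 1500.0, 66.0))  # ~0.951
print(cfi(120.0, 50.0, 600.0, 60.0))   # ~0.870
```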

7.
This simulation study assesses the statistical performance of two mathematically equivalent parameterizations for multitrait–multimethod data with interchangeable raters—a multilevel confirmatory factor analysis (CFA) and a classical CFA parameterization. The sample sizes of targets and raters, the factorial structure of the trait factors, and rater missingness are varied. The classical CFA approach yields a high proportion of improper solutions under conditions with small sample sizes and indicator-specific trait factors. In general, trait factor related parameters are more sensitive to bias than other types of parameters. For multilevel CFAs, there is a drastic bias in fit statistics under conditions with unidimensional trait factors on the between level, where root mean square error of approximation (RMSEA) and χ2 distributions reveal a downward bias, whereas the between standardized root mean square residual is biased upwards. In contrast, RMSEA and χ2 for classical CFA models are severely upwardly biased in conditions with a high number of raters and a small number of targets.

8.
Contamination of responses due to extreme and midpoint response style can confound the interpretation of scores, threatening the validity of inferences made from survey responses. This study incorporated person-level covariates in the multidimensional item response tree model to explain heterogeneity in response style. We include an empirical example and two simulation studies to support the use and interpretation of the model: parameter recovery using Markov chain Monte Carlo (MCMC) estimation and performance of the model under conditions with and without response styles present. Mean bias and root mean square error for item intercepts were small at all sample sizes. Mean bias and root mean square error for item discriminations were also small but tended to be smaller when covariates were unrelated to, or had a weak relationship with, the latent traits. Item and regression parameters are estimated with sufficient accuracy when sample sizes are greater than approximately 1,000 and MCMC estimation with the Gibbs sampler is used. The empirical example uses the National Longitudinal Study of Adolescent to Adult Health’s sexual knowledge scale. Meaningful predictors associated with high levels of the extreme response latent trait included being non-White, being male, and having high levels of parental support and relationships. Meaningful predictors associated with high levels of the midpoint response latent trait included having low levels of parental support and relationships. Item-level covariates indicate the response style pseudo-items were less easy to endorse for self-oriented items, whereas the trait of interest pseudo-items were easier to endorse for self-oriented items.

9.
Multigroup exploratory factor analysis (EFA) has gained popularity to address measurement invariance for two reasons. Firstly, repeatedly respecifying confirmatory factor analysis (CFA) models strongly capitalizes on chance and using EFA as a precursor works better. Secondly, the fixed zero loadings of CFA are often too restrictive. In multigroup EFA, factor loading invariance is rejected if the fit decreases significantly when fixing the loadings to be equal across groups. To locate the precise factor loading non-invariances by means of hypothesis testing, the factors’ rotational freedom needs to be resolved per group. In the literature, a solution exists for identifying optimal rotations for one group or invariant loadings across groups. Building on this, we present multigroup factor rotation (MGFR) for identifying loading non-invariances. Specifically, MGFR rotates group-specific loadings both to simple structure and between-group agreement, while disentangling loading differences from differences in the structural model (i.e., factor (co)variances).

10.
This study examined the measurement structure, cross-year stability of achievement goals, and mediating effects of achievement goals between self-efficacy and math grades in a national sample of Taiwan middle school students. The measurement model with factorial structure showed good fit to the data. In the panel data (N = 343), four achievement goals showed strong measurement invariance, suggesting factor loadings and intercepts of the items remained invariant across a year. Though mean scores of the four latent achievement goals held quite stable, the rank order of students across two time-points changed more profoundly in the two avoidance goals than in the approach goals. In the cross-sectional data (N = 748), we found approach-based goals were positive mediators between self-efficacy and math grades while avoidance-based goals were negative mediators. This result could be relevant for middle-school students in learning mathematics. Some instructional implications are provided.

11.
In previous research (Hu & Bentler, 1998, 1999), 2 conclusions were drawn: standardized root mean squared residual (SRMR) was the most sensitive to misspecified factor covariances, and a group of other fit indexes were most sensitive to misspecified factor loadings. Based on these findings, a 2-index strategy, that is, SRMR coupled with another index, was proposed in model fit assessment to detect potential misspecification in both the structural and measurement model parameters. Based on our reasoning and empirical work presented in this article, we conclude that SRMR is not necessarily most sensitive to misspecified factor covariances (structural model misspecification), the group of indexes (TLI, BL89, RNI, CFI, Gamma hat, Mc, or RMSEA) are not necessarily more sensitive to misspecified factor loadings (measurement model misspecification), and the rationale for the 2-index presentation strategy appears to have questionable validity.

12.
We compare the accuracy of confidence intervals (CIs) and tests of close fit based on the root mean square error of approximation (RMSEA) with those based on the standardized root mean square residual (SRMR). Investigations used normal and nonnormal data with models ranging from p = 10 to 60 observed variables. CIs and tests of close fit based on the SRMR are generally accurate across all conditions (even at p = 60 with nonnormal data). In contrast, CIs and tests of close fit based on the RMSEA are only accurate in small models. In larger models (p ≥ 30), they incorrectly suggest that models do not fit closely, particularly if sample size is less than 500.
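The RMSEA confidence intervals compared above are conventionally obtained by inverting the noncentral chi-square distribution of the test statistic (the Browne-Cudeck style procedure). The sketch below illustrates that standard procedure under the usual assumptions; the function names and example values are my own, not from the study.

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist, ncx2
from scipy.optimize import brentq

def rmsea(chi2, df, n):
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return float(np.sqrt(max(chi2 - df, 0.0) / (df * (n - 1))))

def rmsea_ci(chi2, df, n, level=0.90):
    """CI by inverting the noncentral chi-square distribution of the
    test statistic. Returns (lower, upper) on the RMSEA scale."""
    def cdf(lam):
        # noncentrality 0 reduces to a central chi-square
        return chi2_dist.cdf(chi2, df) if lam == 0.0 else ncx2.cdf(chi2, df, lam)
    def solve(p):
        # find noncentrality lam with P(X <= chi2 | lam) = p;
        # cdf decreases in lam, so the answer is 0 if cdf(0) is already below p
        if cdf(0.0) < p:
            return 0.0
        return brentq(lambda lam: cdf(lam) - p, 0.0, 10.0 * chi2 + 100.0)
    lam_lo = solve((1 + level) / 2)   # large cdf target -> small lambda
    lam_hi = solve((1 - level) / 2)   # small cdf target -> large lambda
    scale = lambda lam: float(np.sqrt(lam / (df * (n - 1))))
    return scale(lam_lo), scale(lam_hi)

# Example with invented values: chi2 = 100, df = 50, n = 500.
lo, hi = rmsea_ci(100.0, 50.0, 500)
print(rmsea(100.0, 50.0, 500), lo, hi)
```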

13.
Bootstrapping approximate fit indexes in structural equation modeling (SEM) is of great importance because most fit indexes do not have tractable analytic distributions. Model-based bootstrap, which has been proposed to obtain the distribution of the model chi-square statistic under the null hypothesis (Bollen & Stine, 1992), is not theoretically appropriate for obtaining confidence intervals (CIs) for fit indexes because it assumes the null is exactly true. On the other hand, naive bootstrap is not expected to work well for those fit indexes that are based on the chi-square statistic, such as the root mean square error of approximation (RMSEA) and the comparative fit index (CFI), because sample noncentrality is a biased estimate of the population noncentrality. In this article we argue that a recently proposed bootstrap approach due to Yuan, Hayashi, and Yanagihara (YHY; 2007) is ideal for bootstrapping fit indexes that are based on the chi-square. This method transforms the data so that the “parent” population has the population noncentrality parameter equal to the estimated noncentrality in the original sample. We conducted a simulation study to evaluate the performance of the YHY bootstrap and the naive bootstrap for 4 indexes: RMSEA, CFI, goodness-of-fit index (GFI), and standardized root mean square residual (SRMR). We found that for RMSEA and CFI, the CIs under the YHY bootstrap had relatively good coverage rates for all conditions, whereas the CIs under the naive bootstrap had very low coverage rates when the fitted model had large degrees of freedom. However, for GFI and SRMR, the CIs under both bootstrap methods had poor coverage rates in most conditions.

14.
Typical confirmatory factor analysis studies of factorial invariance test parameter (factor loadings, factor variances/covariances, and uniquenesses) invariance across only two groups (e.g., males and females) or, perhaps, across more than two groups reflecting different levels of a single design facet (e.g., age). The present investigation extends this approach by considering invariance across groups from a two‐facet design. Data consist of multiple dimensions of self‐concept collected from eight groups of students (total N = 4,000) representing a 2 (Gender) × 4 (Age) design. The gender‐stereotypic model posits a particular pattern of gender differences in structure that varies with age. Adopting analysis‐of‐variance terminology, the model posits that structural differences will vary as a function of gender but that this gender effect interacts with age. In testing this model, I consider the lack of invariance in different sets of parameters attributable to gender, age, and their interaction.

15.
The Classroom Appraisal of Resources and Demands (CARD) was designed to evaluate teacher stress based on subjective evaluations of classroom demands and resources. However, the CARD has been mostly utilized in western countries. The aim of the current study was to provide aspects of the validity of responses to a Chinese version of the CARD that considers Chinese teachers’ unique vocational conditions in the classroom. A sample of 580 Chinese elementary school teachers (510 female teachers and 70 male teachers) were asked to respond to the Chinese version of the CARD. Confirmatory factor analyses showed that the data fit the theoretical model very well (e.g., CFI: .982; NFI: .977; GFI: .968; SRMR: .028; RMSEA: .075; where CFI is comparative fit index, NFI is normed fit index, GFI is goodness of fit, SRMR is standardized root mean square residual, RMSEA is root mean square error of approximation), thus providing evidence of construct validity. Latent constructs of the Chinese version of the CARD were also found to be significantly associated with other measures that are related to teacher stress such as self‐efficacy, job satisfaction, personal habits to deal with stress, and intention to leave their current job.

16.
Model fit indices are being increasingly recommended and used to select the number of factors in an exploratory factor analysis. Growing evidence suggests that the recommended cutoff values for common model fit indices are not appropriate for use in an exploratory factor analysis context. A particularly prominent problem in scale evaluation is the ubiquity of correlated residuals and imperfect model specification. Our research focuses on a scale evaluation context and the performance of four standard model fit indices: root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker–Lewis index (TLI), and two equivalence test-based model fit indices: RMSEAt and CFIt. We use Monte Carlo simulation to generate and analyze data based on a substantive example using the positive and negative affective schedule (N = 1,000). We systematically vary the number and magnitude of correlated residuals as well as nonspecific misspecification, to evaluate the impact on model fit indices in fitting a two-factor exploratory factor analysis. Our results show that all fit indices, except SRMR, are overly sensitive to correlated residuals and nonspecific error, resulting in solutions that are overfactored. SRMR performed well, consistently selecting the correct number of factors; however, previous research suggests it does not perform well with categorical data. In general, we do not recommend using model fit indices to select number of factors in a scale evaluation framework.
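The SRMR that performs well in the study above has a simple closed form: the square root of the average squared residual between the sample and model-implied correlation matrices. The sketch below uses my own function name and toy matrices; it follows the standard formula over the p(p+1)/2 unique elements (diagonal included, though for correlation matrices the diagonal residuals are zero).

```python
import numpy as np

def srmr(sample_corr, implied_corr):
    """SRMR for correlation matrices: root of the mean squared residual
    over the p(p+1)/2 unique elements (lower triangle plus diagonal)."""
    s = np.asarray(sample_corr, dtype=float)
    sigma = np.asarray(implied_corr, dtype=float)
    p = s.shape[0]
    idx = np.tril_indices(p)  # unique elements: lower triangle + diagonal
    resid = s[idx] - sigma[idx]
    return float(np.sqrt(np.mean(resid ** 2)))

sample = np.array([[1.0, 0.5], [0.5, 1.0]])
implied = np.array([[1.0, 0.3], [0.3, 1.0]])
print(srmr(sample, implied))  # sqrt(0.04 / 3) ≈ 0.115
```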

17.
We estimated the invariance of educational achievement (EA) and learning attitudes (LA) measures across nations. A multi-group confirmatory factor analysis was used to estimate the invariance of educational achievement and learning attitudes across 55 nations (Programme for International Student Assessment [PISA] 2006 data, N = 354,203). The constructs had the same meaning (factor loadings) but different scales (intercepts). Our conclusion is that comparisons of the relationships between educational achievement and learning attitudes across countries need to take into consideration two sources of variability: individual differences of students and group differences of educational systems. The lack of scalar invariance in EA and LA measures means that the relationships between EA and LA may have a different meaning at the level of nations and at the student level within countries. In other words, as PISA measures are not invariant in the scalar sense, comparisons across countries with nationally aggregated scores are not justified.

18.
This study presents a new approach to synthesizing differential item functioning (DIF) effect size: First, using correlation matrices from each study, we perform a multigroup confirmatory factor analysis (MGCFA) that examines measurement invariance of a test item between two subgroups (i.e., focal and reference groups). Then we synthesize, across the studies, the differences in the estimated factor loadings between the two subgroups, resulting in a meta-analytic summary of the MGCFA effect sizes (MGCFA-ES). The performance of this new approach was examined using a Monte Carlo simulation, where we created 108 conditions by four factors: (1) three levels of item difficulty, (2) four magnitudes of DIF, (3) three levels of sample size, and (4) three types of correlation matrix (tetrachoric, adjusted Pearson, and Pearson). Results indicate that when MGCFA is fitted to tetrachoric correlation matrices, the meta-analytic summary of the MGCFA-ES performed best in terms of bias and mean square error values, 95% confidence interval coverages, empirical standard errors, Type I error rates, and statistical power; and reasonably well with adjusted Pearson correlation matrices. In addition, when tetrachoric correlation matrices are used, a meta-analytic summary of the MGCFA-ES performed well, particularly, under the condition that a high difficulty item with a large DIF was administered to a large sample size. Our result offers an option for synthesizing the magnitude of DIF on a flagged item across studies in practice.

19.
Factorial invariance assessment is central in the development of educational and psychological assessments. Establishing invariance of factor structures is key for building a strong score and inference validity argument and assists in establishing the fairness of score use. Fit indices and guidelines for judging a lack of invariance are an ongoing line of research. In this study, the authors examined the performance of the root mean squared error of approximation equivalence testing approach described by Yuan and Chan in the context of measurement invariance assessment. This investigation was completed through a simulation study in which several factors were varied, including sample size, type of invariance tested, and magnitude and percent of a lack of invariance. The findings generally support the use of equivalence testing for situations in which the indicator variables were normally distributed, particularly for total sample sizes of 200 or more.

20.
Evaluating Goodness-of-Fit Indexes for Testing Measurement Invariance
Measurement invariance is usually tested using Multigroup Confirmatory Factor Analysis, which examines the change in the goodness-of-fit index (GFI) when cross-group constraints are imposed on a measurement model. Although many studies have examined the properties of GFI as indicators of overall model fit for single-group data, there have been none to date that examine how GFIs change when between-group constraints are added to a measurement model. The lack of a consensus about what constitutes significant GFI differences places limits on measurement invariance testing. We examine 20 GFIs based on the minimum fit function. A simulation under the two-group situation was used to examine changes in the GFIs (ΔGFIs) when invariance constraints were added. Based on the results, we recommend using Δcomparative fit index, ΔGamma hat, and ΔMcDonald's Noncentrality Index to evaluate measurement invariance. These three ΔGFIs are independent of both model complexity and sample size, and are not correlated with the overall fit measures. We propose critical values of these ΔGFIs that indicate measurement invariance.
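Two of the three recommended ΔGFIs rest on simple closed-form indexes. The sketch below shows one commonly cited form of each (the exact variants used in the study may differ, e.g., in whether noncentrality is divided by N or N − 1); function names and example values are my own.

```python
import math

def gamma_hat(chi2, df, n, p):
    """Gamma hat (one common form): p / (p + 2*d), with
    d = max(chi2 - df, 0) / n the estimated noncentrality per
    observation and p the number of observed variables."""
    d = max(chi2 - df, 0.0) / n
    return p / (p + 2.0 * d)

def mcdonald_nci(chi2, df, n):
    """McDonald's Noncentrality Index: exp(-d / 2), with the same d."""
    d = max(chi2 - df, 0.0) / n
    return math.exp(-d / 2.0)

def delta_gfi(constrained, unconstrained):
    """Change in a fit index when invariance constraints are added;
    the study proposes critical values for such differences."""
    return constrained - unconstrained

# A model with chi2 = df has zero estimated noncentrality, so both
# indexes reach their maximum of 1; adding constraints that worsen
# fit drives the deltas negative.
print(mcdonald_nci(50.0, 50.0, 500))  # 1.0
print(delta_gfi(gamma_hat(80.0, 52.0, 500, 10),
                gamma_hat(60.0, 50.0, 500, 10)))
```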


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号