Similar literature
 20 similar articles found (search time: 46 ms)
1.
With the increasing use of international survey data especially in cross-cultural and multinational studies, establishing measurement invariance (MI) across a large number of groups in a study is essential. Testing MI over many groups is methodologically challenging, however. We identified 5 methods for MI testing across many groups (multiple group confirmatory factor analysis, multilevel confirmatory factor analysis, multilevel factor mixture modeling, Bayesian approximate MI testing, and alignment optimization) and explicated the similarities and differences of these approaches in terms of their conceptual models and statistical procedures. A Monte Carlo study was conducted to investigate the efficacy of the 5 methods in detecting measurement noninvariance across many groups using various fit criteria. Generally, the 5 methods showed reasonable performance in identifying the level of invariance if an appropriate fit criterion was used (e.g., Bayesian information criterion with multilevel factor mixture modeling). Finally, general guidelines in selecting an appropriate method are provided.

2.
Confirmatory factor analytic procedures are routinely implemented to provide evidence of measurement invariance. Current lines of research focus on the accuracy of common analytic steps used in confirmatory factor analysis for invariance testing. However, the few studies that have examined this procedure have done so with perfectly or near perfectly fitting models. In the present study, the authors examined procedures for detecting simulated test structure differences across groups under model misspecification conditions. In particular, they manipulated sample size, number of factors, number of indicators per factor, percentage of a lack of invariance, and model misspecification. Model misspecification was introduced at the factor loading level. They evaluated three criteria for detection of invariance, including the chi-square difference test, the difference in comparative fit index values, and the combination of the two. Results indicate that misspecification was associated with elevated Type I error rates in measurement invariance testing.
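
The three detection criteria named in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the function names, the 0.05 alpha level, and the reading of "combination" as "both criteria flag noninvariance" are assumptions made here for illustration. The -0.01 cutoff for the change in CFI is the commonly cited convention from Cheung & Rensvold (2002), and the CFI formula is the standard one based on noncentrality relative to a baseline model.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{i<k} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df = 2k+1:
    # erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{j<k} x^j / (1*3*...*(2j+1))
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index of a model relative to a baseline model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def invariance_flags(free, constrained, baseline,
                     alpha=0.05, delta_cfi_cutoff=-0.01):
    """Each argument is a (chi2, df) pair; flags are True for noninvariance.

    alpha, delta_cfi_cutoff, and the 'both must flag' rule are assumptions.
    """
    p_value = chi2_sf(constrained[0] - free[0], constrained[1] - free[1])
    delta_cfi = cfi(*constrained, *baseline) - cfi(*free, *baseline)
    chi2_flag = p_value < alpha
    cfi_flag = delta_cfi <= delta_cfi_cutoff
    return {
        "p_value": p_value,
        "delta_cfi": delta_cfi,
        "chi2_flag": chi2_flag,
        "cfi_flag": cfi_flag,
        "combined_flag": chi2_flag and cfi_flag,
    }


# Hypothetical fit statistics: (chi-square, degrees of freedom)
result = invariance_flags(free=(100.0, 48), constrained=(120.0, 54),
                          baseline=(1500.0, 72))
```

With these made-up numbers the chi-square difference test flags noninvariance while the change in CFI does not, which is the kind of disagreement between criteria that such simulation studies examine.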

3.

The aim of the study is to investigate the measurement invariance of mathematics self-concept and self-efficacy across 40 countries that participated in the Programme for International Student Assessment (PISA) 2003 and 2012 cycles. The sample of the study consists of 271,760 students in PISA 2003 and 333,804 students in PISA 2012. First, traditional measurement invariance testing was applied using multiple-group confirmatory factor analysis (MGCFA). Then, alignment analyses were performed, which estimate all of the parameters while keeping non-invariance to a minimum. Results from MGCFA indicate that mathematics self-concept and self-efficacy hold metric invariance across the 80 groups (cycle by country). The alignment method results suggest that a large proportion of non-invariance exists in both mathematics self-concept and self-efficacy factors, and the factor means cannot be compared across all participating countries. Results of the Monte Carlo simulation show that the alignment results are trustworthy. Implications and limitations are discussed, and some recommendations for future research are proposed.


4.
In latent growth modeling, measurement invariance across groups has received little attention. Considering that a group difference is commonly of interest in social science, a Monte Carlo study explored the performance of multigroup second-order latent growth modeling (MSLGM) in testing measurement invariance. True positive and false positive rates in detecting noninvariance across groups in addition to bias estimates of major MSLGM parameters were investigated. Simulation results support the suitability of MSLGM for measurement invariance testing when either the forward or the iterative likelihood ratio procedure is applied.

5.
Several structural equation modeling (SEM) strategies were developed for assessing measurement invariance (MI) across groups relaxing the assumptions of strict MI to partial, approximate, and partial approximate MI. Nonetheless, applied researchers still do not know if and under what conditions these strategies might provide results that allow for valid comparisons across groups in large-scale comparative surveys. We perform a comprehensive Monte Carlo simulation study to assess the conditions under which various SEM methods are appropriate to estimate latent means and path coefficients and their differences across groups. We find that while SEM path coefficients are relatively robust to violations of full MI and can be rather effectively recovered, recovering latent means and their group rankings might be difficult. Our results suggest that, contrary to some previous recommendations, partial invariance may rather effectively recover both path coefficients and latent means even when the majority of items are noninvariant. Although it is more difficult to recover latent means using approximate and partial approximate MI methods, it is possible under specific conditions and using appropriate models. These models also have the advantage of providing accurate standard errors. Alignment is recommended for recovering latent means in cases where there are only a few noninvariant parameters across groups.

6.
As a prerequisite for meaningful comparison of latent variables across multiple populations, measurement invariance or specifically factorial invariance has often been evaluated in social science research. Along with changes in the model chi-square values, the comparative fit index (CFI; Bentler, 1990) is a widely used fit index for evaluating different stages of factorial invariance, including metric invariance (equal factor loadings), scalar invariance (equal intercepts), and strict invariance (equal unique factor variances). Although previous literature generally showed that the CFI performed well for single-group structural equation modeling analyses, its applicability to multiple group analyses such as factorial invariance studies has not been examined. In this study we argue that the commonly used default baseline model for the CFI might not be suitable for factorial invariance studies because (a) it is not nested within the scalar invariance model, and thus (b) the resulting CFI values might not be sensitive to the group differences in the measurement model. We therefore proposed a modified version of the CFI with an alternative (and less restrictive) baseline model that allows observed variables to be correlated. Monte Carlo simulation studies were conducted to evaluate the utility of this modified CFI across various conditions including varying degrees of noninvariance and different factorial invariance models. Results showed that the modified CFI outperformed both the conventional CFI and the ΔCFI (Cheung & Rensvold, 2002) in terms of sensitivity to small and medium noninvariance.

7.
In testing the factorial invariance of a measure across groups, the groups are often of different sizes. Large imbalances in group size might affect the results of factorial invariance studies and lead to incorrect conclusions of invariance because the fit function in multiple-group factor analysis includes a weighting by group sample size. The implication is that violations of invariance might not be detected if the sample sizes of the 2 groups are severely unbalanced. In this study, we examined the effects of group size differences on results of factorial invariance tests, proposed a subsampling method to address the unbalanced sample size issue in factorial invariance studies, and evaluated the proposed approach in various simulation conditions. Our findings confirm that violations of invariance might be masked in the case of severely unbalanced group size conditions and support the use of the proposed subsampling method to obtain accurate results for invariance studies.
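
The abstract does not spell out the subsampling procedure, so the following is only a generic sketch of the idea: repeatedly draw a random subsample of the larger group that matches the smaller group's size, run the invariance test on each balanced pair, and summarize across draws. The function names, the number of draws, and the `test_invariance` callable are hypothetical and stand in for whatever model-fitting routine is actually used.

```python
import random


def balanced_subsamples(large_group, target_n, n_draws, seed=0):
    """Draw n_draws random subsamples (without replacement) of size target_n."""
    if target_n > len(large_group):
        raise ValueError("target_n cannot exceed the size of the larger group")
    rng = random.Random(seed)  # seeded so the draws are reproducible
    return [rng.sample(large_group, target_n) for _ in range(n_draws)]


def proportion_flagged(large_group, small_group, test_invariance,
                       n_draws=100, seed=0):
    """Share of balanced draws in which noninvariance is detected.

    test_invariance is a hypothetical user-supplied callable:
    (group_a, group_b) -> bool, True when noninvariance is detected.
    """
    draws = balanced_subsamples(large_group, len(small_group), n_draws, seed)
    flags = [test_invariance(sub, small_group) for sub in draws]
    return sum(flags) / len(flags)
```

Summarizing over many draws, rather than relying on a single subsample, is a design choice made here so that the conclusion does not hinge on one random split.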

8.
As low-stakes testing contexts increase, low test-taking effort may serve as a serious validity threat. One common solution to this problem is to identify noneffortful responses and treat them as missing during parameter estimation via the effort-moderated item response theory (EM-IRT) model. Although this model has been shown to outperform traditional IRT models (e.g., two-parameter logistic [2PL]) in parameter estimation under simulated conditions, prior research has failed to examine its performance under violations to the model’s assumptions. Therefore, the objective of this simulation study was to examine item and mean ability parameter recovery when violating the assumptions that noneffortful responding occurs randomly (Assumption 1) and is unrelated to the underlying ability of examinees (Assumption 2). Results demonstrated that, across conditions, the EM-IRT model provided robust item parameter estimates to violations of Assumption 1. However, bias values greater than 0.20 SDs were observed for the EM-IRT model when violating Assumption 2; nonetheless, these values were still lower than the 2PL model. In terms of mean ability estimates, model results indicated equal performance between the EM-IRT and 2PL models across conditions. Across both models, mean ability estimates were found to be biased by more than 0.25 SDs when violating Assumption 2. However, our accompanying empirical study suggested that this biasing occurred under extreme conditions that may not be present in some operational settings. Overall, these results suggest that the EM-IRT model provides superior item and equal mean ability parameter estimates in the presence of model violations under realistic conditions when compared with the 2PL model.

9.
A paucity of research has compared estimation methods within a measurement invariance (MI) framework and determined if research conclusions using normal-theory maximum likelihood (ML) generalize to the robust ML (MLR) and weighted least squares means and variance adjusted (WLSMV) estimators. Using ordered categorical data, this simulation study aimed to address these queries by investigating 342 conditions. When testing for metric and scalar invariance, Δχ2 results revealed that Type I error rates varied across estimators (ML, MLR, and WLSMV) with symmetric and asymmetric data. The Δχ2 power varied substantially based on the estimator selected, type of noninvariant indicator, number of noninvariant indicators, and sample size. Although some of the changes in approximate fit indexes (ΔAFI) are relatively sample size independent, researchers who use the ΔAFI with WLSMV should use caution, as these statistics do not perform well with misspecified models. As a supplemental analysis, our results evaluate and suggest cutoff values based on previous research.

10.
Evaluating Goodness-of-Fit Indexes for Testing Measurement Invariance   Cited by: 1 (self-citations: 0; cited by others: 1)
Measurement invariance is usually tested using Multigroup Confirmatory Factor Analysis, which examines the change in the goodness-of-fit index (GFI) when cross-group constraints are imposed on a measurement model. Although many studies have examined the properties of GFI as indicators of overall model fit for single-group data, there have been none to date that examine how GFIs change when between-group constraints are added to a measurement model. The lack of a consensus about what constitutes significant GFI differences places limits on measurement invariance testing. We examine 20 GFIs based on the minimum fit function. A simulation under the two-group situation was used to examine changes in the GFIs (ΔGFIs) when invariance constraints were added. Based on the results, we recommend using Δcomparative fit index, ΔGamma hat, and ΔMcDonald's Noncentrality Index to evaluate measurement invariance. These three ΔGFIs are independent of both model complexity and sample size, and are not correlated with the overall fit measures. We propose critical values of these ΔGFIs that indicate measurement invariance.

11.
Socioeconomic status (SES) is often used as a control variable when relations between academic outcomes and students' migrational background are investigated. When measuring SES, indicators used must have the same meaning across groups. This study aims to examine the measurement invariance of SES, using data from TIMSS 2003. The study shows that a latent SES variable has the same meaning across sub-populations with Swedish and non-Swedish background. However, the assumption of scalar invariance was rejected, which is essential for estimation of differences in latent means between groups. Comparisons between models assuming different degrees of scalar invariance indicated that models allowing partial scalar invariance should not be used when comparing latent variable means across groups of students with different migrational backgrounds.

12.
This article presents a new method for multiple-group confirmatory factor analysis (CFA), referred to as the alignment method. The alignment method can be used to estimate group-specific factor means and variances without requiring exact measurement invariance. A strength of the method is the ability to conveniently estimate models for many groups. The method is a valuable alternative to the currently used multiple-group CFA methods for studying measurement invariance that require multiple manual model adjustments guided by modification indexes. Multiple-group CFA is not practical with many groups due to poor model fit of the scalar model and too many large modification indexes. In contrast, the alignment method is based on the configural model and essentially automates and greatly simplifies measurement invariance analysis. The method also provides a detailed account of parameter invariance for every model parameter in every group.

13.
To date, no effective empirical method has been available to identify a truly invariant reference variable (RV) in testing measurement invariance under a multiple-group confirmatory factor analysis. This study proposes a method that, in selecting an RV, uses the smallest modification index (min-mod). The method’s performance is evaluated using 2 models: (a) a full invariance model, and (b) a partial invariance model. Results indicate that for both models the min-mod successfully identifies a truly invariant RV (Study 1). In Study 2, we use the RV found in Study 1 to further evaluate the performance of item-by-item Wald tests at locating a noninvariant variable. The results indicate that Wald tests overall performed better with an RV selected in a partial invariance model than an RV selected in a full invariance model, although in certain conditions their performances were rather similar. Implications and limitations of the study are also discussed.

14.
We present a test for cluster bias, which can be used to detect violations of measurement invariance across clusters in 2-level data. We show how measurement invariance assumptions across clusters imply measurement invariance across levels in a 2-level factor model. Cluster bias is investigated by testing whether the within-level factor loadings are equal to the between-level factor loadings, and whether the between-level residual variances are zero. The test is illustrated with an example from school research. In a simulation study, we show that the cluster bias test has sufficient power, and the proportions of false positives are close to the chosen levels of significance.

15.

Research related to the “teacher characteristics” dimension of teacher quality has proven inconclusive and weakly related to student success, and addressing the teaching contexts may be crucial for furthering this line of inquiry. International large-scale assessments are well positioned to undertake such questions due to their systematic sampling of students, schools, and education systems. However, researchers are frequently prohibited from answering such questions due to measurement invariance related issues. This study uses the traditional multiple group confirmatory factor analysis (MGCFA) and an alignment optimization method to examine measurement invariance in several constructs from the teacher questionnaires in the Trends in International Mathematics and Science Study (TIMSS) 2015 across 46 education systems. Constructs included mathematics teacher’s Job satisfaction, School emphasis on academic success, School condition and resources, Safe and orderly school, and teacher’s Self-efficacy. The MGCFA results show that just three constructs achieve invariance at the metric level. However, an alignment optimization method is applied, and results show that all five constructs fall within the threshold of acceptable measurement non-invariance. This study therefore presents an argument that they can be validly compared across education systems, and a subsequent comparison of latent factor means compares differences across the groups. Future research may utilize the estimated factor means from the aligned models in order to further investigate the role of teacher characteristics and contexts in student outcomes.


16.
Models of change typically assume longitudinal measurement invariance. Key constructs are often measured by ordered-categorical indicators (e.g., Likert scale items). If tests based on such indicators do not support longitudinal measurement invariance, it would be useful to gauge the practical significance of the detected non-invariance. The authors focus on the commonly used second-order latent growth curve model, proposing a sensitivity analysis that compares the growth parameter estimates from a model assuming the highest achieved level of measurement invariance to those from a model assuming a higher, incorrect level of measurement invariance as a measure of practical significance. A simulation study investigated the practical significance of non-invariance in different locations (loadings, thresholds, uniquenesses) in second-order latent linear growth models. The mean linear slope was affected by non-invariance in the loadings and thresholds, the intercept variance was affected by non-invariance in the uniquenesses, and the linear slope variance and intercept–slope covariance were affected by non-invariance in all three locations.

17.
Recently, researchers have used multilevel models for estimating intervention effects in single-case experiments that include replications across participants (e.g., multiple baseline designs) or for combining results across multiple single-case studies. Researchers estimating these multilevel models have primarily relied on restricted maximum likelihood (REML) techniques, but Bayesian approaches have also been suggested. The purpose of this Monte Carlo simulation study was to examine the impact of estimation method (REML versus Bayesian with noninformative priors) on the estimation of treatment effects (relative bias, root mean square error) and on the inferences about those effects (interval coverage) for autocorrelated multiple-baseline data. Simulated conditions varied with regard to the number of participants, series length, and distribution of the variance within and across participants. REML and Bayesian estimation led to estimates of the fixed effects that showed little to no bias but that differentially impacted the inferences about the fixed effects and the estimates of the variances. Implications for applied researchers and methodologists are discussed.
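
The three outcome measures named in this abstract (relative bias, root mean square error, interval coverage) have standard textbook definitions, sketched below for a single simulation condition. The function names and the example numbers are made up for illustration and are not taken from the study.

```python
from math import sqrt


def relative_bias(estimates, true_value):
    """(mean estimate - true value) / true value; true_value must be nonzero."""
    mean_estimate = sum(estimates) / len(estimates)
    return (mean_estimate - true_value) / true_value


def rmse(estimates, true_value):
    """Root mean square error of the estimates around the true value."""
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    return sqrt(sum(squared_errors) / len(squared_errors))


def coverage(intervals, true_value):
    """Proportion of (lower, upper) intervals that contain the true value."""
    hits = [lower <= true_value <= upper for lower, upper in intervals]
    return sum(hits) / len(hits)


# Hypothetical replications of a treatment-effect estimate whose true value is 2.0
estimates = [1.9, 2.1, 2.0, 2.2]
intervals = [(1.5, 2.5), (2.1, 2.9), (1.8, 2.2), (1.0, 1.9)]
summary = {
    "relative_bias": relative_bias(estimates, 2.0),
    "rmse": rmse(estimates, 2.0),
    "coverage": coverage(intervals, 2.0),
}
```

An estimator can show little bias and still yield poor coverage if its interval widths are off, which is why bias and coverage are reported separately.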

18.
We illustrate testing measurement invariance in a second-order factor model using a quality of life dataset (n = 924). Measurement invariance was tested across 2 groups at a set of hierarchically structured levels: (a) configural invariance, (b) first-order factor loadings, (c) second-order factor loadings, (d) intercepts of measured variables, (e) intercepts of first-order factors, (f) disturbances of first-order factors, and (g) residual variances of observed variables. Given that measurement invariance at the factor loading and intercept levels was achieved, the latent factor mean difference on the higher order factor between the groups was also estimated. The analyses were performed on the mean and covariance structures within the framework of the confirmatory factor analysis using the LISREL 8.51 program. Implications of second-order factor models and measurement invariance in psychological research were discussed.

19.
The study aims to investigate the effects of delivery modalities on psychometric characteristics and student performance on cognitive tests. A first study assessed the inductive reasoning ability of 715 students under the supervision of teachers. A second study examined 731 students’ performance on the application of the control-of-variables strategy in basic physics but without teacher supervision due to the COVID-19 pandemic. Rasch measurement showed that the online format fitted the data better in the unidimensional model across the two conditions. Under teacher supervision, paper-based testing was better than online testing in terms of reliability and total scores, but contradictory findings were obtained without teacher supervision. Although measurement invariance was confirmed between the two versions at the item level, the differential bundle functioning analysis supported the online groups on the item bundles constructed of figure-related materials. Response time was also discussed as an advantage of technology-based assessment for test development.

20.
The objective was to offer guidelines for applied researchers on how to weigh the consequences of errors made in evaluating measurement invariance (MI) on the assessment of factor mean differences. We conducted a simulation study to supplement the MI literature by focusing on choosing among analysis models with different numbers of between-group constraints imposed on loadings and intercepts of indicators. Data were generated with varying proportions, patterns, and magnitudes of differences in loadings and intercepts as well as factor mean differences and sample size. Based on the findings, we concluded that researchers who conduct MI analyses should recognize that relaxing as well as imposing constraints can affect Type I error rate, power, and bias of estimates in factor mean differences. In addition, fit indexes can be misleading in making decisions about constraints of loadings and intercepts. We offer suggestions for making MI decisions under uncertainty when assessing factor mean differences.
