Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Studies investigating invariance have often been limited to measurement or prediction invariance. Selection invariance, wherein the use of test scores for classification results in equivalent classification accuracy between groups, has received comparatively little attention in the psychometric literature. Previous research suggests that some form of selection bias (lack of selection invariance) will exist in most testing contexts where classification decisions are made, even when the conditions of measurement invariance are met. We define this conflict between measurement and selection invariance as the invariance paradox. Previous research has found test reliability to be an important factor in minimizing selection bias. This study demonstrates that the location of maximum test information may be a more important factor than overall test reliability in minimizing decision errors between groups.

2.
The computerization of reading assessments has presented a set of new challenges to test designers. From the vantage point of measurement invariance, test designers must investigate whether the traditionally recognized causes for violating invariance are still a concern in computer-mediated assessments. In addition, it is necessary to understand the technology-related causes of measurement invariance among test-taking populations. In this study, we used the available data (n = 800) from the previous administrations of the Pearson Test of English Academic (PTE Academic) reading, an international test of English comprising 10 test items, to investigate measurement invariance across gender and the Information and Communication Technology Development index (IDI). We conducted a multi-group confirmatory factor analysis (CFA) to assess invariance at four levels: configural, metric, scalar, and structural. Overall, we were able to confirm structural invariance for the PTE Academic, which is a necessary condition for conducting fair assessments. Implications for computer-based education and the assessment of reading are discussed.

3.
We present a test for cluster bias, which can be used to detect violations of measurement invariance across clusters in 2-level data. We show how measurement invariance assumptions across clusters imply measurement invariance across levels in a 2-level factor model. Cluster bias is investigated by testing whether the within-level factor loadings are equal to the between-level factor loadings, and whether the between-level residual variances are zero. The test is illustrated with an example from school research. In a simulation study, we show that the cluster bias test has sufficient power, and the proportions of false positives are close to the chosen levels of significance.
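
The two restrictions named in this abstract can be written out for a generic 2-level factor model. This is only a sketch: the symbols (Σ for the within- and between-level covariance matrices, Λ for loadings, Φ for factor covariances, Θ for residual covariances) are conventional notation assumed here, not taken from the abstract.

```latex
\begin{aligned}
\Sigma_W &= \Lambda_W \Phi_W \Lambda_W^{\top} + \Theta_W, \\
\Sigma_B &= \Lambda_B \Phi_B \Lambda_B^{\top} + \Theta_B, \\
H_0 &: \ \Lambda_B = \Lambda_W \quad \text{and} \quad \Theta_B = 0 .
\end{aligned}
```

Under this reading, a nonzero diagonal element of Θ_B would flag cluster bias in the corresponding indicator.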

4.

Research related to the “teacher characteristics” dimension of teacher quality has proven inconclusive and weakly related to student success, and addressing the teaching contexts may be crucial for furthering this line of inquiry. International large-scale assessments are well positioned to undertake such questions due to their systematic sampling of students, schools, and education systems. However, researchers are frequently prohibited from answering such questions due to measurement invariance related issues. This study uses the traditional multiple group confirmatory factor analysis (MGCFA) and an alignment optimization method to examine measurement invariance in several constructs from the teacher questionnaires in the Trends in International Mathematics and Science Study (TIMSS) 2015 across 46 education systems. Constructs included mathematics teacher’s Job satisfaction, School emphasis on academic success, School condition and resources, Safe and orderly school, and teacher’s Self-efficacy. The MGCFA results show that just three constructs achieve invariance at the metric level. However, an alignment optimization method is applied, and results show that all five constructs fall within the threshold of acceptable measurement non-invariance. This study therefore presents an argument that they can be validly compared across education systems, and a subsequent comparison of latent factor means compares differences across the groups. Future research may utilize the estimated factor means from the aligned models in order to further investigate the role of teacher characteristics and contexts in student outcomes.


5.
Multigroup confirmatory factor analysis (MCFA) is a popular method for the examination of measurement invariance and, specifically, factor invariance. Recent research has begun to focus on using MCFA to detect invariance for test items. MCFA requires certain parameters (e.g., factor loadings) to be constrained for model identification; these parameters are assumed to be invariant across groups and act as referent variables. When this invariance assumption is violated, locating the parameters that actually differ across groups becomes difficult. The factor ratio test and the stepwise partitioning procedure in combination have been suggested as methods to locate invariant referents, and appear to perform favorably with real data examples. However, the procedures have not been evaluated through simulations where the extent and magnitude of a lack of invariance is known. This simulation study examines these methods in terms of accuracy (i.e., true positive and false positive rates) of identifying invariant referent variables.

6.
Measurement invariance with respect to groups is an essential aspect of the fair use of scores of intelligence tests and other psychological measurements. It is widely believed that equal factor loadings are sufficient to establish measurement invariance in confirmatory factor analysis. Here, it is shown why establishing measurement invariance with confirmatory factor analysis requires a statistical test of the equality over groups of measurement intercepts. Without this essential test, measurement bias may be overlooked. A re-analysis of a study by Te Nijenhuis, Tolboom, Resing, and Bleichrodt (2004) on ethnic differences on the RAKIT IQ test illustrates that ignoring intercept differences may lead to the conclusion that bias of IQ tests with respect to minorities is small, while in reality bias is quite severe.
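
The role of the intercepts can be illustrated with the standard multigroup linear factor model. The notation below (τ for intercepts, Λ for loadings, κ for latent means, g for group) is a conventional choice assumed for this sketch and does not come from the abstract.

```latex
\begin{aligned}
x_{ig} &= \tau_g + \Lambda_g \xi_{ig} + \varepsilon_{ig},
\qquad E(\xi_{ig}) = \kappa_g, \quad E(\varepsilon_{ig}) = 0, \\
E(x \mid g) &= \tau_g + \Lambda_g \kappa_g, \\
E(x \mid 1) - E(x \mid 2) &= (\tau_1 - \tau_2) + \Lambda(\kappa_1 - \kappa_2)
\qquad \text{when } \Lambda_1 = \Lambda_2 = \Lambda .
\end{aligned}
```

Even with equal loadings, the observed mean difference mixes the intercept difference with the latent mean difference. It reflects only the latent difference when τ_1 = τ_2, which is why the equality of intercepts needs its own test.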

7.
Comparing self-perceived quality of teaching to students’ perception can be used in higher education to improve the quality of teaching of pre-service teachers in teacher education. However, comparing these measurements from different perspectives is only meaningful if the same constructs are being measured. To shed light on this comparison’s meaningfulness, we scrutinised whether aspects of quality of teaching are measured in the same way across pre-service teachers and their students by means of measurement invariance analyses. To do so, 272 pre-service teachers in teacher education rated aspects of their quality of teaching, and were rated by their 4851 students. Measurement invariance across these perspectives was tested in multilevel structural equation models. Strong measurement invariance held for two aspects of quality of teaching; for the third, one item lacked weak measurement invariance. Pre-service teachers rated their quality of teaching lower than their students did. In conclusion, aspects of quality of teaching can be compared across perspectives, and teacher education should encourage pre-service teachers to use students’ feedback as a valuable resource for improving their quality of teaching.

8.
Factor mixture modeling (FMM) has been increasingly used to investigate unobserved population heterogeneity. This study examined the issue of covariate effects with FMM in the context of measurement invariance testing. Specifically, the impact of excluding and misspecifying covariate effects on measurement invariance testing and class enumeration was investigated via Monte Carlo simulations. Data were generated based on FMM models with (1) a zero covariate effect, (2) a covariate effect on the latent class variable, and (3) covariate effects on both the latent class variable and the factor. For each population model, different analysis models that excluded or misspecified covariate effects were fitted. Results highlighted the importance of including proper covariates in measurement invariance testing and evidenced the utility of a model comparison approach in searching for the correct specification of covariate effects and the level of measurement invariance. This approach was demonstrated using an empirical data set. Implications for methodological and applied research are discussed.

9.
This article compares the invariance properties of two methods of psychometric instrument calibration for the development of a measure of wealth among families of Grade 5 pupils in five provinces in Vietnam. The measure is based on self-reported lists of possessions in the home. Its stability has been measured over two time periods. The concept of fundamental measurement, and the properties of construct and measurement invariance have been outlined. Item response modelling (IRM) and confirmatory factor modelling (CFM) as comparative methodologies, and the processes used for evaluating these, have been discussed. Each procedure was used to calibrate a 23-item instrument with data collected from a probability sample of Grade 5 pupils in a total of 60 schools. The two procedures were compared on the basis of their capacity to provide evidence of construct and measurement invariance, stability of parameter estimates, bias for or against sub samples, and the simplicity of the procedures and their interpretive powers. Both provided convincing evidence of construct invariance, but only the Rasch procedure was able to provide firm evidence of measurement invariance, parameter stability and a lack of bias across samples.

10.
We illustrate testing measurement invariance in a second-order factor model using a quality of life dataset (n = 924). Measurement invariance was tested across 2 groups at a set of hierarchically structured levels: (a) configural invariance, (b) first-order factor loadings, (c) second-order factor loadings, (d) intercepts of measured variables, (e) intercepts of first-order factors, (f) disturbances of first-order factors, and (g) residual variances of observed variables. Given that measurement invariance at the factor loading and intercept levels was achieved, the latent factor mean difference on the higher order factor between the groups was also estimated. The analyses were performed on the mean and covariance structures within the framework of the confirmatory factor analysis using the LISREL 8.51 program. Implications of second-order factor models and measurement invariance in psychological research were discussed.

11.
As a prerequisite for meaningful comparison of latent variables across multiple populations, measurement invariance or specifically factorial invariance has often been evaluated in social science research. Alongside the changes in the model chi-square values, the comparative fit index (CFI; Bentler, 1990) is a widely used fit index for evaluating different stages of factorial invariance, including metric invariance (equal factor loadings), scalar invariance (equal intercepts), and strict invariance (equal unique factor variances). Although previous literature generally showed that the CFI performed well for single-group structural equation modeling analyses, its applicability to multiple group analyses such as factorial invariance studies has not been examined. In this study we argue that the commonly used default baseline model for the CFI might not be suitable for factorial invariance studies because (a) it is not nested within the scalar invariance model, and thus (b) the resulting CFI values might not be sensitive to the group differences in the measurement model. We therefore proposed a modified version of the CFI with an alternative (and less restrictive) baseline model that allows observed variables to be correlated. Monte Carlo simulation studies were conducted to evaluate the utility of this modified CFI across various conditions including varying degree of noninvariance and different factorial invariance models. Results showed that the modified CFI outperformed both the conventional CFI and the ΔCFI (Cheung & Rensvold, 2002) in terms of sensitivity to small and medium noninvariance.
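
For readers unfamiliar with the index, the conventional CFI and the ΔCFI between two nested invariance models can be sketched as follows. This is a minimal illustration of the standard formula, not the modified CFI the abstract proposes. The function names and all chi-square values are hypothetical, and the -0.01 cutoff is the commonly cited Cheung and Rensvold guideline rather than a result of this study.

```python
def cfi(chisq_model, df_model, chisq_baseline, df_baseline):
    """Conventional comparative fit index from chi-square statistics.

    CFI = 1 - max(chi2_M - df_M, 0) / max(chi2_M - df_M, chi2_B - df_B, 0)
    """
    # Estimated noncentrality of the target model, floored at zero.
    d_model = max(chisq_model - df_model, 0.0)
    # Denominator: the larger of the baseline and target noncentralities.
    d_base = max(chisq_baseline - df_baseline, d_model)
    if d_base == 0.0:
        return 1.0  # neither model shows any misfit
    return 1.0 - d_model / d_base


def delta_cfi(cfi_constrained, cfi_free):
    """Change in CFI when invariance constraints are added."""
    return cfi_constrained - cfi_free


# Hypothetical fit statistics, for illustration only.
baseline = (1000.0, 45)   # baseline model: chi-square, df
metric = (100.0, 50)      # equal loadings
scalar = (130.0, 56)      # equal loadings and intercepts

cfi_metric = cfi(*metric, *baseline)
cfi_scalar = cfi(*scalar, *baseline)
change = delta_cfi(cfi_scalar, cfi_metric)

# A drop larger than .01 is often read as evidence of noninvariance.
flag_noninvariance = change < -0.01
print(round(cfi_metric, 3), round(cfi_scalar, 3), flag_noninvariance)
```

Because the baseline chi-square is passed as an argument, the same function could be evaluated with a different baseline model, which is the choice the abstract argues matters.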

12.

The aim of the study is to investigate the measurement invariance of mathematics self-concept and self-efficacy across 40 countries that participated in the Programme for International Student Assessment (PISA) 2003 and 2012 cycles. The sample of the study consists of 271,760 students in PISA 2003 and 333,804 students in PISA 2012. Firstly, the traditional measurement invariance testing was applied in the multiple-group confirmatory factor analysis (MGCFA). Then, the alignment analyses were performed, keeping non-invariance to a minimum while estimating all of the parameters. Results from MGCFA indicate that mathematics self-concept and self-efficacy hold metric invariance across the 80 groups (cycle by country). The alignment method results suggest that a large proportion of non-invariance exists in both mathematics self-concept and self-efficacy factors, and the factor means cannot be compared across all participating countries. Results of the Monte Carlo simulation show that the alignment results are trustworthy. Implications and limitations are discussed, and some recommendations for future research are proposed.


13.
Confirmatory factor analytic procedures are routinely implemented to provide evidence of measurement invariance. Current lines of research focus on the accuracy of common analytic steps used in confirmatory factor analysis for invariance testing. However, the few studies that have examined this procedure have done so with perfectly or near perfectly fitting models. In the present study, the authors examined procedures for detecting simulated test structure differences across groups under model misspecification conditions. In particular, they manipulated sample size, number of factors, number of indicators per factor, percentage of a lack of invariance, and model misspecification. Model misspecification was introduced at the factor loading level. They evaluated three criteria for detection of invariance, including the chi-square difference test, the difference in comparative fit index values, and the combination of the two. Results indicate that misspecification was associated with elevated Type I error rates in measurement invariance testing.
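
The first of the three criteria, the chi-square difference test between nested models, can be sketched with the standard library alone. This assumes ordinary maximum-likelihood chi-square statistics; scaled statistics such as Satorra-Bentler need a corrected difference test. The function names and the example fit values are hypothetical and are not drawn from the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{j=1}^{(df-1)/2} y^(j-1/2) / Gamma(j+1/2)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += y ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def chisq_difference_test(chisq_constrained, df_constrained,
                          chisq_free, df_free, alpha=0.05):
    """Compare a constrained (more invariant) model with a freer nested model."""
    delta_chisq = chisq_constrained - chisq_free
    delta_df = df_constrained - df_free
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    p_value = chi2_sf(delta_chisq, delta_df)
    # A significant result suggests the added equality constraints do not hold.
    return delta_chisq, delta_df, p_value, p_value < alpha


# Hypothetical example: scalar model (130, df 56) against metric model (100, df 50).
result = chisq_difference_test(130.0, 56, 100.0, 50)
print(result)
```

A combined criterion of the kind the abstract mentions could require both a significant chi-square difference and a CFI drop beyond a chosen cutoff before declaring noninvariance.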

14.
Factorial invariance assessment is central in the development of educational and psychological assessments. Establishing invariance of factor structures is key for building a strong score and inference validity argument and assists in establishing the fairness of score use. The development of fit indices and guidelines for judging a lack of invariance is an ongoing line of research. In this study, the authors examined the performance of the root mean squared error of approximation equivalence testing approach described by Yuan and Chan in the context of measurement invariance assessment. This investigation was completed through a simulation study in which several factors were varied, including sample size, type of invariance tested, and magnitude and percent of a lack of invariance. The findings generally support the use of equivalence testing for situations in which the indicator variables were normally distributed, particularly for total sample sizes of 200 or more.

15.
With the increasing use of international survey data, especially in cross-cultural and multinational studies, establishing measurement invariance (MI) across a large number of groups in a study is essential. Testing MI over many groups is methodologically challenging, however. We identified 5 methods for MI testing across many groups (multiple group confirmatory factor analysis, multilevel confirmatory factor analysis, multilevel factor mixture modeling, Bayesian approximate MI testing, and alignment optimization) and explicated the similarities and differences of these approaches in terms of their conceptual models and statistical procedures. A Monte Carlo study was conducted to investigate the efficacy of the 5 methods in detecting measurement noninvariance across many groups using various fit criteria. Generally, the 5 methods showed reasonable performance in identifying the level of invariance if an appropriate fit criterion was used (e.g., Bayesian information criterion with multilevel factor mixture modeling). Finally, general guidelines in selecting an appropriate method are provided.

16.
A challenge in using the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) to study reading growth is that the reading skills children exhibit change with age. In order to study growth using changing subscales, it is necessary to examine measurement invariance and the measurement structure underlying the different subscales. The purpose of this paper is to examine the measurement structure of the DIBELS subscales, particularly measurement invariance. The results indicate that the DIBELS subscales do not seem to have metric invariance but do share a common factor over time, suggesting that the same construct of reading skills was measured but manifested in different ways over time.

17.
Measurement invariance of the five-factor Servant Leadership Questionnaire between female and male K-12 principals was tested using multi-group confirmatory factor analysis. A sample of 956 principals (56.9% were females and 43.1% were males) was analysed in this study. The hierarchical multi-step measurement invariance test supported the measurement invariance of the five-factor model across gender. Latent factor means were compared between females and males when measurement invariance was established. Results showed that females were significantly higher than males on emotional healing, wisdom, persuasive mapping and organisational stewardship, and they were not statistically different on altruistic calling.

18.
To date, no effective empirical method has been available to identify a truly invariant reference variable (RV) in testing measurement invariance under a multiple-group confirmatory factor analysis. This study proposes a method that, in selecting an RV, uses the smallest modification index (min-mod). The method’s performance is evaluated using 2 models: (a) a full invariance model, and (b) a partial invariance model. Results indicate that for both models the min-mod successfully identifies a truly invariant RV (Study 1). In Study 2, we use the RV found in Study 1 to further evaluate the performance of item-by-item Wald tests at locating a noninvariant variable. The results indicate that Wald tests overall performed better with an RV selected in a partial invariance model than an RV selected in a full invariance model, although in certain conditions their performances were rather similar. Implications and limitations of the study are also discussed.

19.
The purpose of this article was twofold. The first purpose was to test the validity of the Teachers’ Sense of Self-Efficacy Scale (TSES) in five settings—Canada, Cyprus, Korea, Singapore, and the United States. The second purpose was, by extension, to establish the importance of the teacher self-efficacy construct across diverse teaching conditions. Multi-group confirmatory factor analysis was used to better understand the measurement invariance of the scale across countries, after which the relationship between the TSES, its three factors, and job satisfaction was explored. The TSES showed convincing evidence of reliability and measurement invariance across the five countries, and the relationship between the TSES and job satisfaction was similar across settings. The study provides general evidence that teachers’ self-efficacy is a valid construct across culturally diverse settings and specific evidence that teachers’ self-efficacy showed a similar relationship with teachers’ job satisfaction in five contrasting settings.

20.
Confirmatory factor analytic tests of measurement invariance (MI) require a referent indicator (RI) for model identification. Although the assumption that the RI is perfectly invariant across groups is acknowledged as problematic, the literature provides relatively little guidance for researchers to identify the conditions under which the practice is appropriate. Using simulated data, this study examined the effects of RI selection on both scale- and item-level MI tests. Results indicated that while inappropriate RI selection has little effect on the accuracy of conclusions drawn from scale-level tests of metric invariance, poor RI choice can produce very misleading results for item-level tests. As a result, group comparisons under conditions of partial invariance are highly susceptible to problems associated with poor RI choice.
