Similar Documents (20 results retrieved)
1.
An interval estimation procedure for the proportion of explained observed variance in latent curve analysis is discussed, which can be used as an aid in the process of choosing between linear and nonlinear models. The method allows obtaining confidence intervals for the R² indexes associated with repeatedly followed measures in longitudinal studies. In addition to facilitating evaluation of local model fit, the approach is helpful for purposes of differentiating between plausible models stipulating different patterns of change over time, particularly in empirical situations characterized by large samples and high statistical power. The procedure is also applicable in cross-sectional studies, as well as with general structural equation models. The method is illustrated using data from a nationally representative study of older adults.

2.
The purpose of this study is to apply the attribute hierarchy method (AHM) to a subset of SAT critical reading items and illustrate how the method can be used to promote cognitive diagnostic inferences. The AHM is a psychometric procedure for classifying examinees' test item responses into a set of attribute mastery patterns associated with different components from a cognitive model. The study was conducted in two steps. In step 1, three cognitive models were developed by reviewing selected literature in reading comprehension as well as research related to SAT Critical Reading. Then, the cognitive models were validated by having a sample of students think aloud as they solved each item. In step 2, psychometric analyses were conducted on the SAT critical reading cognitive models by evaluating the model-data fit between the expected and observed response patterns produced from two random samples of 2,000 examinees who wrote the items. The model that provided the best model-data fit was then used to calculate attribute probabilities for 15 examinees to illustrate our diagnostic testing procedure.

3.
Standard 3.9 of the Standards for Educational and Psychological Testing (1999) demands evidence of model fit when item response theory (IRT) models are applied to data from tests. Hambleton and Han (2005) and Sinharay (2005) recommended the assessment of practical significance of misfit of IRT models, but few examples of such assessment can be found in the literature concerning IRT model fit. In this article, practical significance of misfit of IRT models was assessed using data from several tests that employ IRT models to report scores. The IRT model did not fit any data set considered in this article. However, the extent of practical significance of misfit varied over the data sets.

4.
Posterior predictive model checking (PPMC) is a Bayesian model checking method that compares the observed data to (plausible) future observations from the posterior predictive distribution. We propose an alternative to PPMC in the context of structural equation modeling, which we term the poor person’s PPMC (PP-PPMC), for the situation wherein one cannot afford (or is unwilling) to draw samples from the full posterior. Using only by-products of likelihood-based estimation (maximum likelihood estimate and information matrix), the PP-PPMC offers a natural method to handle parameter uncertainty in model fit assessment. In particular, a coupling relationship between the classical p values from the model fit chi-square test and the predictive p values from the PP-PPMC method is carefully examined, suggesting that PP-PPMC might offer an alternative, principled approach for model fit assessment. We also illustrate the flexibility of the PP-PPMC approach by applying it to case-influence diagnostics.
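To make the general idea concrete (this is not the article's own example), the following sketch applies a PP-PPMC-style check to a toy normal-mean model, where the maximum likelihood estimate and Fisher information are available in closed form. The data-generating values, sample size, and chi-square-type discrepancy measure are all hypothetical choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(0.3, 1.0, size=200)    # "observed" data for the toy model
n = len(y)

# Likelihood by-products for the toy model y ~ N(mu, 1):
mu_hat = y.mean()                     # maximum likelihood estimate
info = float(n)                       # Fisher information for mu

def discrepancy(data, mu):
    """Chi-square-type discrepancy between data and model."""
    return float(np.sum((data - mu) ** 2))

# PP-PPMC: instead of sampling the full posterior, approximate it by a
# normal centered at the MLE with variance equal to the inverse
# information, then run an ordinary posterior predictive check.
draws, exceed = 2000, 0
for _ in range(draws):
    mu = rng.normal(mu_hat, info ** -0.5)   # "poor person's" posterior draw
    y_rep = rng.normal(mu, 1.0, size=n)     # replicated data set
    if discrepancy(y_rep, mu) >= discrepancy(y, mu):
        exceed += 1
ppp = exceed / draws   # predictive p-value; values near 0 or 1 flag misfit
```

Because the toy model is correctly specified here, the predictive p value should not be extreme; in real use the same loop is run with the fitted structural equation model's estimates and information matrix in place of the closed-form quantities.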

5.
The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In this study, we focus on a promising method for detecting and conveying person fit for large-scale educational assessments. This method uses multilevel logistic regression (MLR) to model the slopes of the person response functions, a potential source of person misfit for IRT models. We apply the method to a representative sample of students who took the writing section of the SAT (N = 19,341). The findings suggest that the MLR approach is useful for providing supplemental evidence of model–data fit in large-scale educational test settings. MLR can be useful for detecting general misfit at global and individual levels. However, as with other model–data fit indices, the MLR approach is limited in providing information regarding only some types of person misfit.

6.
In a recent article, Castro-Schilo, Widaman, and Grimm (2013) compared different approaches for relating multitrait–multimethod (MTMM) data to external variables. Castro-Schilo et al. reported that estimated associations with external variables were in part biased when either the correlated traits–correlated uniqueness (CT-CU) or correlated traits–correlated (methods–1) [CT-C(M–1)] models were fit to data generated from the correlated traits–correlated methods (CT-CM) model, whereas the data-generating CT-CM model accurately reproduced these associations. Castro-Schilo et al. argued that the CT-CM model adequately represents the data-generating mechanism in MTMM studies, whereas the CT-CU and CT-C(M–1) models do not fully represent the MTMM structure. In this comment, we question whether the CT-CM model is more plausible as a data-generating model for MTMM data than the CT-C(M–1) model. We show that the CT-C(M–1) model can be formulated as a reparameterization of a basic MTMM true score model that leads to a meaningful and parsimonious representation of MTMM data. We advocate the use of confirmatory factor analysis MTMM models in which latent trait, method, and error variables are explicitly and constructively defined based on psychometric theory.

7.
In psychological research, available data are often insufficient to estimate item factor analysis (IFA) models using traditional estimation methods, such as maximum likelihood (ML) or limited information estimators. Bayesian estimation with common-sense, moderately informative priors can greatly improve efficiency of parameter estimates and stabilize estimation. There are a variety of methods available to evaluate model fit in a Bayesian framework; however, past work investigating Bayesian model fit assessment for IFA models has assumed flat priors, which have no advantage over ML in limited data settings. In this paper, we evaluated the impact of moderately informative priors on the ability to detect model misfit for several candidate indices: posterior predictive checks based on the observed score distribution, leave-one-out cross-validation, and the widely applicable information criterion (WAIC). We found that although Bayesian estimation with moderately informative priors is an excellent aid for estimating challenging IFA models, methods for testing model fit in these circumstances are inadequate.

8.
9.
We compare the accuracy of confidence intervals (CIs) and tests of close fit based on the root mean square error of approximation (RMSEA) with those based on the standardized root mean square residual (SRMR). Investigations used normal and nonnormal data with models ranging from p = 10 to 60 observed variables. CIs and tests of close fit based on the SRMR are generally accurate across all conditions (even at p = 60 with nonnormal data). In contrast, CIs and tests of close fit based on the RMSEA are only accurate in small models. In larger models (p ≥ 30), they incorrectly suggest that models do not fit closely, particularly if sample size is less than 500.
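The RMSEA interval referred to here is conventionally obtained by inverting the noncentral chi-square distribution in its noncentrality parameter (the Steiger–Lind approach) and converting the resulting bounds to the RMSEA scale. A minimal sketch, using hypothetical values of the test statistic T, degrees of freedom, and sample size:

```python
import numpy as np
from scipy.stats import chi2, ncx2
from scipy.optimize import brentq

def rmsea(t, df, n):
    """Point estimate: sqrt(max(T - df, 0) / (df * (n - 1)))."""
    return np.sqrt(max(t - df, 0.0) / (df * (n - 1)))

def rmsea_ci(t, df, n, level=0.90):
    """CI by inverting the noncentral chi-square CDF in lambda."""
    def cdf(lam):
        # ncx2 with lambda = 0 reduces to the central chi-square
        return chi2.cdf(t, df) if lam == 0.0 else ncx2.cdf(t, df, lam)
    def solve(p):
        # cdf decreases in lambda; if even lambda = 0 is below p, bound at 0
        if cdf(0.0) < p:
            return 0.0
        return brentq(lambda lam: cdf(lam) - p, 0.0, 10.0 * t + 100.0)
    lam_lo = solve((1 + level) / 2)     # lower noncentrality bound
    lam_hi = solve((1 - level) / 2)     # upper noncentrality bound
    conv = lambda lam: np.sqrt(lam / (df * (n - 1)))
    return conv(lam_lo), conv(lam_hi)

# Hypothetical fit result: T = 100 with df = 50 and N = 500
est = rmsea(100.0, 50, 500)
lo, hi = rmsea_ci(100.0, 50, 500)
```

The test of close fit mentioned in the abstract is the one-sided version of the same inversion: the model is rejected as not fitting closely when the entire interval lies above the chosen cutoff (e.g., .05).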

10.
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure to make the IRM and CDM frameworks comparable and investigates how estimation accuracy is affected by test diagnosticity and the match between the true and fitted models. The study shows that comparable results can be obtained when highly diagnostic IRM data are retrofitted with a CDM, and vice versa; however, retrofitting CDMs to IRM-based data can, in some conditions, result in considerable examinee misclassification, and model fit indices provide limited indication of the accuracy of item parameter estimation and attribute classification.

11.
The power of the chi-square test statistic used in structural equation modeling decreases as the absolute value of excess kurtosis of the observed data increases. Excess kurtosis is more likely the smaller the number of item response categories. As a result, fit is likely to improve as the number of item response categories decreases, regardless of the true underlying factor structure or χ2-based fit index used to examine model fit. Equivalently, given a target value of approximate fit (e.g., root mean square error of approximation ≤ .05) a model with more factors is needed to reach it as the number of categories increases. This is true regardless of whether the data are treated as continuous (common factor analysis) or as discrete (ordinal factor analysis). We recommend using a large number of response alternatives (≥ 5) to increase the power to detect incorrect substantive models.

12.
Assessing the correspondence between model predictions and observed data is a recommended procedure for justifying the application of an IRT model. However, with shorter tests, current goodness-of-fit procedures, which assume precise point estimates of ability, are inappropriate. The present paper describes a goodness-of-fit statistic that considers the imprecision with which ability is estimated and involves constructing item fit tables based on each examinee's posterior distribution of ability, given the likelihood of their response pattern and an assumed marginal ability distribution. However, the posterior expectations that are computed are dependent and the distribution of the goodness-of-fit statistic is unknown. The present paper also describes a Monte Carlo resampling procedure that can be used to assess the significance of the fit statistic and compares this method with a previously used method. The results indicate that the method described herein is an effective and reasonably simple procedure for assessing the validity of applying IRT models when ability estimates are imprecise.
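The resampling idea is general: when the reference distribution of a fit statistic is unknown, simulate its null distribution from the fitted model itself. The sketch below illustrates this outside IRT with an invented example, a parametric bootstrap of a Pearson fit statistic for a hypothetical binomial subtest model; none of the data or model choices come from the paper.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(1)

# Hypothetical data: number-correct scores on a 5-item subtest, to be
# checked against a Binomial(5, p) model with p estimated from the data
scores = rng.binomial(5, 0.6, size=300)
n = len(scores)

def pearson_x2(data):
    """Pearson X^2 against the fitted binomial, re-estimating p."""
    p_hat = data.mean() / 5
    obs = np.bincount(data, minlength=6)
    exp = n * binom.pmf(np.arange(6), 5, p_hat)
    return float(np.sum((obs - exp) ** 2 / exp)), p_hat

x2_obs, p_hat = pearson_x2(scores)

# Monte Carlo resampling: build the null distribution of the statistic
# by simulating from the fitted model, instead of assuming a known
# chi-square reference distribution
B = 1000
x2_null = np.empty(B)
for b in range(B):
    sim = rng.binomial(5, p_hat, size=n)
    x2_null[b], _ = pearson_x2(sim)     # refit within each resample
p_value = float(np.mean(x2_null >= x2_obs))
```

Refitting the model within each resample is the step that makes the reference distribution account for estimation uncertainty, which is the same concern the paper raises about imprecise ability estimates.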

13.
Wang Cheng, Sun Cuixian. Journal of Tangshan College, 2012, 25(6): 40-41, 44
Curves were fitted to a set of simulated data using MATLAB, and three mathematical models were established. Worked examples illustrate the method and steps of regression-analysis modeling, addressing the difficulties of building models from discrete data.
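The original work uses MATLAB; the following is an analogous sketch in Python, fitting three nested polynomial regression models to simulated discrete data and comparing them by residual sum of squares. The data-generating curve and noise level are invented for the illustration.

```python
import numpy as np

# Simulated discrete data with a quadratic trend plus noise
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 25)
y = 2.0 + 1.5 * x - 0.1 * x**2 + rng.normal(0, 0.5, size=x.size)

# Three candidate regression models, fit by least squares
fits = {
    "linear":    np.polyfit(x, y, 1),
    "quadratic": np.polyfit(x, y, 2),
    "cubic":     np.polyfit(x, y, 3),
}

# Compare residual sums of squares to choose among the models
rss = {name: float(np.sum((y - np.polyval(c, x)) ** 2))
       for name, c in fits.items()}
```

Since the models are nested, the residual sum of squares can only decrease with polynomial degree; in practice one would weigh that decrease against model complexity (e.g., with an adjusted R² or an information criterion) before preferring the higher-order fit.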

14.
15.
Individual person fit analyses provide important information regarding the validity of test score inferences for an individual test taker. In this study, we use data from an undergraduate statistics test (N = 1135) to illustrate a two-step method that researchers and practitioners can use to examine individual person fit. First, person fit is examined numerically with several indices based on the Rasch model (i.e., Infit, Outfit, and Between-Subset statistics). Second, person misfit is presented graphically with person response functions, and these person response functions are interpreted using a heuristic. Individual person fit analysis holds promise for improving score interpretation in that it may detect potential threats to validity of score inferences for some test takers. Individual person fit analysis may also highlight particular subsets of items (on which a test taker performs unexpectedly) that can be used to further contextualize her or his test performance.
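A minimal sketch of the Rasch-based Infit and Outfit mean-square statistics named above, computed for one Guttman-consistent responder and one aberrant responder with the same raw score. The item difficulties, ability value, and response patterns are all invented for the illustration; real analyses would use estimated parameters.

```python
import numpy as np

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def person_fit(responses, theta, b):
    """Outfit (unweighted) and Infit (information-weighted) mean squares
    for one person's dichotomous responses under the Rasch model."""
    p = rasch_p(theta, b)
    w = p * (1 - p)                   # item information at theta
    z2 = (responses - p) ** 2 / w     # squared standardized residuals
    outfit = float(z2.mean())
    infit = float(np.sum(w * z2) / np.sum(w))
    return infit, outfit

b = np.linspace(-2, 2, 20)            # hypothetical item difficulties
theta = 0.5                           # hypothetical person ability

# Consistent responder: passes items easier than theta, fails harder ones
consistent = (b < theta).astype(float)
# Aberrant responder: same raw score, but the pattern is reversed
aberrant = consistent[::-1].copy()

fit_ok = person_fit(consistent, theta, b)
fit_bad = person_fit(aberrant, theta, b)
```

Mean squares near 1 indicate model-consistent responding; the aberrant pattern inflates both statistics, with Outfit reacting most strongly to the unexpected responses on the easiest and hardest items.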

16.
A latent variable modeling procedure for examining whether a studied population could be a mixture of 2 or more latent classes is discussed. The approach can be used to evaluate a single-class model vis-à-vis competing models of increasing complexity for a given set of observed variables without making any assumptions about their within-class interrelationships. The method is helpful in the initial stages of finite mixture analyses to assess whether models with 2 or more classes should be subsequently considered as opposed to a single-class model. The discussed procedure is illustrated with a numerical example.

17.
Linear factor analysis (FA) models can be reliably tested using test statistics based on residual covariances. We show that the same statistics can be used to reliably test the fit of item response theory (IRT) models for ordinal data (under some conditions). Hence, the fit of an FA model and of an IRT model to the same data set can now be compared. When applied to a binary data set, our experience suggests that IRT and FA models yield similar fits. However, when the data are polytomous ordinal, IRT models yield a better fit because they involve a higher number of parameters. But when fit is assessed using the root mean square error of approximation (RMSEA), similar fits are obtained again. We explain why. These test statistics have little power to distinguish between FA and IRT models; they are unable to detect that linear FA is misspecified when applied to ordinal data generated under an IRT model.

18.
In this study, the authors investigated incorporating adjusted model fit information into the root mean square error of approximation (RMSEA) fit index. Through Monte Carlo simulation, the usefulness of this adjusted index was evaluated for assessing model adequacy in structural equation modeling when the multivariate normality assumption underlying maximum likelihood estimation is violated. Adjustment to the RMSEA was considered in 2 forms: a rescaling adjustment via the Satorra-Bentler rescaled goodness-of-fit statistic and a bootstrap adjustment via the Bollen and Stine adjusted model p value. Both properly specified and misspecified models were examined. The adjusted RMSEA was evaluated in terms of the average index value across study conditions and with respect to model rejection rates under tests of exact fit, close fit, and not-close fit.

19.
In this article I describe and evaluate an alternative baseline model for comparative fit assessment of structural equation models and compare it to the standard “null” baseline model. The new “equal correlation” baseline model constrains all variables to have equal, rather than zero, correlations, whereas all variances are free. The new baseline model reflects the reality of atheoretical background correlation in nonexperimental data sets, and it improves the ability of comparative fit indices to distinguish between better and worse target models. It also helps to preserve the statistical link between these indices and the noncentral χ2 distribution. Also, computing the same comparative fit indices using different baseline models will provide more information about model fit than computing multiple comparative fit indices using the same baseline. I also point out some limitations of the proposed baseline model.

20.
Two models can be nonequivalent, but fit very similarly across a wide range of data sets. These near-equivalent models, like equivalent models, should be considered rival explanations for results of a study if they represent plausible explanations for the phenomenon of interest. Prior to conducting a study, researchers should evaluate plausible models that are alternatives to those hypothesized to evaluate whether they are near-equivalent or equivalent and, in so doing, address the adequacy of the study’s methodology. To assess the extent to which alternative models for a study are empirically distinguishable, we propose 5 indexes that quantify the degree of similarity in fit between 2 models across a specified universe of data sets. These indexes compare either the maximum likelihood fit function values or the residual covariance matrices of models. Illustrations are provided to support interpretations of these similarity indexes.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号