Similar literature
 20 similar documents found (search time: 15 ms)
1.
The psychometric literature provides little empirical evaluation of examinee test data to assess essential psychometric properties of innovative items. In this study, examinee responses to conventional (e.g., multiple-choice) and innovative item formats in a computer-based testing program were analyzed for IRT information with the three-parameter and graded response models. The innovative item types considered in this study provided more information across all levels of ability than multiple-choice items. In addition, accurate timing data captured via computer administration were analyzed to consider the relative efficiency of the multiple-choice and innovative item types. As with previous research, multiple-choice items provide more information per unit time. Implications for balancing policy, psychometric, and pragmatic factors in selecting item formats are also discussed.
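As a rough illustration of the two quantities the abstract above compares (item information and information per unit time), here is a minimal sketch of the item information function for the three-parameter logistic model. The function names, the logistic metric with scaling constant D = 1, and the use of a mean response time in seconds are assumptions made for illustration; the study's own estimation procedure is not described in the abstract.

```python
import math


def prob_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model (logistic metric, D = 1)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def info_3pl(theta, a, b, c):
    """Fisher information of a single 3PL item at ability level theta."""
    p = prob_3pl(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def info_per_second(theta, a, b, c, mean_seconds):
    """Hypothetical efficiency index: item information divided by mean response time."""
    return info_3pl(theta, a, b, c) / mean_seconds
```

With no guessing (c = 0) the expression reduces to the familiar two-parameter form a²PQ, so a nonzero guessing parameter can only lower the information an item provides at a given ability level.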

2.
Measuring academic growth, or change in aptitude, relies on longitudinal data collected across multiple measurements. The National Educational Longitudinal Study (NELS:88) is among the earliest large-scale educational surveys tracking students’ performance on cognitive batteries over 3 years. Notable features of the NELS:88 data set, and of almost all repeated measures educational assessments, are (a) the outcome variables are binary or at least categorical in nature; and (b) a set of different items is given at each measurement occasion with a few anchor items to fix the measurement scale. This study focuses on the challenges related to specifying and fitting a second-order longitudinal model for binary outcomes, within both the item response theory and structural equation modeling frameworks. The distinctions and commonalities between these two frameworks are discussed. A real data analysis using the NELS:88 data set is presented for illustration purposes.

3.
This study examined a psychosocial mechanism of how general self-efficacy interacts with other key factors and influences degree aspiration for students enrolled in an urban diverse community college. Using general self-efficacy scales, the authors hypothesized the General Self-efficacy model for Community College students (the GSE-CC model). A confirmatory factor analysis was used to establish a measurement model in which three general self-efficacies were confirmed along with other latent factors (e.g., social capital, transfer capital, etc.). The GSE-CC model was then tested and finalized via structural equation modeling (SEM) techniques. The results showed that general self-efficacy significantly impacted the degree aspiration both directly and indirectly. In addition, general self-efficacy may serve as a bridge between social capital and transfer capital for community college students. Based on the findings, community college practitioners can draw practical implications for promoting positive general self-efficacy among students. Further studies are encouraged to adopt or modify the GSE-CC model and test it across different student groups.

4.
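To compare the roles of structural equation modeling and the IRT graded response model in item screening for personality scales, this study drew on 7,229 actual responses to the Chinese College Student Personality Scale (《中国大学生人格量表》). For Factor 2, "Straightforwardness" (爽直), parameters were estimated and fit was assessed with LISREL 8.70 for the structural equation model and MULTILOG 7.03 for the graded response model, and the item-screening results of the two methods were compared. Both sets of results indicated that items 5, 6, 7, and 8 fit poorly. In the structural equation model, this appeared as low factor loadings and unsatisfactory overall fit indices; in the graded response model, it appeared as unsatisfactory discrimination and location parameters and as poorly shaped item characteristic and information curves for those items. However, the structural equation model tended to rate items 6 and 8 as worse, whereas the graded response model tended to rate items 5 and 6 as worse. Overall, the statistical inferences that the structural equation model and the IRT graded response model support about personality scale items are consistent, with slight differences on individual items. Each method has its own strengths, and the two can be used in combination.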

5.
In many intervention and evaluation studies, outcome variables are assessed using a multimethod approach comparing multiple groups over time. In this article, we show how evaluation data obtained from a complex multitrait–multimethod–multioccasion–multigroup design can be analyzed with structural equation models. In particular, we show how the structural equation modeling approach can be used to (a) handle ordinal items as indicators, (b) test measurement invariance, and (c) test the means of the latent variables to examine treatment effects. We present an application to data from an evaluation study of an early childhood prevention program. A total of 659 children in intervention and control groups were rated by their parents and teachers on prosocial behavior and relational aggression before and after the program implementation. No mean change in relational aggression was found in either group, whereas an increase in prosocial behavior was found in both groups. Advantages and limitations of the proposed approach are highlighted.

6.
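This study examines how item format affects measurement. The results show that, when item stems are equivalent, fill-in-the-blank items are generally more difficult than multiple-choice (single-best-answer) items. The two formats do not differ significantly in discrimination, although with well-chosen options the multiple-choice format may discriminate better than the fill-in-the-blank format. The two formats also differ little in the dimensionality of the ability they measure. For lower-ability examinees, however, the multiple-choice format measures relatively better, whereas for higher-ability examinees the fill-in-the-blank format measures better.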

7.
The sample invariance of item discrimination statistics is evaluated in this case study using real data. The hypothesized superiority of the item response model (IRM) is tested against structural equation modeling (SEM) for responses to the Center for Epidemiologic Studies-Depression (CES-D) scale. Responses from 10 random samples of 500 people were drawn from a base sample of 6,621 participants across gender, age, and different health groups. Hierarchical tests of multiple-group structural equation models indicated statistically significant differences exist in item regressions across contrast groups. Although the IRM item discrimination estimates were most stable in all conditions of this case study, additional research on the precision of individual scores and possible item bias is required to support the validity of either model for scoring the CES-D. The SEM approach to examining between-group differences holds promise for any field where heterogeneous populations are assessed and important consequences arise from score interpretations.

8.
Both structural equation modeling (SEM) and item response theory (IRT) can be used for factor analysis of dichotomous item responses. In this case, the measurement models of both approaches are formally equivalent. They were refined within and across different disciplines, and make complementary contributions to central measurement problems encountered in almost all empirical social science research fields. In this article (a) fundamental formal similarities between IRT and SEM models are pointed out. It will be demonstrated how both types of models can be used in combination to analyze (b) the dimensional structure and (c) the measurement invariance of survey item responses. All analyses are conducted with Mplus, which allows an integrated application of both approaches in a unified, general latent variable modeling framework. The aim is to promote a diffusion of useful measurement techniques and skills from different disciplines into empirical social research.
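The formal equivalence mentioned in the abstract above can be made concrete with the well-known mapping between a one-factor model for dichotomous items and the two-parameter normal-ogive IRT model. The sketch below assumes a standardized latent factor and a latent response variable with unit total variance; the function names are hypothetical, and this is not the Mplus procedure used in the article. Moving to the logistic metric would additionally require a scaling constant (commonly about 1.7), which is omitted here.

```python
import math


def sem_to_irt(loading, threshold):
    """Convert a standardized loading and threshold to normal-ogive IRT parameters.

    Returns (a, b): discrimination and difficulty.
    """
    if not 0.0 < abs(loading) < 1.0:
        raise ValueError("standardized loading must be nonzero and below 1 in magnitude")
    a = loading / math.sqrt(1.0 - loading ** 2)
    b = threshold / loading
    return a, b


def irt_to_sem(a, b):
    """Inverse mapping: normal-ogive (a, b) back to a standardized loading and threshold."""
    loading = a / math.sqrt(1.0 + a ** 2)
    threshold = b * loading
    return loading, threshold
```

For example, a loading of 0.6 with a threshold of 0.3 corresponds to a discrimination of 0.75 and a difficulty of 0.5 under these assumptions, and converting back recovers the original pair.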

9.
A great obstacle for wider use of structural equation modeling (SEM) has been the difficulty in handling categorical variables. Two data sets with known structure between 2 related binary outcomes and 4 independent binary variables were generated. Four SEM strategies and resulting apparent validity were tested: robust maximum likelihood (ML), tetrachoric correlation matrix input followed by SEM ML analysis, SEM ML estimation for the sum of squares and cross-products (SSCP) matrix input obtained by the log-linear model that treated all variables as dependent, and asymptotic distribution-free (ADF) SEM estimation. SEM based on the SSCP matrix obtained by the log-linear model and SEM using robust ML estimation correctly identified the structural relation between the variables. SEM using ADF added an extra parameter. SEM based on tetrachoric correlation input did not specify the data generating process correctly. Apparent validity was similar for all models presented. Data transformation used in log-linear modeling can serve as an input for SEM.

10.
The assumption of conditional independence between the responses and the response times (RTs) for a given person is common in RT modeling. However, when the speed of a test taker is not constant, this assumption will be violated. In this article we propose a conditional joint model for item responses and RTs, which incorporates a covariance structure to explain the local dependency between speed and accuracy. To obtain information about the population of test takers, the new model was embedded in the hierarchical framework proposed by van der Linden (2007). A fully Bayesian approach using a straightforward Markov chain Monte Carlo (MCMC) sampler was developed to estimate all parameters in the model. The deviance information criterion (DIC) and the Bayes factor (BF) were employed to compare the goodness of fit between the models with two different parameter structures. The Bayesian residual analysis method was also employed to evaluate the fit of the RT model. Based on the simulations, we conclude that (1) the new model noticeably improves the parameter recovery for both the item parameters and the examinees’ latent traits when the assumptions of conditional independence between the item responses and the RTs are relaxed and (2) the proposed MCMC sampler adequately estimates the model parameters. The applicability of our approach is illustrated with an empirical example, and the model fit indices indicated a preference for the new model.

11.
Structural equation models are typically evaluated on the basis of goodness-of-fit indexes. Despite the popularity of these indexes, the values they should attain to confidently decide between the acceptance and rejection of a model have been greatly debated. A recently proposed approach by means of equivalence testing has been recommended as a superior way to evaluate the goodness of fit of models. The approach has also been proposed as providing a necessary vehicle that can be used to advance the inferential nature of structural equation modeling as a confirmatory tool. The purpose of this article is to introduce readers to key ideas in equivalence testing and illustrate its use for conducting model–data fit assessments. Two confirmatory factor analysis models, in which a priori specified latent variable models with known structure are tested against data, are used as examples. It is advocated that whenever the goodness of fit of a model is to be assessed, researchers should always examine the resulting values obtained via the equivalence testing approach.

12.
Structural equation models with interaction and quadratic effects have become a standard tool for testing nonlinear hypotheses in the social sciences. Most of the current approaches assume normally distributed latent predictor variables. In this article, we describe a nonlinear structural equation mixture approach that integrates the strength of parametric approaches (specification of the nonlinear functional relationship) and the flexibility of semiparametric structural equation mixture approaches for approximating the nonnormality of latent predictor variables. In a comparative simulation study, the advantages of the proposed mixture procedure over contemporary approaches [Latent Moderated Structural Equations approach (LMS) and the extended unconstrained approach] are shown for varying degrees of skewness of the latent predictor variables. Whereas the conventional approaches show either biased parameter estimates or standard errors of the nonlinear effects, the proposed mixture approach provides unbiased estimates and standard errors. We present an empirical example from educational research. Guidelines for applications of the approaches and limitations are discussed.

13.
Multilevel structural equation models are most often estimated from a frequentist framework via maximum likelihood. However, as shown in this article, frequentist results are not always accurate. Alternatively, one can apply a Bayesian approach using Markov chain Monte Carlo estimation methods. This simulation study compared estimation quality using Bayesian and frequentist approaches in the context of a multilevel latent covariate model. Continuous and dichotomous variables were examined because it is not yet known how different types of outcomes (most notably categorical) affect parameter recovery in this modeling context. Within the Bayesian estimation framework, the impact of diffuse, weakly informative, and informative prior distributions was compared. Findings indicated that Bayesian estimation may be used to overcome convergence problems and improve parameter estimate bias. Results highlight the differences in estimation quality between dichotomous and continuous variable models and the importance of prior distribution choice for cluster-level random effects.

14.
Computerized adaptive testing offers the possibility of gaining information on both the overall ability and cognitive profile in a single assessment administration. Some algorithms aiming for these dual purposes have been proposed, including the shadow test approach, the dual information method (DIM), and the constraint weighted method. The current study proposed two new methods, aggregate ranked information index (ARI) and aggregate standardized information index (ASI), which appropriately addressed the noncompatibility issue inherent in the original DIM method. More flexible weighting schemes that put different emphasis on information about general ability (i.e., in item response theory) and information about cognitive profile (i.e., in cognitive diagnostic modeling) were also explored. Two simulation studies were carried out to investigate the effectiveness of the new methods and weighting schemes. Results showed that the new methods with the flexible weighting schemes could produce more accurate estimation of both overall ability and cognitive profile than the original DIM. Among them, the ASI with both empirical and theoretical weights is recommended, and an attribute-level weighting scheme is preferred if some attributes are considered more important from a substantive perspective.

15.
This article presents a didactic discussion of a multilevel covariance structure modeling approach to estimation of lowest level mediation effect indexes in two-level studies. The procedure is useful when addressing questions about relations among total and indirect effects between variables of interest while accounting for the hierarchical structure of analyzed data. The discussed method also permits interval estimation and hypothesis tests with respect to related quantities of relevance when evaluating mediated effects with clustered data, and is illustrated on a two-level data set.

16.
A structural equation modeling method for examining time-invariance of variable specificity in longitudinal studies with multiple measures is outlined, which is developed within a confirmatory factor-analytic framework. The approach represents a likelihood ratio test for the hypothesis of stability in the specificity part of the residual term associated with repeated administration of each measure. The procedure can be used in the search for parsimonious versions of multiwave multiple-indicator models, to test for variable specificity in them, and to examine assumptions underlying particular parameter estimation procedures in repeated measure designs. The outlined method is illustrated with empirical data.

17.
How has Item Response Theory helped solve problems in the development and use of computer-adaptive tests? Do we need to balance item content with computer-adaptive tests? Could we use IRT to evaluate unusual responses to computer-delivered tests?

18.
Cluster sampling results in response variable variation both among respondents (i.e., within-cluster or Level 1) and among clusters (i.e., between-cluster or Level 2). Properly modeling within- and between-cluster variation could be of substantive interest in numerous settings, but applied researchers typically test only within-cluster (i.e., individual difference) theories. Specifying a between-cluster model in the absence of theory requires a specification search in multilevel structural equation modeling. This study examined a variety of within-cluster and between-cluster sample sizes, intraclass correlation coefficients, start models, parameter addition and deletion methods, and Type I error control techniques to identify which combination of start model, parameter addition or deletion method, and Type I error control technique best recovered the population between-cluster model. Results indicated that a “saturated” start model, univariate parameter deletion technique, and no Type I error control performed best, but recovered the population between-cluster model in fewer than 1 in 5 attempts at the largest sample sizes. The accuracy of specification search methods, suggestions for applied researchers, and future research directions are discussed.

19.
This study explored the occurrence of self-conscious emotions in response to perceived academic failure among 4th-grade students from the United States and Bulgaria, and the author investigated potential contributors to such negative emotional experiences. Results from structural equation modeling indicated that regardless of country, negative affectivity, as an individual predisposition to experience highly negative emotions, predicted self-conscious emotions toward academic failure. However, culture appeared to condition the relative importance of some family process variables in children's experiences of self-conscious emotions. Bulgarian children's emotional experiences were amplified by the negative valence of their parents’ evaluative feedback in the aftermath of academic failure. In contrast, U.S. children's perceptions of failure appeared to be less influenced by their parents’ judgments. The findings of the study are interpreted in the light of cultural differences.

20.
The relation among fit indexes, power, and sample size in structural equation modeling is examined. The noncentrality parameter is required to compute power. The 2 existing methods of computing power have estimated the noncentrality parameter by specifying an alternative hypothesis or alternative fit. These methods cannot be implemented easily and reliably. In this study, 4 fit indexes (RMSEA, CFI, McDonald's Fit Index, and Steiger's gamma) were used to compute the noncentrality parameter and sample size to achieve a certain level of power. The resulting power and sample size varied as a function of (a) choice of fit index, (b) number of variables/degrees of freedom, (c) relation among the variables, and (d) value of the fit index. However, if the level of misspecification were held constant, then the resulting power and sample size would be identical.
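To illustrate the link between a fit index and the noncentrality parameter described in the abstract above, the following minimal sketch uses RMSEA only. It assumes the common definition in which the squared RMSEA equals the noncentrality parameter divided by df × (N − 1); some software uses N instead of N − 1. The function names are hypothetical, and the study's own procedure for the other three indexes is not reproduced here. Turning the noncentrality parameter into power would further require the noncentral chi-square distribution, which is left out to keep the example dependency-free.

```python
def ncp_from_rmsea(rmsea, df, n):
    """Noncentrality parameter implied by an RMSEA value, degrees of freedom, and sample size."""
    return (n - 1) * df * rmsea ** 2


def n_for_ncp(ncp, rmsea, df):
    """Sample size at which the given RMSEA and degrees of freedom imply the target noncentrality."""
    return ncp / (df * rmsea ** 2) + 1
```

For instance, an RMSEA of 0.05 with 40 degrees of freedom and a sample of 201 implies a noncentrality parameter of about 20 under this convention, and the second function inverts that relation.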
