Similar literature
Found 20 similar articles; search took 31 ms
1.
Differential item functioning (DIF) may be caused by an interaction of multiple manifest grouping variables or unexplored manifest variables, which cannot be detected by conventional DIF detection methods that are based on a single manifest grouping variable. Such DIF may be detected by a latent approach using the mixture item response theory model and subsequently explained by multiple manifest variables. This study facilitates the interpretation of latent DIF with the use of background and cognitive variables. The PISA 2009 reading assessment and student survey are analyzed. Results show that members in manifest groups were not homogeneously advantaged or disadvantaged and that a single manifest grouping variable did not suffice as a proxy for latent DIF. This study also demonstrates that DIF items arising from the interaction of multiple variables can be effectively screened by the latent DIF analysis approach. Background and cognitive variables jointly predicted latent class membership well.

2.
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM-based data with a linear attribute structure. The study utilizes a procedure to make the IRM and CDM frameworks comparable and investigates how estimation accuracy is affected by test diagnosticity and the match between the true and fitted models. The study shows that (a) comparable results can be obtained when highly diagnostic IRM data are retrofitted with a CDM, and vice versa; (b) retrofitting CDMs to IRM-based data can, in some conditions, result in considerable examinee misclassification; and (c) model fit indices provide limited indication of the accuracy of item parameter estimation and attribute classification.

3.
The hierarchical generalized linear model (HGLM) is presented as an explicit, two-level formulation of a multilevel item response model. In this paper, it is shown that the HGLM is equivalent to the Rasch model and that, characteristic of the HGLM, person ability can be expressed in the form of random effects rather than parameters. The two-level item analysis model is presented as a latent regression model with person-characteristic variables. Furthermore, it is shown that the two-level HGLM model can be extended to a three-level latent regression model that permits investigation of the variation of students' performance across groups, such as is found in classrooms and schools, and of the interactive effect of person- and group-characteristic variables.
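One common way to write the two-level formulation described above is sketched below. The abstract gives no notation, so every symbol here is an assumption: $p_{ij}$ is the probability that person $j$ answers item $i$ correctly, $X_{qij}$ is a dummy equal to 1 when $q = i$ and 0 otherwise, and item $k$ serves as the reference item.

```latex
\begin{aligned}
\text{Level 1 (items):}\quad & \log\frac{p_{ij}}{1-p_{ij}} = \beta_{0j} + \sum_{q=1}^{k-1}\beta_{qj}X_{qij},\\
\text{Level 2 (persons):}\quad & \beta_{0j} = \gamma_{00} + u_{0j},\quad u_{0j}\sim N(0,\tau),\qquad \beta_{qj} = \gamma_{q0},\\
\text{Combined, } i<k:\quad & p_{ij} = \frac{1}{1+\exp\{-[\,u_{0j} - (-\gamma_{i0}-\gamma_{00})\,]\}} .
\end{aligned}
```

Under these assumptions the combined equation has the Rasch form with ability $\theta_j = u_{0j}$, a random effect, and difficulty $\delta_i = -(\gamma_{i0} + \gamma_{00})$; for the reference item, $\delta_k = -\gamma_{00}$.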

4.
Achievement modeling is carried out in groups of students characterized by heterogeneous instructional background. Extensions of item response theory models incorporate variables reflecting different amounts of opportunity-to-learn (OTL). The effects of these OTL variables are studied with respect to their influence on both the latent trait and the item performance directly. Such direct effects may reflect instructionally sensitive items. U.S. eighth-grade mathematics data from the Second International Mathematics Study are analyzed. Here, the same test is taken by students enrolled in typical instruction and students enrolled in elementary algebra classes. It is shown that the new analysis provides a more detailed way to examine the influence of instruction on responses to test items than does conventional item response theory.

5.
When missingness is suspected to be not at random (MNAR) in longitudinal studies, researchers sometimes compare the fit of a target model that assumes missingness at random (here termed a MAR model) and a model that accommodates a hypothesized MNAR missingness mechanism (here termed an MNAR model). It is well known that such comparisons are only interpretable conditional on the validity of the chosen MNAR model's assumptions about the missingness mechanism. For that reason, researchers often perform a sensitivity analysis comparing the MAR model to not one, but several, plausible alternative MNAR models. In the social sciences, it is not widely known that such model comparisons can be particularly sensitive to case influence, such that conclusions drawn could depend on a single case. This article describes two convenient diagnostics suited for detecting case influence on MAR–MNAR model comparisons. Both diagnostics impose much less computational burden than global influence diagnostics that have been used in other disciplines for MNAR sensitivity analyses. We illustrate the interpretation and implementation of these diagnostics with simulated and empirical latent growth modeling examples. It is hoped that this article increases awareness of the potential for case influence on MAR–MNAR model comparisons and how it could be detected in longitudinal social science applications.

6.
The assessment of differential item functioning (DIF) is routinely conducted to ensure test fairness and validity. Although many DIF assessment methods have been developed in the context of classical test theory and item response theory, they are not applicable to cognitive diagnosis models (CDMs), as the underlying latent attributes of CDMs are multidimensional and binary. This study proposes a very general DIF assessment method in the CDM framework which is applicable to various CDMs, more than two groups of examinees, and multiple grouping variables that are categorical, continuous, observed, or latent. The parameters can be estimated with Markov chain Monte Carlo algorithms implemented in the freeware WinBUGS. Simulation results demonstrated good parameter recovery and advantages in DIF assessment for the new method over the Wald method.

7.
This article presents a methodology for examining the content and nature of item parcels as indicators of a conceptually defined latent construct. An essential component of this methodology is the 2-facet measurement model, which includes items and parcels as facets of construct indicators. The 2-facet model tests assumptions required for accepting parcels as aggregates of item covariation in representing the latent construct. According to this methodology, parcels are acceptable indicators of the latent construct if the 2-facet model meets parametric assumptions for unidimensionality and if items and parcels have content validity as measures of the latent construct. The proposed methodology is illustrated using a 1-factor model of the Worry construct in the test anxiety measurement tradition.

8.
Using a sample of schools testing annually in grades 9–11 with a vertically linked series of assessments, a latent growth curve model is used to model test scores with student intercepts and slopes nested within school. Missed assessments can occur because of student mobility, student dropout, absenteeism, and other reasons. Missing data indicators are modeled using logistic regression, with grade 9 and potentially unobserved growth scores used as covariates. Under a hierarchical selection model, estimates of school effects on academic growth and missingness are obtained. The results from the selection model are compared to a model that ignores the missing data process.

9.
Latent class models of decision-making processes related to multiple-choice test items are extremely important and useful in mental test theory. However, building realistic models or studying the robustness of existing models is very difficult. One problem is that there are a limited number of empirical studies that address this issue. The purpose of this paper is to describe and illustrate how latent class models, in conjunction with the answer-until-correct format, can be used to examine the strategies used by examinees for a specific type of task. In particular, suppose an examinee responds to a multiple-choice test item designed to measure spatial ability, and the examinee gets the item wrong. This paper empirically investigates various latent class models of the strategies that might be used to arrive at an incorrect response. The simplest model is a random guessing model, but the results reported here strongly suggest that this model is unsatisfactory. Models for the second attempt of an item, under an answer-until-correct scoring procedure, are proposed and found to give a good fit to data in most situations. Some results on strategies used to arrive at the first choice are also discussed.

10.
The latent class reliability coefficient (LCRC) is improved by using the divisive latent class model instead of the unrestricted latent class model. This results in the divisive latent class reliability coefficient (DLCRC), which unlike LCRC avoids making subjective decisions about the best solution and thus avoids judgment error. A computational study using large numbers of items shows that DLCRC also is faster than LCRC and fast enough for practical purposes. Speed and objectivity render DLCRC superior to LCRC. A decisive feature of DLCRC is that it aims at closely approximating the multivariate distribution of item scores, which might render the method suitable when test data are multidimensional. A simulation study focusing on multidimensionality shows that DLCRC in general has little bias relative to the true reliability and is relatively accurate compared to LCRC and the classical lower-bound methods (coefficients α and λ2 and the greatest lower bound).

11.
The relations between the latent variables in structural equation models are typically assumed to be linear in form. This article aims to explain how a specification error test using instrumental variables (IVs) can be employed to detect unmodeled interactions between latent variables or quadratic effects of latent variables. An empirical example is presented, and the results of a simulation study are reported to evaluate the sensitivity and specificity of the test and compare it with the commonly employed chi-square model test. The results show that the proposed test can identify most unmodeled latent interactions or latent quadratic effects in moderate to large samples. Furthermore, its power is higher when the number of indicators used to define the latent variables is large. Altogether, this article shows how the IV-based test can be applied to structural equation models and that it is a valuable tool for researchers using structural equation models.

12.
This article examines whether Bayesian estimation with minimally informative prior distributions can alleviate the estimation problems often encountered with fitting the true score multitrait–multimethod structural equation model with split-ballot data. In particular, the true score multitrait–multimethod structural equation model encounters empirical underidentification when (a) latent variable correlations are homogeneous, and (b) the model is fitted to data from a 2-group split-ballot design, an understudied case of empirical underidentification due to a planned missingness (i.e., split-ballot) design. A Monte Carlo simulation and 3 empirical examples showed that Bayesian estimation performs better than maximum likelihood (ML) estimation. Therefore, we suggest using Bayesian estimation with minimally informative prior distributions when estimating the true score multitrait–multimethod structural equation model with split-ballot data. Furthermore, given the increase in planned missingness designs in psychological research, we also suggest using Bayesian estimation as a potential alternative to ML estimation for analyses using data from planned missingness designs.

13.
The reading data from the 1983–84 National Assessment of Educational Progress survey were scaled using a unidimensional item response theory model. To determine whether the responses to the reading items were consistent with unidimensionality, the full-information factor analysis method developed by Bock and associates (1985) and Rosenbaum's (1984) test of unidimensionality, conditional (local) independence, and monotonicity were applied. Full-information factor analysis involves the assumption of a particular item response function; the number of latent variables required to obtain a reasonable fit to the data is then determined. The Rosenbaum method provides a test of the more general hypothesis that the data can be represented by a model characterized by unidimensionality, conditional independence, and monotonicity. Results of both methods indicated that the reading items could be regarded as measures of a single dimension. Simulation studies were conducted to investigate the impact of balanced incomplete block (BIB) spiraling, used in NAEP to assign items to students, on methods of dimensionality assessment. In general, conclusions about dimensionality were the same for BIB-spiraled data as for complete data.

14.
We consider a general type of model for analyzing ordinal variables with covariate effects and 2 approaches for analyzing data for such models, the item response theory (IRT) approach and the PRELIS-LISREL (PLA) approach. We compare these 2 approaches on the basis of 2 examples, 1 involving only covariate effects directly on the ordinal variables and 1 involving covariate effects on the latent variables in addition.

15.
In structural equation modeling software, either limited-information (bivariate proportions) or full-information item parameter estimation routines could be used for the 2-parameter item response theory (IRT) model. Limited-information methods assume the continuous variable underlying an item response is normally distributed. For skewed and platykurtic latent variable distributions, 3 methods were compared in Mplus: limited information, full information integrating over a normal distribution, and full information integrating over the known underlying distribution. Interfactor correlation estimates were similar for all 3 estimation methods. For the platykurtic distribution, estimation method made little difference for the item parameter estimates. When the latent variable was negatively skewed, for the most discriminating easy or difficult items, limited-information estimates of both parameters were considerably biased. Full-information estimates obtained by marginalizing over a normal distribution were somewhat biased. Full-information estimates obtained by integrating over the true latent distribution were essentially unbiased. For the a parameters, standard errors were larger for the limited-information estimates when the bias was positive but smaller when the bias was negative. For the d parameters, standard errors were larger for the limited-information estimates of the easiest, most discriminating items. Otherwise, they were generally similar for the limited- and full-information estimates. Sample size did not substantially impact the differences between the estimation methods; limited information did not gain an advantage for smaller samples.
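To make "integrating over a latent distribution" concrete, the sketch below computes a marginal response probability for a 2-parameter item by simple grid quadrature. It is an illustration only, not the routine used in Mplus: the slope-intercept form `a * theta + d`, the function names, and the grid settings are all assumptions, and the abstract does not define its a and d parameters.

```python
import math


def two_pl_prob(theta, a, d):
    """P(correct) for a 2PL item in an assumed slope-intercept form: logit = a*theta + d."""
    return 1.0 / (1.0 + math.exp(-(a * theta + d)))


def standard_normal_density(theta):
    """Density of N(0, 1), used here as the assumed latent distribution."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)


def marginal_prob(a, d, density, lo=-6.0, hi=6.0, n=601):
    """Approximate the integral of P(correct | theta) * density(theta) on an even grid.

    Weights are normalized to sum to 1, so the density need not be normalized.
    """
    step = (hi - lo) / (n - 1)
    nodes = [lo + i * step for i in range(n)]
    weights = [density(t) for t in nodes]
    total = sum(weights)
    return sum(w * two_pl_prob(t, a, d) for t, w in zip(nodes, weights)) / total


if __name__ == "__main__":
    # Hypothetical item: slope 1.5, intercept 0.0
    print(marginal_prob(1.5, 0.0, standard_normal_density))
```

Passing a different `density` function (for example, a skewed one) changes the marginal probability for the same item parameters, which is the kind of mismatch between assumed and true latent distributions the abstract discusses.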

16.
Item response theory (IRT) procedures have been used extensively to study normal latent trait distributions and have been shown to perform well; however, less is known concerning the performance of IRT with non-normal latent trait distributions. This study investigated the degree of latent trait estimation error under normal and non-normal conditions using four latent trait estimation procedures and also evaluated whether the test composition, in terms of item difficulty level, reduces estimation error. Most importantly, both true and estimated item parameters were examined to disentangle the effects of latent trait estimation error from item parameter estimation error. Results revealed that non-normal latent trait distributions produced a considerably larger degree of latent trait estimation error than normal data. Estimated item parameters tended to have comparable precision to true item parameters, thus suggesting that increased latent trait estimation error results from latent trait estimation rather than item parameter estimation.

17.
Multiple-choice reading comprehension items from a conventional, norm-referenced reading comprehension test are successfully analyzed using a simple latent class model. A classification rule for assigning respondents to "mastery" or "nonmastery" states is presented which simplifies the scoring procedure of Macready and Dayton (1977). A procedure is also derived for estimating the "true," or "disattenuated," latent cross-classification of masters versus nonmasters for two tests, and illustrated using two sets of items from the same content domain. Results support the use of latent class, state mastery models with more heterogeneous item pools than has been advocated by previous authors.

18.
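As a criterion-referenced proficiency examination, the Public English Test System (PETS) must ensure that its standards remain uniform and stable across test administrations. This article describes the strategies PETS has adopted to keep its standards stable: selecting the theoretical model on which the test design is based, safeguarding the quality of item writing, strengthening test equating, and controlling scoring error on subjectively scored items. These strategies have effectively ensured that PETS is scientifically sound and standardized.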

19.
To better understand the statistical properties of the deterministic inputs, noisy "and" gate (DINA) cognitive diagnosis model, the impact of several factors on the quality of the item parameter estimates and classification accuracy was investigated. Results of the simulation study indicate that the fully Bayes approach is most accurate when the prior distribution matches the latent class structure. However, when the latent classes are of indefinite structure, the empirical Bayes method in conjunction with an unstructured prior distribution provides much better estimates and classification accuracy. Moreover, using empirical Bayes with an unstructured prior does not lead to extremely poor results as other prior-estimation method combinations do. The simulation results also show that increasing the sample size reduces the variability, and to some extent the bias, of item parameter estimates, whereas lower levels of the guessing and slip parameters are associated with higher quality item parameter estimation and classification accuracy.
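For readers unfamiliar with the guessing and slip parameters mentioned above, the following minimal sketch shows the standard DINA item response function for one examinee and one item. The function name and argument layout are hypothetical; the abstract itself gives no notation or code.

```python
def dina_prob(alpha, q_row, guess, slip):
    """P(correct response) under the DINA model.

    alpha  -- examinee's attribute mastery pattern, a sequence of 0/1 values
    q_row  -- the item's row of the Q-matrix (1 = attribute required)
    guess  -- probability of a correct answer without all required attributes
    slip   -- probability of an incorrect answer despite having all of them
    """
    # "And" gate: the ideal response is 1 only if every required attribute is mastered.
    has_all_required = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if has_all_required else guess


if __name__ == "__main__":
    # Hypothetical item requiring attributes 1 and 2 out of 3.
    print(dina_prob([1, 1, 0], [1, 1, 0], guess=0.2, slip=0.1))
    print(dina_prob([1, 0, 0], [1, 1, 0], guess=0.2, slip=0.1))
```

With small `guess` and `slip` values the two response probabilities are far apart, so responses discriminate sharply between masters and nonmasters, which is consistent with the abstract's finding that lower guessing and slip levels go with better estimation and classification.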

20.
This study discusses a procedure for testing the equivalence among different item response formats used in personality and attitude measurement. The procedure is based on the assumption that latent response variables underlie the observed item responses (underlying variables approach) and uses a nested series of confirmatory factor analysis models derived from Jöreskog's (1971) method for estimating the disattenuated correlation. The different stages of the procedure are illustrated using real data.
