Found 20 similar documents (search time: 109 ms)
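With polytomous scoring increasingly used in psychology and education, methods for detecting differential item functioning (DIF) face new challenges. Previous research has shown that, among DIF detection methods, MIMIC is economical and effective, yet no study has systematically analyzed the effectiveness of the MIMIC method for polytomously scored items. Using Monte Carlo experiments, this study examines five factors: the sample sizes of the reference and focal groups, DIF type, item discrimination, between-group ability differences, and the number of DIF items among the anchor items. It analyzes the Type I error rate and power of the MIMIC method across combinations of the levels of these factors. The findings are: 1) MIMIC detects uniform DIF sensitively, and it still controls the Type I error rate well even when the focal group's sample is small or markedly smaller than the reference group's; 2) a purification step is necessary for MIMIC to control the Type I error rate and improve power, although MIMIC tolerates a certain degree of contamination; 3) power is severely reduced by low discrimination, but excessively high discrimination increases the Type I error rate; 4) the power of MIMIC for detecting uniform DIF increases with sample size.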

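Detecting differential item functioning (DIF) in cognitive diagnosis models (CDMs) aims to ensure the quality and effectiveness of a test. Building on previous research, this study explores, within the CDM framework, how six DIF detection methods (MH, LR, CSIBTEST, WObs, WSw, and WXPD) perform depending on whether the Q-matrix is correctly specified and on other factors that affect DIF. The results show that when the Q-matrix is correctly specified, the WObs, WSw, and WXPD statistics outperform the MH, LR, and CSIBTEST methods; when the Q-matrix is misspecified, all six methods show inflated Type I error rates and low statistical power. Relatively speaking, MH, LR, and CSIBTEST perform more stably while WObs, WSw, and WXPD vary more, yet the Type I error rates and statistical power of WObs, WSw, and WXPD remain better than those of MH, LR, and CSIBTEST.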

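Determining sample size is an important part of research design because it directly affects statistical power and statistical conclusion validity. Because researchers in foreign language teaching do not fully appreciate the importance of determining an appropriate sample size, this article uses analysis of variance as an example to introduce the three parameters that determine the required sample size: the Type I error rate, statistical power, and effect size (or noncentrality parameter). It also explores the relationships among these parameters and their effects on sample size, and further discusses common means of estimating the population effect size and common methods of calculating the required sample size.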

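A statistical test can commit two types of error: Type I and Type II. Taking the test of a single mean as an example, the article analyzes how these errors arise and how they are calculated. It argues that the Type I error arises from the principle of practical inference used in testing, and that the Type II error arises from a logical fallacy in testing. The probability of a Type I error is the significance level α, that is, the probability that a small-probability event occurs. The calculation of the Type II error is the focus of the discussion, and it is also where the article's treatment differs from current methods.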
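
To make the two error probabilities concrete, here is a minimal sketch of the standard textbook calculation for a two-sided one-sample z test with known standard deviation. It is not the article's own method (the abstract says its Type II calculation differs from the usual one); the function name `type_ii_error` and its parameters are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """Textbook Type II error (beta) of a two-sided one-sample z test.

    Assumptions (not from the article): sigma is known, the data are
    normal, H0 is mu = mu0, and the true mean is mu1.
    """
    std_normal = NormalDist()
    # Critical value: reject H0 when |z| exceeds z_crit.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, the z statistic is shifted by this amount.
    shift = (mu1 - mu0) * sqrt(n) / sigma
    # beta = P(-z_crit < Z + shift < z_crit), the chance of not rejecting.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


if __name__ == "__main__":
    for n in (10, 25, 100):
        print(n, round(type_ii_error(mu0=0.0, mu1=0.5, sigma=1.0, n=n), 4))
```

When `mu1` equals `mu0` the function returns 1 - α, the probability of correctly retaining a true null hypothesis; for a fixed nonzero difference, β shrinks as `n` grows.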

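In the United States, testing companies use a variety of statistical methods to detect cheating on tests. This article studies two cheating-detection indices: the g2 index, based on classical test theory, and the w index, based on item response theory. It simulates four copying patterns common in real testing situations, along with several variables that may affect the indices. The results show that, for both the g2 and w indices, the Type I error rates computed from biased parameter estimates and from true parameters are similar and low in all situations. Computing the g2 and w indices from biased parameter estimates therefore does not increase the likelihood of mistaking the examinee who was copied from for the copier. With biased parameter estimates, however, the g2 and w indices can achieve a low Type II error rate only when the percentage of copied items is high and the test is long. When the percentage of copied items is low, both indices yield high Type II error rates even when true parameters are used.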

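Integral equations are divided into equations of the first kind and of the second kind. Integral equations of the first kind are ill-posed and are generally solved with Tikhonov regularization or the Backus-Gilbert method, whereas the method of moments is applicable not only to integral equations of the second kind but also to those of the first kind. In addition, the method of moments is used to solve a first-kind integral equation with a singular kernel, and a numerical example is given.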

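Application of normality tests in teaching research   Total citations: 1 (self-citations: 0, citations by others: 1)
Many researchers currently choose a normality test merely by habit or preference. In response, this article briefly discusses five commonly used normality tests: the Jarque-Bera, Shapiro-Wilk, D'Agostino, Kolmogorov-Smirnov, and Lilliefors tests. It uses the Monte Carlo method to compare the power or Type I error rate of the five tests under different distributions and sample sizes. Then, with reference to SAS, SPSS, and R, three statistical packages commonly used in teaching, it discusses how to choose a normality test, with the aim of giving researchers a reference for that choice.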

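Testing for multiple outliers in the exponential distribution   Total citations: 2 (self-citations: 2, citations by others: 0)
Using sample quantiles to construct a test statistic, the article gives a method for detecting outliers in data from an exponentially distributed population. The exact probability density function of the test statistic and its approximate large-sample distribution are derived, which yields a simple approximate expression for the critical value. Because the sample quantile, the core statistic within the test statistic, is somewhat resistant to interference from outliers, the method detects them effectively.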

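The article first introduces the basic principle of R/S analysis and, on that basis, the modified R/S statistic. It then applies modified R/S analysis to the daily closing index series of the Shenzhen Component Index to test its memory characteristics, and concludes that both the daily absolute return series and the daily squared return series of the Shenzhen Component Index exhibit some long-range dependence.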
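
As a rough illustration of the basic principle, the classical rescaled-range (R/S) statistic for a single window can be sketched as follows. This covers only the classical statistic; the modified statistic mentioned in the abstract changes the denominator to account for short-range autocorrelation and is not shown. The function name `rescaled_range` is an illustrative assumption, not taken from the article.

```python
from itertools import accumulate
from statistics import fmean, pstdev


def rescaled_range(series):
    """Classical R/S statistic of one window of observations.

    R is the range of the cumulative deviations from the window mean;
    S is the population standard deviation of the window.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    mean = fmean(series)
    # Running sum of deviations from the mean (ends at roughly zero).
    cumulative = list(accumulate(x - mean for x in series))
    value_range = max(cumulative) - min(cumulative)
    scale = pstdev(series)
    if scale == 0:
        raise ValueError("series is constant, R/S is undefined")
    return value_range / scale


if __name__ == "__main__":
    print(rescaled_range([1, 2, 3, 4, 5]))
```

In a full analysis the statistic is typically computed over windows of several lengths, and the growth of R/S with window length is used to judge long-range dependence.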

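Testing for outliers in samples from the Type I minimum extreme-value distribution   Total citations: 1 (self-citations: 1, citations by others: 0)
A new test is proposed for multiple outliers in a sample from the Type I minimum extreme-value distribution. Robust estimators of the population parameters are found first, and a test statistic is constructed from them; the exact probability density function of the test statistic and its approximate large-sample distribution are then derived. Because the sample quantile, the core statistic within the test statistic, is somewhat resistant to interference from outliers, the method tests for them effectively.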

This article used the Wald test to evaluate the item‐level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G‐DINA model. Results show that when the sample size is small and a larger number of attributes are required, the Type I error rate of the Wald test for the DINA and DINO models can be higher than the nominal significance levels, while the Type I error rate of the A‐CDM is closer to the nominal significance levels. However, with larger sample sizes, the Type I error rates for the three models are closer to the nominal significance levels. In addition, the Wald test has excellent statistical power to detect when the true underlying model is none of the reduced models examined even for relatively small sample sizes. The performance of the Wald test was also examined with real data. With an increasing number of CDMs from which to choose, this article provides an important contribution toward advancing the use of CDMs in practical educational settings.

Confidence intervals (CIs) for parameters are usually constructed based on the estimated standard errors. These are known as Wald CIs. This article argues that likelihood-based CIs (CIs based on likelihood ratio statistics) are often preferred to Wald CIs. It shows how the likelihood-based CIs and the Wald CIs for many statistics and psychometric indexes can be constructed with the use of phantom variables (Rindskopf, 1984) in some of the current structural equation modeling (SEM) packages. The procedures to form CIs for the differences in correlation coefficients, squared multiple correlations, indirect effects, coefficient alphas, and reliability estimates are illustrated. A simulation study on the Pearson correlation is used to demonstrate the advantages of the likelihood-based CI over the Wald CI. Issues arising from this SEM approach and extensions of this approach are discussed.

Analyzing examinees’ responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study explored the effectiveness of the Wald test in detecting both uniform and nonuniform DIF in the DINA model through a simulation study. Results of this study suggest that for relatively discriminating items, the Wald test had Type I error rates close to the nominal level. Moreover, its viability was underscored by the medium to high power rates for most investigated DIF types when DIF size was large. Furthermore, the performance of the Wald test in detecting uniform DIF was compared to that of the traditional Mantel‐Haenszel (MH) and SIBTEST procedures. The results of the comparison study showed that the Wald test was comparable to or outperformed the MH and SIBTEST procedures. Finally, the strengths and limitations of the proposed method and suggestions for future studies are discussed.

Lord's Wald test for differential item functioning (DIF) has not been studied extensively in the context of the multidimensional item response theory (MIRT) framework. In this article, Lord's Wald test was implemented using two estimation approaches, marginal maximum likelihood estimation and Bayesian Markov chain Monte Carlo estimation, to detect uniform and nonuniform DIF under MIRT models. The Type I error and power rates for Lord's Wald test were investigated under various simulation conditions, including different DIF types and magnitudes, different means and correlations of two ability parameters, and different sample sizes. Furthermore, English usage data were analyzed to illustrate the use of Lord's Wald test with the two estimation approaches.

The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of the multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald tests using the MML-EM and MHRM estimation approaches with different test factors and test structures, in simulation studies and with real data sets. When the test structure included two dimensions, the IRT-LR (MML-EM) generally performed better than the Wald test and provided higher power rates. If the test included three dimensions, the methods provided similar performance in DIF detection. In contrast to these results, when the number of dimensions in the test was four, MML-EM estimation completely lost precision in estimating the nonuniform DIF, even with large sample sizes. The Wald test with the MHRM estimation approach outperformed the Wald test (MML-EM) and IRT-LR (MML-EM). The Wald test had a higher power rate and acceptable Type I error rates for nonuniform DIF with the MHRM estimation approach. Small and/or unbalanced sample sizes, small DIF magnitudes, unequal ability distributions between groups, number of dimensions, estimation methods, and test structure were evaluated as important test factors for detecting multidimensional DIF.

This study examined and compared various statistical methods for detecting individual differences in change. Considering 3 issues including test forms (specific vs. generalized), estimation procedures (constrained vs. unconstrained), and nonnormality, we evaluated 4 variance tests including the specific Wald variance test, the generalized Wald variance test, the specific likelihood ratio (LR) variance test, and the generalized LR variance test under both constrained and unconstrained estimation for both normal and nonnormal data. For the constrained estimation procedure, both the mixture distribution approach and the alpha correction approach were evaluated for their performance in dealing with the boundary problem. To deal with the nonnormality issue, we used the sandwich standard error (SE) estimator for the Wald tests and the Satorra–Bentler scaling correction for the LR tests. Simulation results revealed that testing a variance parameter and the associated covariances (generalized) had higher power than testing the variance solely (specific), unless the true covariances were zero. In addition, the variance tests under constrained estimation outperformed those under unconstrained estimation in terms of higher empirical power and better control of Type I error rates. Among all the studied tests, for both normal and nonnormal data, the robust generalized LR and Wald variance tests with the constrained estimation procedure were generally more powerful and had better Type I error rates for testing variance components than the other tests. Results from the comparisons between specific and generalized variance tests and between constrained and unconstrained estimation were discussed.

To date, no effective empirical method has been available to identify a truly invariant reference variable (RV) in testing measurement invariance under a multiple-group confirmatory factor analysis. This study proposes a method that, in selecting an RV, uses the smallest modification index (min-mod). The method’s performance is evaluated using 2 models: (a) a full invariance model, and (b) a partial invariance model. Results indicate that for both models the min-mod successfully identifies a truly invariant RV (Study 1). In Study 2, we use the RV found in Study 1 to further evaluate the performance of item-by-item Wald tests at locating a noninvariant variable. The results indicate that Wald tests overall performed better with an RV selected in a partial invariance model than an RV selected in a full invariance model, although in certain conditions their performances were rather similar. Implications and limitations of the study are also discussed.

Employing a Wald confidence interval to test hypotheses about population proportions could lead to an increase in Type I or Type II errors unless the hypothesized value, p0, is used in computing its standard error rather than the sample proportion.
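
The distinction between the two standard errors can be sketched in a few lines. This is an illustrative example under stated assumptions, not code from the article: the function name `proportion_z_statistics` and the sample numbers are hypothetical, and the large-sample normal approximation is assumed.

```python
from math import sqrt


def proportion_z_statistics(successes, n, p0):
    """Return (wald_z, null_z) for testing H0: p = p0.

    wald_z uses the sample proportion in the standard error;
    null_z uses the hypothesized value p0 instead.
    """
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")
    p_hat = successes / n
    diff = p_hat - p0
    # Standard error computed under the null hypothesis.
    se_null = sqrt(p0 * (1 - p0) / n)
    # Wald standard error; it collapses to zero when p_hat is 0 or 1.
    se_sample = sqrt(p_hat * (1 - p_hat) / n)
    wald_z = diff / se_sample if se_sample > 0 else None
    return wald_z, diff / se_null


if __name__ == "__main__":
    wald_z, null_z = proportion_z_statistics(successes=2, n=20, p0=0.3)
    print(f"Wald z = {wald_z:.3f}, p0-based z = {null_z:.3f}")
```

With 2 successes in 20 trials and p0 = 0.3, the Wald statistic is about -2.98 while the p0-based statistic is about -1.95, so at the two-sided 5% level (critical value 1.96) only the Wald version rejects. The two standard errors can therefore lead to different decisions on the same data.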

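Based on the questionnaire survey "Employment Expectations of Quanzhou University Students", conducted by Strait Metropolis Daily (《海峡都市报》) and the Quanzhou Wald (沃尔德) marketing research and consulting company, this article studies and analyzes employer brand building in Quanzhou's private enterprises from the perspective of Quanzhou university students' employment expectations, and offers four suggestions for these enterprises' future employer brand building.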

Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model. The power is related to the item response function (IRF) for the studied item, the latent trait distributions, and the sample sizes for the reference and focal groups. Simulation studies show that the theoretical values calculated from the formulas derived in the article are close to what is observed in the simulated data when the assumptions are satisfied. The robustness of the power formulas is studied with simulations when the assumptions are violated.
