Similar Articles
 20 similar articles found.
1.
Many statistics used in the assessment of differential item functioning (DIF) in polytomous items yield a single item-level index of measurement invariance that collapses information across all response options of the polytomous item. Utilizing a single item-level index of DIF can, however, be misleading if the magnitude or direction of the DIF changes across the steps underlying the polytomous response process. A more comprehensive approach to examining measurement invariance in polytomous item formats is to examine invariance at the level of each step of the polytomous item, a framework described in this article as differential step functioning (DSF). This article proposes a nonparametric DSF estimator that is based on the Mantel-Haenszel common odds ratio estimator (Mantel & Haenszel, 1959), which is frequently implemented in the detection of DIF in dichotomous items. A simulation study demonstrated that when the level of DSF varied in magnitude or sign across the steps underlying the polytomous response options, the DSF-based approach typically provided a more powerful and accurate test of measurement invariance than did corresponding item-level DIF estimators.
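The Mantel-Haenszel common odds ratio that underlies the proposed DSF estimator can be sketched in a few lines. This is a minimal stdlib-only illustration, not the article's implementation; the function name and the (a, b, c, d) stratum layout are assumptions:

```python
def mh_common_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio pooled across matching strata.

    tables: one 2x2 table per total-score stratum, given as
    (a, b, c, d) = (reference correct, reference incorrect,
                    focal correct, focal incorrect).
    """
    # Each stratum contributes a*d/N to the numerator and b*c/N to
    # the denominator, where N is the stratum's total count.
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den
```

A pooled value near 1 indicates invariance. In the DSF framework the same estimator is applied once per step of the polytomous item rather than once per item.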

2.
In this article, I address two competing conceptions of differential item functioning (DIF) in polytomously scored items. The first conception, referred to as net DIF, concerns between-group differences in the conditional expected value of the polytomous response variable. The second conception, referred to as global DIF, concerns the conditional dependence of group membership and the polytomous response variable. The distinction between net and global DIF is important because different DIF evaluation methods are appropriate for net and global DIF; no currently available method is universally the best for detecting both net and global DIF. Net and global DIF definitions are presented under two different, yet compatible, modeling frameworks: a traditional item response theory (IRT) framework, and a differential step functioning (DSF) framework. The theoretical relationship between the IRT and DSF frameworks is presented. Available methods for evaluating net and global DIF are described, and an applied example of net and global DIF is presented.
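The net-versus-global distinction can be illustrated numerically at a single matching level. The sketch below is a hypothetical helper, not the article's method; it simply contrasts a difference in conditional expected scores (net DIF) with any difference in the conditional category distributions (global DIF):

```python
def net_and_global_dif(p_ref, p_foc, scores):
    """Contrast net and global DIF at one trait level.

    p_ref, p_foc: response-category probabilities for the reference
    and focal groups, conditional on the same trait level.
    scores: the category scores, e.g. [0, 1, 2].
    """
    # Net DIF: difference in conditional expected scores.
    net = (sum(s * p for s, p in zip(scores, p_ref))
           - sum(s * p for s, p in zip(scores, p_foc)))
    # Global DIF: the conditional distributions differ at all,
    # even if the expected scores happen to coincide.
    has_global = any(abs(a - b) > 1e-12 for a, b in zip(p_ref, p_foc))
    return net, has_global
```

For example, category probabilities [0.2, 0.6, 0.2] versus [0.3, 0.4, 0.3] with scores [0, 1, 2] yield identical expected scores (no net DIF) yet clearly different distributions (global DIF), which is why no single method is best for both conceptions.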

3.
Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has recently been proposed, whereby measurement invariance is examined within each step underlying the polytomous response variable. The examination of DSF can provide valuable information concerning the nature of the DIF effect (i.e., is the DIF an item-level effect or an effect isolated to specific score levels), the location of the DIF effect (i.e., precisely which score levels are manifesting the DIF effect), and the potential causes of a DIF effect (i.e., what properties of the item stem or task are potentially biasing). This article presents a didactic overview of the DSF framework and provides specific guidance and recommendations on how DSF can be used to enhance the examination of DIF in polytomous items. An example with real testing data is presented to illustrate the comprehensive information provided by a DSF analysis.
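The step-level view can be made concrete: each of the m steps underlying a 0..m scored item is a dichotomization of the response, to which dichotomous DIF methods then apply. The helper below uses cumulative dichotomization as one common choice; the function name is an assumption and the article's exact step definition may differ:

```python
def step_indicators(scores, n_steps):
    """Cumulative dichotomizations underlying a polytomous item.

    For an item scored 0..m there are m steps; step s codes a
    response as 1 if score >= s, else 0, so dichotomous DIF
    methods (e.g., Mantel-Haenszel) can be applied per step.
    """
    return {s: [1 if x >= s else 0 for x in scores]
            for s in range(1, n_steps + 1)}
```

Examining each step's indicator separately reveals whether invariance holds at all score levels or breaks down only at specific ones.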

4.
As polytomous scoring becomes increasingly common in psychological and educational measurement, it poses new challenges for methods of detecting differential item functioning (DIF). Prior research has shown that MIMIC is an economical and effective DIF detection method, but no study has systematically examined its validity for polytomous items. Using a Monte Carlo simulation, this study manipulated five factors: the sample sizes of the reference and focal groups, the type of DIF, item discrimination, between-group ability differences, and the number of DIF items in the anchor set, and examined the Type I error rate and power of the MIMIC method across combinations of these factors. The findings were: (1) MIMIC is a sensitive method for detecting uniform DIF, and it controls the Type I error rate well even when the focal group sample is small or markedly smaller than the reference group; (2) a purification step is necessary for MIMIC to control Type I error and improve power, although the method tolerates a certain degree of anchor contamination; (3) power is severely reduced by low item discrimination, whereas very high discrimination inflates the Type I error rate; (4) the power of MIMIC to detect uniform DIF increases with sample size.

5.
This study addresses the topic of how anchoring methods for differential item functioning (DIF) analysis can be used in multigroup scenarios. The direct approach would be to combine anchoring methods developed for two-group scenarios with multigroup DIF-detection methods. Alternatively, multiple tests could be carried out. The results of these tests need to be aggregated to determine the anchor for the final DIF analysis. In this study, the direct approach and three aggregation rules are investigated. All approaches are combined with a variety of anchoring methods, such as the "all-other purified" and "mean p-value threshold" methods, in two simulation studies based on the Rasch model. Our results indicate that the direct approach generally does not lead to more accurate results than the aggregation rules, and in some cases leads to inferior ones. The min rule overall shows the best trade-off between a low false alarm rate and a medium to high hit rate. However, it might be too sensitive when the number of groups is large. In this case, the all rule may be a good compromise. We also take a closer look at the anchor selection method "next candidate," which performed rather poorly, and suggest possible improvements.

6.
This study examined the effect of sample size ratio and model misfit on the Type I error rates and power of the Difficulty Parameter Differences procedure using Winsteps. A unidimensional 30-item test with responses from 130,000 examinees was simulated and four independent variables were manipulated: sample size ratio (20/100/250/500/1000); model fit/misfit (1PL and 3PL with c = .15 models); impact (no difference/mean differences/variance differences/mean and variance differences); and percentage of items with uniform and nonuniform DIF (0%/10%/20%). In general, the results indicate the importance of ensuring model fit to achieve greater control of Type I error and adequate statistical power. The manipulated variables produced inflated Type I error rates, which were well controlled when a measure of DIF magnitude was applied. Sample size ratio also had an effect on the power of the procedure. The paper discusses the practical implications of these results.

7.
This study introduces DIF detection methods that can accommodate testlet effects, providing a more scientifically sound DIF analysis for passage-based reading tests. The GMH, P-SIBTEST, and P-LR methods were applied to the reading comprehension items of the Chinese Proficiency Test (HSK, Advanced). The three methods yielded highly consistent results, indicating that these items show no significant DIF effects with respect to gender or nationality. The study also compared traditional DIF detection methods with the adapted testlet-based DIF methods, and the results show that the latter have clear advantages.

8.
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF using a combination of the decisions made with BD and MH to those of LR. The results revealed that while the Type I error rate of CDR was consistently below the nominal alpha level, the Type I error rate of LR was high for the conditions having unequal ability distributions. In addition, the power of CDR was consistently higher than that of LR across all forms of DIF.
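The combined decision rule can be sketched as follows. The MH test is sensitive to uniform DIF and the BD test to nonuniform DIF, so flagging an item when either test is significant covers both forms. The function name, labels, and default alpha are illustrative assumptions, not the article's code:

```python
def combined_decision_rule(mh_p, bd_p, alpha=0.05):
    """Combined decision rule (CDR) sketch.

    mh_p: p-value of the Mantel-Haenszel test (uniform DIF).
    bd_p: p-value of the Breslow-Day test (nonuniform DIF).
    Returns a label for the kind of evidence found.
    """
    uniform = mh_p < alpha
    nonuniform = bd_p < alpha
    if uniform and nonuniform:
        return "both"
    if uniform:
        return "uniform"
    if nonuniform:
        return "nonuniform"
    return "none"
```

Because each component test targets one form of DIF, the combined rule can stay below the nominal alpha level while still detecting both uniform and nonuniform effects.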

9.
The purpose of this article is to describe and demonstrate a three-step process of using differential distractor functioning (DDF) in a post hoc analysis to understand sources of differential item functioning (DIF) in multiple-choice testing. The process is demonstrated on two multiple-choice tests that used complex alternatives (e.g., "No Mistakes") as distractors. Comparisons were made between different gender and race groups. DIF analyses were conducted using Simultaneous Item Bias Test, whereas DDF analyses were conducted using loglinear model fitting and odds ratios. Five items made it through all three steps and were identified as those with DIF results related to DDF. Implications of the results, as well as suggestions for future research, are discussed.

10.
11.
Common Problems in the Practical Application of DIF Analysis and New Research Advances (cited by 1)
Polytomously scored items, small samples, impure matching variables, and the post-hoc explanation of detected DIF are common problems facing DIF analysis. Recent advances include DSF analysis for polytomous items, smoothing methods for DIF detection with small samples, the MIMIC approach for impure matching variables, and the use of logistic models to investigate the causes of DIF after detection. These advances suggest that combining multiple detection methods, and using DIF research to explore latent variables within a multidimensional IRT framework, may make DIF research one of the foundational areas of measurement in the future.

12.
In many educational tests, both multiple-choice (MC) and constructed-response (CR) sections are used to measure different constructs. In many common cases, security concerns lead to the use of form-specific CR items that cannot be used for equating test scores, along with MC sections that can be linked to previous test forms via common items. In such cases, adjustment by minimum discriminant information may be used to link CR section scores and composite scores based on both MC and CR sections. This approach is an innovative extension that addresses the long-standing issue of linking CR test scores across test forms in the absence of common items in educational measurement. It is applied to a series of administrations from an international language assessment with MC sections for receptive skills and CR sections for productive skills. To assess the linking results, harmonic regression is applied to examine the effects of the proposed linking method on score stability, among several analyses for evaluation.

13.
14.
Two general item analysis indices which apply to multi-score items are developed as generalizations of a popular index applicable to dichotomous items. The indices of discrimination are of two types: one based on differential difficulty and the other on net number of positive discriminations. The usefulness and limitations of each are discussed.

15.
Delivery is an important component of e-commerce. Processing large volumes of delivery orders quickly and effectively on the basis of available network information is a key link in improving the overall efficiency of e-commerce. This article therefore proposes an efficient and practical method for processing delivery orders: it groups large numbers of delivery tasks sensibly according to currently valid routes, vehicle availability, and related information, thereby improving the overall efficiency of delivery operations in e-commerce.

16.
Adding auxiliary parallel lines and using the similar triangles produced by parallel transversals to chain equal ratios is the basic approach to segment-ratio problems in triangles. In principle, the method is to draw, through a division point of any segment in the triangle, a line parallel to a second segment, let it intersect the third segment, read off two proportions from the resulting A-shaped and Z-shaped configurations, and then chain the equal ratios. Because there may be several division points to choose from, the method is flexible and admits many solutions. This does not mean, however, that such problems are genuinely "multi-solution" problems, and one should not be misled into pursuing different solutions for their own sake. Although the solution methods are many, the underlying idea is one, so this kind of multiplicity neither cultivates divergent thinking nor has creative value. On the contrary, if the point is poorly chosen or the parallel line poorly drawn, either the computation becomes excessive or the argument becomes tangled. It is therefore necessary to find the optimal solution among the many and to summarize rules for reaching it quickly; this makes the topic easier for teachers to teach and for students to learn.

17.
Starting from common everyday instances of deliberately violating the law of identity, this article classifies and describes their forms of expression and the comic effects they produce, thereby illustrating the richness and variety of Chinese in actual use.

18.
Reception aesthetics (also called reception theory) is an approach to literary study that emerged after the 1960s. It takes the reader's reception and response as the center of inquiry, reversing the traditional method centered on authors and their works. This article attempts to apply reception theory to Chinese language teaching, exploring from a new vantage point ways to improve students' language proficiency.

19.
Raising the character-to-word ratio is an important way to reduce the difficulty of learning Chinese and Chinese characters. However, the definitions of "character," "word," and "character-to-word ratio" remain unclear, and many issues, such as overly simple research procedures and statistical methods, are still inadequately addressed in the literature. Responding to these shortcomings, this article clarifies the definition of the character-to-word ratio and its statistical method and, on that basis, takes the level 1-4 vocabulary of the new YCT Test Syllabus as its object of study. Through statistical analysis, it examines the character-to-word ratio of the new YCT syllabus and its distribution, identifies problems with the ratio and their causes, and provides a reference for the further improvement and application of the YCT Test Syllabus.

20.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号