Found 20 similar documents (search time: 109 ms)
1.
2.
3.
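Based on the two-parameter logistic model of item response theory, this study examines how examinee ability distribution and sample size affect equating under the common-item nonequivalent groups design. Equating was carried out under separate calibration with the item characteristic curve method, the Stocking-Lord method, and the Haebara method. The equating results were evaluated with three criteria: the standard error of equated scores, the standard error of the equating coefficients, and the stability of the common-item parameters. The results show that the closer the examinee ability distributions and the larger the sample size, the smaller the equating error; in addition, the Stocking-Lord method gives more stable equating results than the Haebara method.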
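The two characteristic curve criteria named in this abstract can be sketched as follows. This is a minimal illustration, not the study's procedure: it assumes the 2PL model without the D = 1.7 scaling constant, an unweighted grid of ability points, and the usual scale transformation a* = a / A, b* = A·b + B. The function names and the toy parameter values are hypothetical.

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL probabilities; theta has shape (Q,), a and b have shape (J,)."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def haebara_loss(A, B, a_new, b_new, a_old, b_old, theta):
    """Haebara criterion: squared item-level curve differences, summed."""
    p_old = p2pl(theta, a_old, b_old)
    p_new = p2pl(theta, a_new / A, A * b_new + B)  # new form on old scale
    return np.sum((p_old - p_new) ** 2)

def stocking_lord_loss(A, B, a_new, b_new, a_old, b_old, theta):
    """Stocking-Lord criterion: squared test characteristic curve differences."""
    tcc_old = p2pl(theta, a_old, b_old).sum(axis=1)
    tcc_new = p2pl(theta, a_new / A, A * b_new + B).sum(axis=1)
    return np.sum((tcc_old - tcc_new) ** 2)

# Toy common items (hypothetical values) on the old scale.
a_old = np.array([0.8, 1.2, 1.5])
b_old = np.array([-1.0, 0.0, 0.7])

# The same items expressed on a new scale linked by A = 1.2, B = -0.3.
A_true, B_true = 1.2, -0.3
a_new = a_old * A_true
b_new = (b_old - B_true) / A_true

theta = np.linspace(-4.0, 4.0, 41)  # assumed unweighted ability grid
args = (a_new, b_new, a_old, b_old, theta)
print(haebara_loss(A_true, B_true, *args))        # essentially zero
print(stocking_lord_loss(A_true, B_true, *args))  # essentially zero
print(haebara_loss(1.0, 0.0, *args))              # clearly positive
```

In practice either loss would be handed to a numerical optimizer to estimate A and B from noisy parameter estimates; the two criteria differ only in whether the differences are squared per item or after summing over items.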
4.
5.
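This paper uses the common-item nonequivalent groups design to compare five IRT-based item parameter equating methods. The data come from examinees in curriculum-reform experimental districts and non-reform districts of a pilot region in Hubei who took a standardized senior high school entrance examination in mathematics. The large-sample criterion, together with other criteria, served as the standard for comparing the equating methods, with the RMSD index as the operational criterion; the equating analysis was run with the STUIRT program. The results show that, for the equating situation set up in this study, the MS method is the least robust; for equating item difficulty parameters, concurrent calibration performs best, followed by the SL characteristic curve method; for equating item discrimination parameters, the MM method is the most accurate.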
6.
7.
8.
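As a standard and ideal of translation, equivalence is more precise and concrete than traditional translation standards; it helps quantify information and better handle issues of structure, meaning, culture, cognition, and pragmatics. This paper reviews research on translation equivalence by Chinese translation theorists since the 1980s and sorts out the contested issues surrounding it, such as the definition, feasibility, and relativity of equivalence. On that basis it discusses the implications for Chinese translation studies and the future direction of research on translation equivalence.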
9.
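On the Levels of Translation Equivalence for Culture (total citations: 2, self-citations: 0, citations by others: 2)
Xiao Qiang, 《内江师范学院学报》, 2004, 19(1): 52-55
A study of the translatability of culture, that is, of translatable, partially translatable, and untranslatable culture, shows that because countries differ in language habits and culture, translation equivalence can only be approximate. Translation equivalence for culture should first be achieved at the level of cultural semantics, and only then at the level of cultural linguistic form.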
10.
11.
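This study uses R-2.15.2 to simulate how the variance of anchor-test difficulty parameters affects test equating error, comparing different types of anchor-test difficulty variance under three equating methods (chained equipercentile equating, Levine equating, and Tucker equating). The results show that when the anchor test's difficulty variance is smaller than that of the total test, the random and systematic equating errors are essentially the same as when the two variances match (that is, when the anchor test is a parallel, shortened version of the total test, a minitest). Requiring the anchor test to have the same statistical specifications as the total test may therefore be overly strict.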
12.
13.
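New Thoughts on the Equating Design of the Chinese Proficiency Test (HSK) (total citations: 2, self-citations: 0, citations by others: 2)
ZHANG Jinjun, JING Libo, 《中国考试》, 2008, (8)
The Chinese Proficiency Test (HSK) has consistently applied equating over its many years of administration. In actual equating practice, HSK has run into new situations, and the old equating design has shown limitations and become hard to adapt. This paper proposes targeted new designs, such as pre-equating and cross-national equating, to address these new problems.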
14.
Tom Bramley 《Educational research; a review for teachers and all concerned with progress in education》2013,55(2):251-261
In setting the cut-scores on National Curriculum tests it is important to maintain standards. In the process of test development, both within and across years, changes are made to the style of the questions in order to increase their ‘accessibility’. This raises the question of whether a more accessible test should have higher cut-scores. Purely statistical definitions of equating are blind to differences between ‘accessibility’ and ‘easiness’ and cut-scores derived from statistical equating methods will be higher for a more accessible test. Arguments about the increased validity of the more accessible test are sometimes used to justify not raising the cut-scores as much as would be indicated by statistical methods. These arguments are shown to be equivalent to postulating that changing the accessibility is changing the construct measured by the test. Using a statistical measurement model can provide a rational basis for understanding accessibility and identifying types of question where accessibility issues are causing a measurement problem.
15.
Because parameter estimates from different calibration runs under the IRT model are linearly related, a linear equation can convert IRT parameter estimates onto another scale metric without changing the probability of a correct response (Kolen & Brennan, 1995, 2004). This study was designed to explore a new approach to finding a linear equation by fixing C-parameters for anchor items in IRT equating. A rationale for fixing C-parameters for anchor items in IRT equating can be established from the fact that the C-parameters are not affected by any linear transformation. This new approach can avoid the difficulty in getting accurate C-parameters for anchor items embedded in the application of the IRT model. Based upon our findings in this study, we would recommend using the new approach to fix C-parameters for anchor items in IRT equating.
This work was supported by a Korea Research Foundation Grant funded by the Korean Government (MOEHRD, Basic Research
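The invariance this abstract relies on can be checked numerically. The sketch below assumes the three-parameter logistic model without the D = 1.7 constant and the standard linear rescaling θ* = A·θ + B, a* = a / A, b* = A·b + B, with c left unchanged; the function name and all numeric values are illustrative only.

```python
import numpy as np

def p3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

A, B = 0.8, 0.5                     # hypothetical scale transformation
theta = np.linspace(-3.0, 3.0, 13)  # abilities on the original scale
a, b, c = 1.3, -0.2, 0.2            # hypothetical item parameters

p_original = p3pl(theta, a, b, c)
# Rescale abilities, discrimination, and difficulty; c is not transformed.
p_rescaled = p3pl(A * theta + B, a / A, A * b + B, c)

print(np.allclose(p_original, p_rescaled))  # True
```

The probabilities agree because (a / A)·((A·θ + B) − (A·b + B)) reduces to a·(θ − b), so only a and b carry the scale, which is the stated rationale for holding the C-parameters of anchor items fixed.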
16.
17.
18.
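Recent research on the DINA model shows that sample size, prior distributions, empirical Bayes versus fully Bayesian estimation, sample representativeness, differential item functioning, and Q-matrix misspecification can all bias DINA item parameter estimates. Using Monte Carlo simulation, this study examines the types of combined change and the amount of bias in the DINA item parameters (the guessing and slip parameters), with knowledge states estimated by conditional maximum likelihood. When the item parameter estimates deviate little from their true values, the accuracy of knowledge state estimation is barely affected; when they deviate substantially, especially under three of the combination types, attribute mastery is clearly overestimated or underestimated. The results have implications for equating diagnostic tests: if the item parameters of the anchor items on two tests show a sizable deviation (0.1), the need for equating should be considered.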
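For readers unfamiliar with the guessing and slip parameters, the standard DINA item response function can be sketched as below. An examinee who has mastered every attribute an item requires answers correctly with probability 1 − slip; otherwise the probability is the guessing parameter. The function name, the Q-matrix, and the parameter values are hypothetical and are not taken from the study.

```python
import numpy as np

def dina_prob(alpha, q, guess, slip):
    """Probability of a correct response on each item under DINA.

    alpha: (K,) 0/1 attribute mastery pattern of one examinee
    q:     (J, K) 0/1 Q-matrix (attributes required by each item)
    guess: (J,) guessing parameters; slip: (J,) slip parameters
    """
    # eta[j] is True when every attribute required by item j is mastered.
    eta = np.all(alpha >= q, axis=1)
    return np.where(eta, 1.0 - slip, guess)

q = np.array([[1, 0],
              [1, 1]])            # item 1 needs attribute 1; item 2 needs both
guess = np.array([0.2, 0.1])
slip = np.array([0.1, 0.2])

alpha = np.array([1, 0])          # masters attribute 1 only
print(dina_prob(alpha, q, guess, slip))  # item 1: 1 - slip, item 2: guess
```

Because each response probability is either the guessing parameter or one minus the slip parameter, a shift of about 0.1 in an anchor item's parameters moves the corresponding probability by the same amount.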
19.
《教育实用测度》2013,26(4):383-407
The performance of the item response theory (IRT) true-score equating method is examined under conditions of test multidimensionality. It is argued that a primary concern in applying unidimensional equating methods when multidimensionality is present is the potential decrease in equity (Lord, 1980) attributable to the fact that examinees of different ability are expected to obtain the same test scores. In contrast to equating studies based on real test data, the use of simulation in equating research not only permits assessment of these effects but also enables investigation of hypothetical equating conditions in which multidimensionality can be suspected to be especially problematic for test equating. In this article, I investigate whether the IRT true-score equating method, which explicitly assumes the item response matrix is unidimensional, is more adversely affected by the presence of multidimensionality than 2 conventional equating methods (linear and equipercentile equating), using several recently proposed equity-based criteria (Thomasson, 1993). Results from 2 simulation studies suggest that the IRT method performs at least as well as the conventional methods when the correlation between dimensions is high (≥ 0.7) and may be only slightly inferior to the equipercentile method when the correlation is moderate to low (≤ 0.5).
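Of the two conventional methods mentioned, linear equating is the simpler to illustrate. The sketch below assumes a random-groups setting in which form X scores are mapped to the form Y scale by matching means and standard deviations, l(x) = μ_Y + (σ_Y / σ_X)(x − μ_X); the function name and the score vectors are made up for illustration and are unrelated to the simulations in the article.

```python
import numpy as np

def linear_equate(x, scores_x, scores_y):
    """Map form X scores onto the form Y scale by matching mean and SD."""
    mu_x, sd_x = np.mean(scores_x), np.std(scores_x)
    mu_y, sd_y = np.mean(scores_y), np.std(scores_y)
    return mu_y + (sd_y / sd_x) * (np.asarray(x, dtype=float) - mu_x)

# Hypothetical observed scores on two forms.
scores_x = np.array([10, 12, 14, 16, 18])
scores_y = np.array([13, 16, 19, 22, 25])

print(linear_equate([14, 16], scores_x, scores_y))
```

Equipercentile equating replaces the mean and standard deviation match with a match of percentile ranks, so the conversion need not be a straight line.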
20.
Accurate equating results are essential when comparing examinee scores across exam forms. Previous research indicates that equating results may not be accurate when group differences are large. This study compared the equating results of frequency estimation, chained equipercentile, item response theory (IRT) true‐score, and IRT observed‐score equating methods. Using mixed‐format test data, equating results were evaluated for group differences ranging from 0 to .75 standard deviations. As group differences increased, equating results became increasingly biased and dissimilar across equating methods. Results suggest that the size of group differences, the likelihood that equating assumptions are violated, and the equating error associated with an equating method should be taken into consideration when choosing an equating method.