Similar literature
Found 20 similar documents; search took 15 ms.
1.
A goal for any linking or equating of two or more tests is that the linking function be invariant to the population used in conducting the linking or equating. Violations of population invariance in linking and equating jeopardize the fairness and validity of test scores, and pose particular problems for test‐based accountability programs that require schools, districts, and states to report annual progress on academic indicators disaggregated by demographic group membership. This instructional module provides a comprehensive overview of population invariance in linking and equating and the relevant methodology developed for evaluating violations of invariance. A numeric example is used to illustrate the comparative properties of available methods, and important considerations for evaluating population invariance in linking and equating are presented.

2.
The Non-Equivalent-groups Anchor Test (NEAT) design has been in wide use since at least the early 1940s. It involves two populations of test takers, P and Q, and makes use of an anchor test to link them. Two linking methods used for NEAT designs are those (a) based on chain equating and (b) that use the anchor test to post-stratify the distributions of the two operational test scores to a common population (i.e., Tucker equating and frequency estimation). We show that, under different sets of assumptions, both methods are observed score equating methods and we give conditions under which the methods give identical results. In addition, we develop analogues of the Dorans and Holland (2000) RMSD measures of population invariance of equating methods for the NEAT design for both chain and post-stratification equating methods.
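
As an illustration of the chain idea in the NEAT design, the following is a minimal sketch of chained linear equating: test X is linked to the anchor A using population P, and the anchor is linked to test Y using population Q. The abstract does not specify a functional form, so the linear links, the function names `linear_link` and `chained_linear`, and the use of population standard deviations are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def linear_link(from_scores: Sequence[float],
                to_scores: Sequence[float]) -> Callable[[float], float]:
    """Linear link that matches the mean and SD of from_scores to to_scores."""
    m_from, m_to = mean(from_scores), mean(to_scores)
    sd_from, sd_to = pstdev(from_scores), pstdev(to_scores)
    if sd_from == 0:
        raise ValueError("source scores have zero variance")
    slope = sd_to / sd_from
    return lambda s: m_to + slope * (s - m_from)


def chained_linear(x_p: Sequence[float], a_p: Sequence[float],
                   a_q: Sequence[float], y_q: Sequence[float]) -> Callable[[float], float]:
    """Hypothetical helper: compose X -> A (estimated in P) with A -> Y (estimated in Q)."""
    x_to_a = linear_link(x_p, a_p)  # link from X to the anchor, population P
    a_to_y = linear_link(a_q, y_q)  # link from the anchor to Y, population Q
    return lambda x: a_to_y(x_to_a(x))
```

Post-stratification methods (Tucker equating, frequency estimation) instead use the anchor to estimate both score distributions on a common population before equating; they are not shown here.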

3.
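The equating-to-pool (ETP) design is a new item-bank-based equating design for large-scale tests within the item response theory framework. Compared with traditional equating designs, it avoids some drawbacks of the common-group and common-item designs and can equate tests while maintaining equating precision. Because many large-scale examinations already have item banks, the ETP design has considerable room for development.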

4.
The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different administrations of tests from a licensure testing program. We investigated the chained linear, Tucker, Levine, and mean equating methods, along with the identity and the synthetic functions with small samples (N = 19 to 70). The synthetic function did not perform as well as did other linear equating methods because test forms differed markedly in difficulty; thus, the use of the identity function produced substantial bias. The effectiveness of the synthetic function depended on the forms' similarity in difficulty.
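
The weighted average described above can be sketched in a few lines. This is an illustrative sketch only: the name `synthetic_function`, and the convention that `weight` applies to the equating function while `1 - weight` applies to the identity, are assumptions and are not taken from the study.

```python
from typing import Callable


def synthetic_function(equate: Callable[[float], float],
                       weight: float) -> Callable[[float], float]:
    """Return s(x) = weight * equate(x) + (1 - weight) * x.

    weight = 1 reproduces the equating function; weight = 0 reproduces
    the identity function (assumed convention).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    def synthetic(x: float) -> float:
        return weight * equate(x) + (1.0 - weight) * x

    return synthetic
```

With a hypothetical equating function e(x) = 1.1x + 2 and a weight of 0.5, a raw score of 10 maps to 0.5 × 13 + 0.5 × 10 = 11.5. When forms differ markedly in difficulty, the identity component pulls the result away from the equated score, which is consistent with the bias the abstract reports.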

5.
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically, we compared the synthetic, identity, and chained linear functions for various‐sized samples from two types of national assessments. One design used a highly reliable test and an external anchor, and the other used a relatively low‐reliability test and an internal anchor. The results from each of these methods were compared to the criterion equating function derived from the total samples with respect to linking bias and error. The study indicated that the synthetic functions might be a better choice than the chained linear equating method when samples are not large and, as a result, unrepresentative.

6.
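This study examined the population invariance of three equating methods (mean equating, linear equating, and equipercentile equating) in a single-group design. The data came from China's Chinese Proficiency Test (Hanyu Shuiping Kaoshi), and examinees were divided into subgroups by gender. The results show that, for mean equating and linear equating, the equating conversions for the subgroups and for the total group were very close, indicating good population invariance; for equipercentile equating, the results for the subgroups and the total group differed considerably, indicating poor population invariance.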
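
For reference, two of the three methods named above have simple closed forms in a single-group design, where the same examinees take both forms. The sketch below uses the standard textbook definitions (mean equating shifts by the difference in means; linear equating also matches standard deviations); the function names are hypothetical and the study's own computations may differ in detail. Equipercentile equating is omitted because it requires smoothing and continuization choices that the abstract does not describe.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def mean_equate(x_scores: Sequence[float],
                y_scores: Sequence[float]) -> Callable[[float], float]:
    """Mean equating: m(x) = x - mean(X) + mean(Y)."""
    shift = mean(y_scores) - mean(x_scores)
    return lambda x: x + shift


def linear_equate(x_scores: Sequence[float],
                  y_scores: Sequence[float]) -> Callable[[float], float]:
    """Linear equating: l(x) = mean(Y) + (sd(Y) / sd(X)) * (x - mean(X))."""
    m_x, m_y = mean(x_scores), mean(y_scores)
    sd_x, sd_y = pstdev(x_scores), pstdev(y_scores)
    if sd_x == 0:
        raise ValueError("form X scores have zero variance")
    slope = sd_y / sd_x
    return lambda x: m_y + slope * (x - m_x)
```

To probe population invariance in the way the abstract describes, one would fit these functions separately on each gender subgroup and on the total group and compare the resulting conversions.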

7.
8.
A resampling study was conducted to compare the statistical bias and standard errors of nonequivalent-groups linear test equating in small samples of examinees. Sample sizes of 15, 25, 50, and 100 were examined. One thousand samples of each size were drawn with replacement from each of 5 archival data files from teacher subject area tests. For each test, data files from 2 parallel forms were used. Results suggest trivial levels of equating bias even with small samples, but substantial increases in standard errors as sample size decreases. Results were interpreted in terms of applications to testing situations in which small numbers of examinees are available.
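
The resampling procedure can be sketched generically. The following is a simplified illustration, not the study's actual code: `resample_bias_se`, its parameters, and the idea of passing the equating method in as a `fit` callback are assumptions. The criterion function stands for the equating computed from the full data file.

```python
import random
from statistics import mean, pstdev
from typing import Callable, Sequence, Tuple

Equating = Callable[[float], float]


def resample_bias_se(records: Sequence,
                     fit: Callable[[Sequence], Equating],
                     criterion: Equating,
                     score: float,
                     n: int,
                     reps: int = 1000,
                     seed: int = 0) -> Tuple[float, float]:
    """Estimate bias and standard error of an equated score by resampling.

    records   : examinee records from an archival file
    fit       : hypothetical callback mapping a sample to an equating function
    criterion : equating function treated as the criterion (e.g. full-sample)
    score     : raw score at which bias and standard error are evaluated
    n         : size of each sample, drawn with replacement
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = rng.choices(records, k=n)  # sampling with replacement
        estimates.append(fit(sample)(score))
    bias = mean(estimates) - criterion(score)
    standard_error = pstdev(estimates)  # spread of the equated score across samples
    return bias, standard_error
```

Running this for n = 15, 25, 50, and 100 at each raw score would give the kind of bias and standard-error comparison the abstract summarizes.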

9.
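Defining the Concept of Information from the Perspective of Invariants   Total citations: 2; self-citations: 0; citations by others: 2
The greatest shortcoming of Shannon's pragmatic definition of information is that its quantitative description depends on the receiver's subjective state of knowledge. This is seriously at odds with the objectivity that information has as a basic attribute of the material world, and the definition also cannot give a consistent account of the many different uses of information. If the concept of information is compared with concepts such as matter, mass, and energy, then information is the invariant in the processes of motion and change of matter. Moreover, this way of defining information has already shown significant value in discussions of its nature, its measurement, and its conservation.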

10.
How does the fact that two tests should not be equated manifest itself? This paper addresses this question through the study of the degree to which equating functions fail to exhibit population invariance across subpopulations. Equating functions are supposed to be population invariant by definition. But, when two tests are not equatable, it is possible that the linking functions, used to connect the scores of one to the scores of the other, are not invariant across different populations of examinees. While no acceptable equating function is ever completely population invariant, in the situations where equating is usually performed we believe that the dependence of the equating function on the population used to compute it is usually small enough to be ignored. We introduce two root‐mean‐square difference measures of the degree to which the functions used to link two tests computed on different subpopulations differ from the linking function computed for the whole population. We also introduce the system of “parallel‐linear” linking functions for multiple subpopulations and show that, for this system, our measure of population invariance can be computed easily from the standardized mean differences between the scores of the subpopulations on the two tests. For the parallel‐linear case, we develop a correlation‐based upper bound on our measure that holds for all systems of subpopulations. We illustrate these ideas using data from the SAT I and from a concordance study of several combinations of ACT and SAT I scores. In the appendices, we give some theoretical results bearing on the other equating “requirements” of “same construct,” “same reliability,” and one aspect of Lord's concept of equity.
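
The abstract does not give formulas for the two root-mean-square difference measures. The sketch below follows the form in which they are commonly stated: a pointwise RMSD(x), the weighted root-mean-square gap between subpopulation links and the total-population link at score x, divided by the standard deviation of Y in the total population; and a summary REMSD that averages the squared gaps over the distribution of X before taking the square root. The function names, argument layout, and weighting by subpopulation proportions are assumptions for illustration.

```python
import math
from typing import Callable, Sequence

Link = Callable[[float], float]


def _weighted_sq_gap(x: float, sub_links: Sequence[Link],
                     weights: Sequence[float], total_link: Link) -> float:
    """Sum over subpopulations of w_j * (e_j(x) - e(x))^2."""
    return sum(w * (link(x) - total_link(x)) ** 2
               for link, w in zip(sub_links, weights))


def rmsd_at(x: float, sub_links: Sequence[Link], weights: Sequence[float],
            total_link: Link, sd_y: float) -> float:
    """Pointwise measure at score x, in units of the SD of Y."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("subpopulation weights must sum to 1")
    return math.sqrt(_weighted_sq_gap(x, sub_links, weights, total_link)) / sd_y


def remsd(x_values: Sequence[float], x_probs: Sequence[float],
          sub_links: Sequence[Link], weights: Sequence[float],
          total_link: Link, sd_y: float) -> float:
    """Summary measure: expectation of the squared gaps over the X distribution."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("subpopulation weights must sum to 1")
    expected = sum(p * _weighted_sq_gap(x, sub_links, weights, total_link)
                   for x, p in zip(x_values, x_probs))
    return math.sqrt(expected) / sd_y
```

For two equally weighted subpopulations whose links sit one point above and one point below the total-population link, with a Y standard deviation of 2, both measures equal 0.5.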

11.
This article presents a method for evaluating equating results. Within the kernel equating framework, the percent relative error (PRE) for chained equipercentile equating was computed under the nonequivalent groups with anchor test (NEAT) design. The method was applied to two data sets to obtain the PRE, which can be used to measure equating effectiveness. The study compared the PRE results for chained and poststratification equating. The results indicated that the chained method transformed the new form score distribution to the reference form scale more effectively than the poststratification method. In addition, the study found that in chained equating, the population weight had an impact on score distributions over the target population but not on the equating and PRE results.
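
The PRE is not defined in the abstract. In the kernel equating literature it is usually given as the percent difference between the p-th raw moment of the equated new-form scores and the p-th raw moment of the reference-form scores. The sketch below assumes that definition and discrete score distributions; the function names and arguments are hypothetical.

```python
from typing import Sequence


def raw_moment(scores: Sequence[float], probs: Sequence[float], p: int) -> float:
    """p-th raw moment of a discrete score distribution."""
    return sum(pr * s ** p for s, pr in zip(scores, probs))


def percent_relative_error(equated_scores: Sequence[float],
                           x_probs: Sequence[float],
                           y_scores: Sequence[float],
                           y_probs: Sequence[float],
                           p: int) -> float:
    """PRE(p) = 100 * (mu_p(e(X)) - mu_p(Y)) / mu_p(Y).

    equated_scores : new-form scores after equating to the reference scale
    x_probs        : probabilities of the new-form scores in the target population
    y_scores       : reference-form scores
    y_probs        : probabilities of the reference-form scores
    """
    mu_equated = raw_moment(equated_scores, x_probs, p)
    mu_y = raw_moment(y_scores, y_probs, p)
    return 100.0 * (mu_equated - mu_y) / mu_y
```

Values near zero for the first several moments indicate that the equated distribution closely matches the reference distribution, which is the sense of "equating effectiveness" used in the abstract.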

12.
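Professor Yu Xianhao's new book, 《李白与唐代文史考论》 (Studies on Li Bai and the Literature and History of the Tang Dynasty), was published in January 2008 by Nanjing Normal University Press as a volume in the 随园文库 (Suiyuan Library) series. The book is divided into three volumes. Volume 1, 《李白丛考》 (Collected Textual Studies on Li Bai), gathers 30 articles of textual research on Li Bai written by the author since the 1970s. They cover Li Bai's life and deeds, travels, social circle, the dating of his works, and the recovery of lost works and identification of spurious ones; they also correct earlier scholars' errors in Li Bai studies and redraw the outline of Li Bai's life. Volume 2, 《李白论稿》 (Essays on Li Bai), gathers 30 theoretical and review articles on Li Bai,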

13.
Four equating methods (3PL true score equating, 3PL observed score equating, beta 4 true score equating, and beta 4 observed score equating) were compared using four equating criteria: first-order equity (FOE), second-order equity (SOE), conditional-mean-squared-error (CMSE) difference, and the equipercentile equating property. True score equating more closely achieved estimated FOE than observed score equating when the true score distribution was estimated using the psychometric model that was used in the equating. Observed score equating more closely achieved estimated SOE, estimated CMSE difference, and the equipercentile equating property than true score equating. Among the four equating methods, 3PL observed score equating most closely achieved estimated SOE and had the smallest estimated CMSE difference, and beta 4 observed score equating was the method that most closely met the equipercentile equating property.

14.
This study examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b) a mix of MC and CR. In each anchor condition, the linking functions were also derived separately for males and females, and those subpopulation functions were compared to the total group function. In the MC-only condition, the difference between the subpopulation functions and the total group function was not trivial in a score region that included cut scores, leading to inconsistent pass/fail decisions for low-performing examinees in particular. Overall, the mixed anchor was a better choice than the MC-only anchor to achieve subpopulation invariance between males and females. The research reinforces subpopulation invariance indices as a means of determining the adequacy of the anchor.

15.
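The HSK is a national standardized test established to assess the Chinese proficiency of non-native speakers of Chinese (including foreigners and overseas Chinese). The MHK is a national standardized test designed specifically to assess the Chinese proficiency of learners from China's ethnic minorities whose native language is not Chinese. Both the HSK and the MHK are certification tests. If the standard for awarding certificates lacks stability and fairness, so that one standard applies to those who take one test form and another standard applies to those who take a different form, then not only will the reliability and validity of the HSK be greatly affected, but related decisions will be misled and examinees will be treated unfairly. Throughout the development and administration of the HSK and MHK, statistical equating of test scores has been maintained. For the equating design, we used a mixed design combining common-group equating, common-item equating, and split-half combination. For the equating data processing, we used a combination of linear equating, equipercentile equating, and IRT equating. This paper describes the equating methods used for the HSK and MHK, discusses the strengths and weaknesses of each method, and considers possibilities for further improvement.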

16.
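River basins are one type of physical-geographic region. Historical river-basin problems are bounded by a specific basin space and have a three-dimensional character formed by time and space. Such problems are varied, but they share some basic attributes: they are systematic, specific, and differentiated. Historical river-basin studies (历史流域学) takes historical river-basin problems as its object of research. It is a branch of regional historical geography, which falls under both regional geography and historical geography. Its formation and development have a certain inevitability, and it has now begun to take shape as an independent branch discipline. Given the essential attributes of historical river-basin problems, research in the field should draw on advanced theories and methods and uphold a scientific view of academic research. It should center on the analysis of human-land relations in historical periods and uphold the scientific concept of sustainable development, the concepts of holistic and integrated research, and the concepts of multidisciplinary and interdisciplinary research.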

17.
18.
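The book 《人口与历史》 (Population and History) has three major strengths worth affirming. First, its long-duration research perspective lets readers look past temporary, superficial, and accidental events and discover the overall course of history from a "big history" viewpoint. Second, its structuralist method of analysis offers a vivid, multidimensional picture of the relationship between population and history. Third, it shows solid skill in analyzing and verifying historical sources.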

19.
20.
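Educational Growth and Educational Development: History, Concepts, and Policy   Total citations: 2; self-citations: 0; citations by others: 2
Over the past 60 years, development theory has shifted from economic growth to economic development, and then to the comprehensive, coordinated, and sustainable development of the economy, society, people, and nature. Education in China is facing major reform and major development. In this process, people easily mistake educational growth (growth in enrollment, longer years of schooling, and improved educational resources and conditions) for educational development, so that educational development risks falling into the traps of GDP-ization, industrialization, marketization, and materialization. Educational development, by contrast, means the improvement of the education system's own functioning and its flexible satisfaction of society's educational needs. It comprises eight categories: the material basis of education, its structure, its functions, the quantity of educational opportunities, the distribution of educational opportunities, changes in the quality of those educated, the state of society's human resources, and the impact on society. To promote educational development, existing paths of educational development should now be reviewed under the guidance of a scientific outlook on educational development, and the value standards, knowledge base, policy domains, and policy instruments of education policy should be adjusted.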

