Similar articles (20 results)
1.
This article introduces longitudinal multistage testing (lMST), a special form of multistage testing (MST), as a method for adaptive testing in longitudinal large‐scale studies. In lMST designs, test forms of different difficulty levels are used, and scores on a pretest determine the routing to these test forms. Because lMST allows for testing in paper-and-pencil mode, it may represent an alternative to conventional testing (CT) in assessments for which other adaptive testing designs are not applicable. In this article, the performance of lMST is compared to CT in terms of test targeting as well as bias and efficiency of ability and change estimates. A simulation study investigated the effects of the stability of ability across waves, the difficulty level of the different test forms, and the number of link items between the test forms.

2.
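Educational measurement technology based on cognitive diagnostic computerized adaptive testing (CD-CAT) can support students' personalized learning and helps teachers tailor instruction to individual students. China has begun developing and using CD-CAT-based educational support systems, but a gap remains relative to other countries and regions. Future directions for educational measurement in China include expanding the pool of educational measurement professionals, strengthening innovative theoretical research on CD-CAT and its practical application, and developing intelligent learning systems that are better suited to individuals and more open and flexible.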

3.
The purpose of this paper is to describe the logic and identify key assumptions associated with making cognitive inferences using two attribute-based psychometric methods. The first method is Kikumi Tatsuoka's rule-space model. This model provides a strong point of reference for studying the nature of diagnostic inferences because it is important in the evolution of skills diagnostic testing and it is well documented. The second method is a new procedure called the attribute hierarchy method that was developed from the rule-space approach. Although the attribute hierarchy method shares many commonalities with rule space, it represents an extension by including an attribute hierarchy that serves as an explicit cognitive model of task performance designed to link psychometric practices with contemporary cognitive theories. In this paper, we describe and compare these two attribute-based psychometric methods and identify new directions for research and practice in skills diagnostic testing.

4.
This paper proposes two new item selection methods for cognitive diagnostic computerized adaptive testing: the restrictive progressive method and the restrictive threshold method. They are built upon the posterior weighted Kullback‐Leibler (KL) information index but include additional stochastic components either in the item selection index or in the item selection procedure. Simulation studies show that both methods are successful at simultaneously suppressing overexposed items and increasing the usage of underexposed items. Compared to item selection based upon (1) pure KL information and (2) the Sympson‐Hetter method, the two new methods strike a better balance between item exposure control and measurement accuracy. The two new methods are also compared with Barrada et al.'s (2008) progressive method and proportional method.

5.
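This paper introduces the purposes and methods of software testing within the software life cycle and discusses the use of equivalence partitioning to design test cases.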

6.
A potential undesirable effect of multistage testing is differential speededness, which happens if some of the test takers run out of time because they receive subtests with items that are more time intensive than others. This article shows how a probabilistic response-time model can be used for estimating differences in time intensities and speed between subtests and test takers and detecting differential speededness. An empirical data set for a multistage test in the computerized CPA Exam was used to demonstrate the procedures. Although the more difficult subtests appeared to have items that were more time intensive than the easier subtests, an analysis of the residual response times did not reveal any significant differential speededness because the time limit appeared to be appropriate. In a separate analysis, within each of the subtests, we found minor but consistent patterns of residual times that are believed to be due to a warm-up effect, that is, test takers spending more time on the initial items than they actually need.

7.
The purpose of this study was to evaluate the adequacy of three cognitive models, one developed by content experts and two generated from student verbal reports, for explaining examinee performance on a grade 3 diagnostic mathematics test. For this study, the items were developed to directly measure the attributes in the cognitive model. The performance of each cognitive model was evaluated by examining its fit to different data samples (verbal report, total, high-, moderate-, and low-ability) using the Hierarchy Consistency Index (Cui & Leighton, 2009), a model-data fit index. This study utilized cognitive diagnostic assessments developed under the framework of construct-centered test design and analyzed using the Attribute Hierarchy Method (Gierl, Wang, & Zhou, 2008; Leighton, Gierl, & Hunka, 2004). Both the expert-based and the student-based cognitive models provided excellent fit to the verbal report and high-ability samples, but moderate to poor fit to the total, moderate-, and low-ability samples. Implications for cognitive model development for cognitive diagnostic assessment are discussed.

8.
The purpose of this study is to show the usefulness of cognitive diagnoses for remedial instruction. Cognitive diagnoses were done by an adaptive testing system using the rule-space methodology, which was developed by K. K. Tatsuoka and her associates (K. K. Tatsuoka, 1983, 1990; K. K. Tatsuoka & M. M. Tatsuoka, 1987; M. M. Tatsuoka & K. K. Tatsuoka, 1989). The results of the study strongly indicate that knowing students' knowledge states prior to remediation is very effective and that the rule-space method can effectively diagnose students' knowledge states and can point out ways of remediating their errors quickly with minimum effort. It is also found that the design of instructional units for remediation can be effectively guided by the rule-space model, because the determination of all possible knowledge states in a domain of interest, given an incidence matrix, is based on a partially ordered tree structure of knowledge states, which is equivalent to item-score patterns determined logically from the incidence matrix.

9.
The development of cognitive diagnostic‐computerized adaptive testing (CD‐CAT) has provided a new perspective for gaining information about examinees' mastery on a set of cognitive attributes. This study proposes a new item selection method within the framework of dual‐objective CD‐CAT that simultaneously addresses examinees' attribute mastery status and overall test performance. The new procedure is based on the Jensen‐Shannon (JS) divergence, a symmetrized version of the Kullback‐Leibler divergence. We show that the JS divergence resolves the noncomparability problem of the dual information index and has close relationships with Shannon entropy, mutual information, and Fisher information. The performance of the JS divergence is evaluated in simulation studies in comparison with the methods available in the literature. Results suggest that the JS divergence achieves parallel or more precise recovery of latent trait variables compared to the existing methods and maintains practical advantages in computation and item pool usage.
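
For readers unfamiliar with the quantity named in this abstract, the following is a minimal sketch of the textbook two-distribution Jensen-Shannon divergence, built from the Kullback-Leibler divergence. It is not the item selection index proposed in the study, whose exact form the abstract does not give. The function names `kl_divergence` and `js_divergence` are illustrative, and natural logarithms are assumed.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Terms with p_i == 0 contribute nothing. Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence of p and q
    from their midpoint mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Unlike the KL divergence, this quantity is symmetric in its two arguments and, with natural logarithms, never exceeds ln 2, so values computed for different pairs of distributions lie on a common bounded scale.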

10.
Multistage tests are those in which sets of items are administered adaptively and are scored as a unit. These tests have all of the advantages of adaptive testing, with more efficient and precise measurement across the proficiency scale as well as time savings, without many of the disadvantages of an item-level adaptive test. As a seemingly balanced compromise between linear paper-and-pencil and item-level adaptive tests, development and use of multistage tests are increasing. This module describes multistage tests, including two-stage and testlet-based tests, and discusses the relative advantages and disadvantages of multistage testing as well as considerations and steps in creating such tests.
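
The idea of scoring a set of items as a unit and then adapting can be illustrated with a minimal two-stage routing sketch. Everything in it is an assumption made for illustration: the number-correct scoring rule, the module labels, and the cut scores `cut_low` and `cut_high` are hypothetical and are not taken from the module described above.

```python
def number_correct(responses):
    """Score a routing module as a unit: count of correct (1) responses."""
    return sum(responses)

def route(routing_score, cut_low=3, cut_high=6):
    """Pick the second-stage module from the routing-module score.

    cut_low and cut_high are hypothetical cut scores: scores at or below
    cut_low go to the easy module, scores at or above cut_high go to the
    hard module, and everything in between goes to the medium module.
    """
    if routing_score <= cut_low:
        return "easy"
    if routing_score >= cut_high:
        return "hard"
    return "medium"
```

For example, an examinee who answers 7 of 8 routing items correctly would be sent to the "hard" module under these assumed cut scores. Operational designs typically set routing rules from item response theory information rather than fixed number-correct cuts.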

11.
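By classifying the features of generic reference expressed by the "article + noun (adjective)" construction, this paper examines in depth how articles function in expressing class within noun phrases. The generic features of articles are closely related to their basic properties; in the "the + adjective" construction in particular, the article nominalizes the adjective and gives it class-denoting semantic features.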

12.
Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait (θ) estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence, questions remain as to how to calibrate items using the incomplete data from MST design. Further complication arises when there are multiple correlated subscales per test, and when items from different subscales need to be calibrated according to their respective score reporting metric. The current calibration-per-subscale method produces biased item parameters, and there is no available method for resolving the challenge. Drawing on missing data principles, we show that when all items are calibrated together, Rubin's ignorability assumption is satisfied, so the traditional single-group calibration is sufficient. For calibrating items per subscale, we propose a simple modification to the current calibration-per-subscale method that helps reinstate the missing-at-random assumption and therefore corrects for the estimation bias that would otherwise be present. Three mainstream calibration methods are discussed in the context of MST: marginal maximum likelihood estimation, the expectation maximization method, and fixed parameter calibration. An extensive simulation study is conducted and a real data example from NAEP is analyzed to provide empirical evidence.

13.
This article describes a computerized adaptive test (CAT) based on the uniform item exposure multi-form structure (uMFS). The uMFS is a specialization of the multi-form structure (MFS) idea described by Armstrong, Jones, Berliner, and Pashley (1998). In an MFS CAT, the examinee first responds to a small fixed block of items. The items comprising that block may be unrelated to each other, or they may comprise a testlet (Wainer and Kiely, 1987). After the first block of items has been administered, adaptation takes place in the choice of the next block to be administered and subsequent blocks. The uMFS design integrates item exposure control, as well as content balancing and other test development needs, into the design of the CAT, instead of placing those activities in the online implementation. We show that it is possible to implement item exposure control, in a very thorough way, in the psychometric specifications of the item blocks.

14.
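Computerized multistage testing (MST) consists mainly of stages and modules. Within this framework, MST not only retains the advantages of adaptive testing but also makes use of expert judgment. This paper introduces the characteristics and choice of MST structures and reviews the related research on MST structures. Future research should pay more attention to comparing structures under influencing factors such as the item pool and the examinee ability distribution, and should examine in depth the basic structures of MST for classification testing and of multidimensional MST.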

15.
This article reviews four interrelated approaches to reducing an inequitable gap in cognitive and educational test scores between individuals of a dominant culture and individuals of other cultures or subcultures. These approaches include (a) use of broader measures, (b) performance- and project-based assessments, (c) direct measurement of knowledge and skills relevant to environmental adaptation, and (d) dynamic assessment. It is concluded that when appropriate assessment is done that recognizes students’ diverse cultural and social backgrounds, equity can increase, predictive validity of cognitive and educational tests can increase, and at the same time, racial/ethnic/cultural differences can decrease.

16.
Cognitive diagnosis models (CDMs) have been developed to evaluate the mastery status of individuals with respect to a set of defined attributes or skills that are measured through testing. When individuals are repeatedly administered a cognitive diagnosis test, a new class of multilevel CDMs is required to assess the changes in their attributes and simultaneously estimate the model parameters from the different measurements. In this study, the most general CDM of the generalized deterministic input, noisy “and” gate (G‐DINA) model was extended to a multilevel higher order CDM by embedding a multilevel structure into higher order latent traits. A series of simulations based on diverse factors was conducted to assess the quality of the parameter estimation. The results demonstrate that the model parameters can be recovered fairly well and attribute mastery can be precisely estimated if the sample size is large and the test is sufficiently long. The range of the location parameters had opposing effects on the recovery of the item and person parameters. Ignoring the multilevel structure in the data by fitting a single‐level G‐DINA model decreased the attribute classification accuracy and the precision of latent trait estimation. The number of measurement occasions had a substantial impact on latent trait estimation. Satisfactory model and person parameter recoveries could be achieved even when assumptions of the measurement invariance of the model parameters over time were violated. A longitudinal basic ability assessment is outlined to demonstrate the application of the new models.

17.
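Based on the current state of diagnostic testing in English, this paper discusses the significance of diagnostic testing, its types and formats, the principles for constructing such tests, and the steps in administering them, with the aim of enabling timely intervention in students' English learning.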

18.
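Cognitive diagnostic tests can reveal test takers' knowledge structures and their mastery of individual subskills, providing them with detailed feedback. This paper briefly introduces the principles and steps of cognitive diagnosis, summarizes the progress made in cognitive diagnostic research in English language testing in China and abroad, and points out that the problems remaining in this field far outweigh the achievements to date. Future research needs to design cognitive diagnostic tests in the strict sense, explore multiple methods for validating the Q-matrix, and conduct empirical studies on whether diagnostic results promote learning.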

19.
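Yu Na (余娜), Xin Tao (辛涛). 《考试研究》, 2009(3): 22-34.
Cognitive diagnostic theory is a new generation of measurement theory built on item response theory, and it has broad prospects for application in educational measurement practice. Research on diagnostic theory centers on three areas: proposing diagnostic models, evaluating the models' diagnostic performance, and reporting the models' diagnostic results. Progress in these three areas has deepened the theoretical development of diagnostic models and broadened their range of application, but further research is needed on the external validity of the models, group-level diagnostic results, model selection and comparison, diagnostic models for polytomous items, and equating across different diagnostic tests.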

20.
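Cognitive diagnostic assessment is an emerging technique in educational and psychological measurement. Building on traditional examinations, it provides students with diagnostic information, that is, information about their mastery of knowledge and skills, and has important practical value for teaching. Taking cognitive diagnostic assessment as its subject, this paper focuses on analyzing the basic workflow for carrying out cognitive diagnostic assessment, with the aim of providing a reference for conducting such assessment in education in China.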
