Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The top‐down approach to designing a multistage test is relatively understudied in the literature and underused in research and practice. This study introduced a route‐based top‐down design approach that sets design parameters directly at the test level and uses an advanced automated test assembly algorithm that seeks global optimality. The design process in this approach consists of five sub‐processes: (1) route mapping, (2) setting objectives, (3) setting constraints, (4) routing error control, and (5) test assembly. Results from a simulation study confirmed that the assembly, measurement, and routing results of the top‐down design eclipsed those of the bottom‐up design. Additionally, the top‐down design approach provided unique insights into design decisions that could be used to refine the test. Despite these advantages, it is recommended to apply both top‐down and bottom‐up approaches in a complementary manner in practice.

2.
Analyzing examinees’ responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study explored the effectiveness of the Wald test in detecting both uniform and nonuniform DIF in the DINA model through a simulation study. Results of this study suggest that for relatively discriminating items, the Wald test had Type I error rates close to the nominal level. Moreover, its viability was underscored by the medium to high power rates for most investigated DIF types when DIF size was large. Furthermore, the performance of the Wald test in detecting uniform DIF was compared to that of the traditional Mantel‐Haenszel (MH) and SIBTEST procedures. The results of the comparison study showed that the Wald test was comparable to or outperformed the MH and SIBTEST procedures. Finally, the strengths and limitations of the proposed method and suggestions for future studies are discussed.
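As a point of reference for the Mantel‐Haenszel baseline mentioned above, the following is a minimal sketch of the MH common odds ratio computed over matched score strata. It is not the Wald procedure proposed in the article; the function name and the tuple layout of each stratum are assumptions made for illustration.

```python
def mantel_haenszel_odds_ratio(strata):
    """Return the Mantel-Haenszel common odds ratio for one item.

    Each stratum is a tuple (a, b, c, d) for one matched score level:
      a = reference group, correct      b = reference group, incorrect
      c = focal group, correct          d = focal group, incorrect
    (This layout is an illustrative assumption, not taken from the article.)
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these strata")
    return numerator / denominator
```

A value near 1 suggests no uniform DIF on the item; values far from 1 suggest the item favors one group after matching on score.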

3.
This article used the Wald test to evaluate the item‐level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G‐DINA model. Results show that when the sample size is small and a larger number of attributes are required, the Type I error rate of the Wald test for the DINA and DINO models can be higher than the nominal significance levels, while the Type I error rate of the A‐CDM is closer to the nominal significance levels. However, with larger sample sizes, the Type I error rates for the three models are closer to the nominal significance levels. In addition, the Wald test has excellent statistical power to detect when the true underlying model is none of the reduced models examined even for relatively small sample sizes. The performance of the Wald test was also examined with real data. With an increasing number of CDMs from which to choose, this article provides an important contribution toward advancing the use of CDMs in practical educational settings.

4.
In automated test assembly (ATA), the methodology of mixed‐integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different cases are discussed: (i) computerized test forms in which the items are presented on a screen one at a time and only their optimal order has to be determined; (ii) paper forms in which the items need to be ordered and paginated and the typical goal is to minimize paper use; and (iii) published test forms with the same requirements but a more sophisticated layout (e.g., double‐column print). For each case, a menu of possible test‐form specifications is identified, and it is shown how they can be modeled as linear constraints using 0–1 decision variables. The methodology is demonstrated using two empirical examples.
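To make the idea of 0–1 item selection concrete, here is a small sketch that enumerates every candidate form of a fixed length and keeps the feasible one with the highest total information. Exhaustive enumeration stands in for a mixed‐integer solver and is only workable for tiny banks; the item fields `area` and `info` and the function name are hypothetical, not taken from the article.

```python
from itertools import combinations

def assemble_form(items, length, min_per_area):
    """Pick `length` items maximizing total information under content minimums.

    Choosing a subset is equivalent to setting a 0-1 decision variable per item;
    brute force replaces the MIP solver here purely for illustration.
    """
    best_value, best_form = float("-inf"), None
    for form in combinations(range(len(items)), length):
        # Count how many selected items fall in each content area.
        counts = {}
        for i in form:
            area = items[i]["area"]
            counts[area] = counts.get(area, 0) + 1
        # Linear content constraints: at least k items per required area.
        if any(counts.get(area, 0) < k for area, k in min_per_area.items()):
            continue
        value = sum(items[i]["info"] for i in form)
        if value > best_value:
            best_value, best_form = value, form
    return best_form, best_value
```

For realistic bank sizes the same objective and constraints would be handed to a dedicated integer programming solver rather than enumerated.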

5.
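Differential item functioning (DIF) detection in cognitive diagnostic models (CDMs) aims to safeguard the quality and effectiveness of a test. Building on previous research, this study examines, within the CDM framework, how six DIF detection methods (MH, LR, CSIBTEST, WObs, WSw, and WXPD) perform depending on whether the Q-matrix is correctly specified and on other factors that influence DIF. The results show that when the Q-matrix is correctly specified, the WObs, WSw, and WXPD statistics outperform the MH, LR, and CSIBTEST methods; when the Q-matrix is misspecified, all six methods show inflated Type I error rates and low statistical power. Comparatively, the MH, LR, and CSIBTEST methods perform more stably and the WObs, WSw, and WXPD statistics vary more, yet the Type I error rates and power of WObs, WSw, and WXPD remain better than those of MH, LR, and CSIBTEST.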

6.
The assessment of differential item functioning (DIF) is routinely conducted to ensure test fairness and validity. Although many DIF assessment methods have been developed in the context of classical test theory and item response theory, they are not applicable for cognitive diagnosis models (CDMs), as the underlying latent attributes of CDMs are multidimensional and binary. This study proposes a very general DIF assessment method in the CDM framework which is applicable for various CDMs, more than two groups of examinees, and multiple grouping variables that are categorical, continuous, observed, or latent. The parameters can be estimated with Markov chain Monte Carlo algorithms implemented in the freeware WinBUGS. Simulation results demonstrated a good parameter recovery and advantages in DIF assessment for the new method over the Wald method.

7.
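Existing algorithms for the brake test bench control problem have high complexity, and their approximate solutions do not approach the optimum closely. To address this, a control method for brake test benches based on an adaptive algorithm is proposed. The method converts the test bench control problem into a current control problem. Exploiting the self-adapting and self-adjusting properties of the adaptive algorithm, it takes the difference between the energy change in the road test and the energy change in the bench test as the adjustment criterion, forming a feedback system that adjusts the current so that the bench test gradually approaches the road test. In Matlab simulations the relative energy error was below 3%, which indicates that the method keeps the relative error small and is an effective way to solve the brake test bench control problem.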

8.
Compared to unidimensional item response models (IRMs), cognitive diagnostic models (CDMs) based on latent classes represent examinees' knowledge and item requirements using discrete structures. This study systematically examines the viability of retrofitting CDMs to IRM‐based data with a linear attribute structure. The study uses a procedure that makes the IRM and CDM frameworks comparable and investigates how estimation accuracy is affected by test diagnosticity and the match between the true and fitted models. The study shows that (1) comparable results can be obtained when highly diagnostic IRM data are retrofitted with a CDM, and vice versa; (2) retrofitting CDMs to IRM‐based data can, in some conditions, result in considerable examinee misclassification; and (3) model fit indices provide limited indication of the accuracy of item parameter estimation and attribute classification.

9.
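A study on the introduction and cultivation of gerbera   Cited by: 1 (self-citations: 1, citations by others: 0)
Seven gerbera cultivars were grown using substrates of different mix ratios and different fertilization methods. The results showed that four cultivars, 热带草原 (literally "Savanna"), 阳光海岸 ("Sunshine Coast"), 太阳风 ("Solar Wind"), and 紫衣 ("Purple Robe"), are suitable for local planting. Gerbera grown in spent mushroom substrate (1/3) : compost soil (1/3) : perlite (1/3) grew more vigorously and had the lowest incidence of boron deficiency. When compound fertilizer was used as the base fertilizer, the plants grew more vigorously than the control and had a lower incidence of boron deficiency than the control. Therefore, when gerbera is grown in traditional compost soil, compound fertilizer can be applied to raise the fertility of the substrate, and perlite can be added to improve its physical structure.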

10.
For a surface mounting machine (SMM) in a printed circuit board (PCB) assembly line, there are four problems: CAD data conversion, nozzle selection, feeder assignment, and placement sequence determination. A hierarchical planning approach that addresses them in order to maximize the throughput rate of an SMM is presented here. To minimize set-up time, a CAD data conversion system was first applied that could automatically generate the data for machine placement from CAD design data files. Then an effective nozzle selection approach was implemented to minimize nozzle-changing time. Next, to minimize picking time, a feeder assignment algorithm was used so that as many components as possible are picked simultaneously. Finally, to shorten pick-and-place time, a heuristic algorithm was used to determine an optimal component placement sequence according to the decided feeder positions. Experiments were conducted on a four-head SMM. The experimental results were used to analyse the assembly line performance.

11.
In cognitive diagnostic models (CDMs), a set of fine-grained attributes is required to characterize complex problem solving and provide detailed diagnostic information about an examinee. However, it is challenging to ensure reliable estimation and control computational complexity when the test aims to identify the examinee's attribute profile in a large-scale map of attributes. To address this problem, this study proposes cognitive diagnostic multistage testing by partitioning hierarchically structured attributes (CD-MST-PH) as a multistage testing approach for CDMs. In CD-MST-PH, multiple testlets can be constructed based on separate attribute groups before testing occurs, which retains the advantages of multistage testing over fully adaptive testing or the on-the-fly approach. Moreover, testlets are offered sequentially and adaptively, thus improving test accuracy and efficiency. An item information measure is proposed to compute the discrimination power of an item for each attribute, and a module assembly method is presented to construct modules anchored at each separate attribute group. Several module selection indices for CD-MST-PH are also proposed by modifying the item selection indices used in cognitive diagnostic computerized adaptive testing. The results of a simulation study show that CD-MST-PH can improve test accuracy and efficiency relative to a conventional test without adaptive stages.

12.
Cognitive diagnosis models (CDMs) continue to generate interest among researchers and practitioners because they can provide diagnostic information relevant to classroom instruction and student learning. However, the modeling component of cognitive diagnosis has outpaced its complementary component, test construction. Thus, most applications of cognitive diagnosis modeling involve retrofitting of CDMs to assessments constructed using classical test theory (CTT) or item response theory (IRT). This study explores the relationship between item statistics used in the CTT, IRT, and CDM frameworks using such an assessment, specifically a large-scale mathematics assessment. Furthermore, by highlighting differences between tests with varying levels of diagnosticity using a measure of item discrimination from a CDM approach, this study empirically uncovers some important CTT and IRT item characteristics. These results can be used to formulate practical guidelines in using IRT- or CTT-constructed assessments for cognitive diagnosis purposes.

13.
This paper describes a procedure for automated test forms assembly based on Classical Test Theory (CTT). The procedure uses stratified random content sampling and test form pre-equating to ensure both content and psychometric equivalence in generating virtually unlimited parallel forms. The procedure extends the usefulness of CTT in automated test form construction, yielding classical item statistics based on representative sample distributions and pre-equated test forms with known psychometric characteristics. A rationale for the procedure is presented followed by an example application and discussion of psychometric considerations related to its use.
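The stratified random content sampling step described above might look roughly like the sketch below, which draws a fixed number of items at random from each content area of a bank. It covers only the sampling step, not pre-equating; the `area` field, the blueprint dictionary, and the function name are assumptions for illustration.

```python
import random

def sample_form(bank, blueprint, seed=None):
    """Draw one test form by sampling items at random within each content area.

    bank:      list of dicts, each with an "area" key (hypothetical layout)
    blueprint: dict mapping content area -> number of items required
    """
    rng = random.Random(seed)  # a local generator keeps draws reproducible
    form = []
    for area, n_items in blueprint.items():
        pool = [item for item in bank if item["area"] == area]
        # random.sample draws without replacement and raises ValueError
        # if the pool holds fewer than n_items items.
        form.extend(rng.sample(pool, n_items))
    return form
```

Calling the function repeatedly with different seeds yields different forms that all satisfy the same content blueprint.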

14.
Applied Measurement in Education (《教育实用测度》), 2013, 26(3): 203-205
Many credentialing agencies today are either administering their examinations by computer or are likely to be doing so in the coming years. Unfortunately, although several promising computer-based test designs are available, little is known about how well they function in examination settings. The goal of this study was to compare fixed-length examinations (both operational forms and newly constructed forms) with several variations of multistage test designs for making pass-fail decisions. Results were produced for 3 passing scores. Four operational 60-item examinations were compared to (a) 3 new 60-item forms, (b) 60-item 3-stage tests, and (c) 40-item 2-stage tests; all were constructed using automated test assembly software. The study was carried out using computer simulation techniques that were set to mimic common examination practices. All 60-item tests, regardless of design or passing score, produced accurate ability estimates and acceptable and similar levels of decision consistency and decision accuracy. One interesting finding was that the 40-item test results were poorer than the 60-item test results, as expected, but were in the range of acceptability. This raises the practical policy question of whether content-valid 40-item tests with lower item exposure levels and/or savings in item development costs are an acceptable trade-off for a small loss in decision accuracy and consistency.

15.
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. The use of automated scoring is increasing in educational assessment programs because it allows scores to be returned faster at lower cost. In the module, they discuss automated scoring from a number of perspectives. First, they discuss benefits and weaknesses of automated scoring, and what psychometricians should know about automated scoring. Next, they describe the overall process of automated scoring, moving from data collection to engine training to operational scoring. Then, they describe how automated scoring systems work, including the basic functions around score prediction as well as other flagging methods. Finally, they conclude with a discussion of the specific validity demands around automated scoring and how they align with the larger validity demands around test scores. Two data activities are provided. The first is an interactive activity that allows the user to train and evaluate a simple automated scoring engine. The second is a worked example that examines the impact of rater error on test scores. The digital module contains a link to an interactive web application as well as its R-Shiny code, diagnostic quiz questions, activities, curated resources, and a glossary.

16.
A Note on the Invariance of the DINA Model Parameters   Cited by: 1 (self-citations: 0, citations by others: 1)
Cognitive diagnosis models (CDMs), as alternative approaches to unidimensional item response models, have received increasing attention in recent years. CDMs are developed for the purpose of identifying the mastery or nonmastery of multiple fine-grained attributes or skills required for solving problems in a domain. For CDMs to receive wider use, researchers and practitioners need to understand the basic properties of these models. The article focuses on one CDM, the deterministic inputs, noisy "and" gate (DINA) model, and the invariance property of its parameters. Using simulated data involving different attribute distributions, the article demonstrates that the DINA model parameters are absolutely invariant when the model perfectly fits the data. An additional example involving different ability groups illustrates how noise in real data can contribute to the lack of invariance in these parameters. Some practical implications of these findings are discussed.
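For readers unfamiliar with the DINA model named above, the sketch below computes its item response probability in the standard formulation: an examinee who has mastered every attribute the item requires answers correctly with probability 1 minus the slip parameter, and anyone else with probability equal to the guessing parameter. The function and argument names are chosen for illustration.

```python
def dina_probability(alpha, q_row, slip, guess):
    """Probability of a correct response to one item under the DINA model.

    alpha: 0/1 attribute mastery profile of the examinee
    q_row: 0/1 row of the Q-matrix giving the attributes the item requires
    slip:  probability that a fully qualified examinee answers incorrectly
    guess: probability that an unqualified examinee answers correctly
    """
    # The "and" gate: eta is True only if every required attribute is mastered.
    eta = all(a == 1 for a, required in zip(alpha, q_row) if required == 1)
    return 1.0 - slip if eta else guess
```

Because the probability depends on the examinee only through the and-gate outcome, the slip and guessing parameters are properties of the item and not of the attribute distribution, which is the intuition behind the invariance result the abstract reports.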

17.
Cognitive diagnosis models (CDMs) have been developed to evaluate the mastery status of individuals with respect to a set of defined attributes or skills that are measured through testing. When individuals are repeatedly administered a cognitive diagnosis test, a new class of multilevel CDMs is required to assess the changes in their attributes and simultaneously estimate the model parameters from the different measurements. In this study, the most general CDM of the generalized deterministic input, noisy “and” gate (G‐DINA) model was extended to a multilevel higher order CDM by embedding a multilevel structure into higher order latent traits. A series of simulations based on diverse factors was conducted to assess the quality of the parameter estimation. The results demonstrate that the model parameters can be recovered fairly well and attribute mastery can be precisely estimated if the sample size is large and the test is sufficiently long. The range of the location parameters had opposing effects on the recovery of the item and person parameters. Ignoring the multilevel structure in the data by fitting a single‐level G‐DINA model decreased the attribute classification accuracy and the precision of latent trait estimation. The number of measurement occasions had a substantial impact on latent trait estimation. Satisfactory model and person parameter recoveries could be achieved even when assumptions of the measurement invariance of the model parameters over time were violated. A longitudinal basic ability assessment is outlined to demonstrate the application of the new models.

18.
Both classical test theory and generalizability theory focus on measurement error as a group property. Thus, common estimates of errors of measurement are developed for all members of a group. But with behavioral data, unlike specifically test data, it is sometimes possible to estimate error separately for each individual. This enables one to ask questions about the relationships between error of measurement and other characteristics of the individual. Consequently, it also makes possible the use of regression techniques to call upon group data to improve the estimates of individual measurement error. The focus on the individual also lays bare the possibility of sequencing effects, and it is shown that, even in the absence of trend, autocorrelation can cause standard procedures to grossly underestimate the magnitude of measurement error. Classroom observation data are examined for autocorrelation, and recommendations are made about the scheduling of data collection so as to minimize its effects.
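One simple way to check a sequence of observations for the autocorrelation discussed above is the lag-1 sample autocorrelation. The sketch below uses the common estimator that divides the lag-1 sum of cross-products by the total sum of squares; it is offered as an illustration and is not necessarily the procedure used in the article.

```python
def lag1_autocorrelation(values):
    """Lag-1 sample autocorrelation of a sequence of observations."""
    n = len(values)
    mean = sum(values) / n
    # Total sum of squared deviations from the mean.
    total = sum((v - mean) ** 2 for v in values)
    if total == 0:
        raise ValueError("autocorrelation is undefined for a constant sequence")
    # Sum of products of each deviation with the next one in time order.
    lagged = sum((values[t] - mean) * (values[t + 1] - mean) for t in range(n - 1))
    return lagged / total
```

A clearly positive value means consecutive observations resemble each other, so treating them as independent overstates how much information the sequence contains.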

19.
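For the disassembly sequence optimization problem of key systems in complex mechanical assemblies, a disassembly sequence planning model is established and an improved dual-population genetic algorithm is proposed. Disassembly sequence information is represented with a disassembly hybrid graph. Within the improved genetic algorithm, a precedence constraint matrix is used to generate a population of TOP sequences, and the disassembly sequence is optimized with minimum disassembly time as the objective. Optimizing the remanufacturing disassembly sequence of a loader gearbox produced by a manufacturer further verified the effectiveness and feasibility of the algorithm.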
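The population-generation step can be pictured with the sketch below, which builds one random disassembly order that respects a precedence constraint matrix. It assumes that "TOP sequences" means topologically feasible orderings and that `precedence[i][j] == 1` means part i must be removed before part j; both are assumptions, and the genetic algorithm itself is not shown.

```python
import random

def random_feasible_sequence(precedence, seed=None):
    """Build one random removal order consistent with a precedence matrix.

    precedence[i][j] == 1 is taken to mean "part i must come off before part j"
    (an assumed convention, not confirmed by the abstract).
    """
    rng = random.Random(seed)
    remaining = set(range(len(precedence)))
    sequence = []
    while remaining:
        # A part is ready once no still-attached part must precede it.
        ready = [j for j in sorted(remaining)
                 if not any(precedence[i][j] for i in remaining if i != j)]
        if not ready:
            raise ValueError("precedence constraints contain a cycle")
        part = rng.choice(ready)
        sequence.append(part)
        remaining.remove(part)
    return sequence
```

Repeated calls with different seeds would give an initial population in which every individual is already feasible, leaving the genetic operators to improve disassembly time.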

20.
The present study compared the performance of six cognitive diagnostic models (CDMs) to explore inter-skill relationships in a reading comprehension test. To this end, the item responses of 21,642 test-takers to a high-stakes reading comprehension test were analyzed. The models were compared in terms of model fit at both test and item levels, classification consistency and accuracy, and proportion of skill mastery profiles. The results showed that the G-DINA performed the best and that the C-RUM, NC-RUM, and ACDM showed the closest affinity to the G-DINA. In terms of some criteria, the DINA showed comparable performance to the G-DINA. The test-level results were corroborated by the item-level model comparison, where DINA, DINO, and ACDM variously fit some of the items. The results of the study suggested that relationships among the subskills of reading comprehension might be a combination of compensatory and non-compensatory. Therefore, it is suggested that the choice of the CDM be carried out at item level rather than test level.

Copyright©北京勤云科技发展有限公司  京ICP备09084417号