期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Bette Chambers Philip C. Abrami Katherine McWhaw Michel Charles Therrien 《Educational Research and Evaluation》2013,19(2-3):223-239

This paper describes the development and formative evaluation of a computer-assisted tutoring program to help students who experience problems learning to read. Based on a combination of social constructivist and behavioral theories, this program addresses the cost and quality issues associated with delivering high quality, cost-effective tutoring. Initial feedback on the beta version of the program provides support for developing a program designed with “just in time” support for tutors and interactive activities for tutees based on material that students are struggling with in their reading class. 相似文献

2.

The Impact of Vertical Scaling Decisions on Growth Interpretations

Derek C. Briggs Jonathan P. Weeks 《Educational Measurement》2009,28(4):3-14

Most growth models implicitly assume that test scores have been vertically scaled. What may not be widely appreciated are the different choices that must be made when creating a vertical score scale. In this paper empirical patterns of growth in student achievement are compared as a function of different approaches to creating a vertical scale. Longitudinal item‐level data from a standardized reading test are analyzed for two cohorts of students between Grades 3 and 6 and Grades 4 and 7 for the entire state of Colorado from 2003 to 2006. Eight different vertical scales were established on the basis of choices made for three key variables: Item Response Theory modeling approach, linking approach, and ability estimation approach. It is shown that interpretations of empirical growth patterns appear to depend upon the extent to which a vertical scale has been effectively “stretched” or “compressed” by the psychometric decisions made to establish it. While all of the vertical scales considered show patterns of decelerating growth across grade levels, there is little evidence of scale shrinkage. 相似文献

3.

L. Katherine Prenovost Kentaro Hayashi 《Structural equation modeling》2013,20(2):212-217

Statistical Models for Ordinal Variables. C. C. Clogg and E. S. Shihadeh. Thousand Oaks, CA: Sage, 1994, 192 pages.

Graphical Multivariate Analysis with AMOS, EQS and LISREL: A Visual Approach to Covariance Structure Analysis (in Japanese). Yutaka Kano. Kyoto, Japan: Gendai‐Sugakusha, 1997,235 pages. 相似文献

4.

Measuring ocean literacy of high school students: psychometric properties of a Chinese version of the ocean literacy scale

Liang-Ting Tsai 《Environmental Education Research》2019,25(2):264-279

This study established a Chinese scale for measuring high school students’ ocean literacy. This included testing its reliability, validity, and differential item functioning (DIF) with the aim of compensating for the lack of DIF tests focusing on current scales. The construct validity and reliability were verified and tested by analyzing the established scale’s items using the Rasch model, and a gender DIF test was conducted to ensure the test results’ fairness when distinct groups were compared simultaneously. The results indicated that the scale established in this study is unidimensional and possesses favorable internal consistency and construct validity. The gender DIF test results indicated that several items were difficult for either female or male students to correctly answer; however, the experts and scholars discussed these items individually and suggested retaining them. The final Chinese version of the ocean literacy scale developed here comprises 48 items that can reflect high school students’ understanding of ocean literacy—which helps students understand the topics of marine science encountered in real life. 相似文献

5.

Preliminary development of the Brief–California School Climate Survey: dimensionality and measurement invariance across teachers and administrators

Sukkyung You Meagan D. O'Malley Michael J. Furlong 《School Effectiveness & School Improvement》2013,24(1):153-173

A brief 15-item version of the California School Climate Scale (Brief-CSCS) is presented to fill a need for a measure that could be used for periodic monitoring of school personnel's general perception of the climate of their school campus. From a sample of 81,261 California school personnel, random subsamples of 2,400 teachers and 2,400 administrators were used in the analyses. Confirmatory factor analyses supported a model in which general school climate was a second-order latent factor composed of 2 first-order latent traits, organizational supports and relational supports. Measurement invariance of factor loadings for teachers and administrators was found. Additional analyses revealed that administrators held more positive perceptions of school climate than teachers, with this difference increasing from primary through high school. The implications for these findings for educational research and policy reform are outlined. 相似文献

6.

构建我国少儿英语远程计算机自适应测验题库的设想

王蕾黄晓婷《考试研究》2006,(3)

本研究利用建构图设计一套含有六大部分的30道试题。题型包括拼写题、选择题和简答题。共有175名6到14岁儿童参加了此项考试。Rasch分析结果发现题组内局部题目依赖并不严重。信度为0．85。考题的难度和考生能力的配合度相当良好。我们根据建构图来编写考题,因此有一定程度的内容效度。但有9道题的难度稍微与原先预期略有出入。有5道题不大吻合Rasch模式的预期,没有发现在性别上有明显的项目功能差异。考生能力与学习英语的时间有正相关。最后探讨了基于信息通讯技术的远程计算机自适应测验的技术问题。相似文献

7.

An Introduction to the Computerized Adaptive Testing

TIAN Jian-quan MIAO Dan-min ZHU Xia GONG Jing-jing 《美中教育评论》2007,4(1):72-81

The computerized adaptive testing （CAT） has unsurpassable advantages over the traditional testing. It has become the mainstream in large scale examination in modem society. This paper gives a brief introduction to CAT including differences between traditional testing and CAT, the principals of CAT works, Psychometric theory and computer algorithms of CAT, the advantages and cautions of CAT. In the end, the development of CAT in China is reviewed. 相似文献

8.

Estimating High School GPA Weighting Parameters With a Graded Response Model

John Hansen Philip Sadler Gerhard Sonnert 《Educational Measurement》2019,38(1):16-24

The high school grade point average (GPA) is often adjusted to account for nominal indicators of course rigor, such as “honors” or “advanced placement.” Adjusted GPAs—also known as weighted GPAs—are frequently used for computing students’ rank in class and in the college admission process. Despite the high stakes attached to GPA, weighting policies vary considerably across states and high schools. Previous methods of estimating weighting parameters have used regression models with college course performance as the dependent variable. We discuss and demonstrate the suitability of the graded response model for estimating GPA weighting parameters and evaluating traditional weighting schemes. In our sample, which was limited to self‐reported performance in high school mathematics courses, we found that commonly used policies award more than twice the bonus points necessary to create parity for standard and advanced courses. 相似文献

9.

《Journal of School Choice》2013,7(2):83-100

ABSTRACT

Stakeholder surveys conducted as part of the development of an accountability and assessment system for five charter schools in Miami-Dade County and Broward County, Florida, revealed high positive response regarding high expectations, school climate, basic skills instruction, and monitoring student progress. The lowest overall rating revealed dissatisfaction with charter school resources. Five researchers distributed questionnaires to stakeholders, defined as parents, pupils, teachers, administrators, special program teachers, and auxiliary personnel. Survey results were generally positive in assessing the schools, programs, teachers, administrators, and relationships between the various stakeholder groups. This study provided the quantitative data needed to form the framework for the development and implementation of an accountability system. 相似文献

10.

多维项目反应理论在数学素养测验中的应用

林子植胡典顺《中国考试》2021,(5):72-80

学生的数学素养具有多维结构,素养导向的数学学业成就测评需要提供被试在各维度上的表现信息,而不仅是一个单一的总分。以PISA数学素养结构为理论模型,以多维项目反应理论(MIRT)为测量模型,利用R语言的MIRT程序包处理和分析某地区8年级数学素养测评题目数据,研究数学素养的多维测量方法。结果表明:MIRT兼具单维项目反应理论和因子分析的优点,利用其可对测试的结构效度和测试题目质量进行分析,以及对被试进行多维能力认知诊断。相似文献

11.

Allison J. Ames Aaron J. Myers 《Educational and psychological measurement》2021,81(4):756

Contamination of responses due to extreme and midpoint response style can confound the interpretation of scores, threatening the validity of inferences made from survey responses. This study incorporated person-level covariates in the multidimensional item response tree model to explain heterogeneity in response style. We include an empirical example and two simulation studies to support the use and interpretation of the model: parameter recovery using Markov chain Monte Carlo (MCMC) estimation and performance of the model under conditions with and without response styles present. Item intercepts mean bias and root mean square error were small at all sample sizes. Item discrimination mean bias and root mean square error were also small but tended to be smaller when covariates were unrelated to, or had a weak relationship with, the latent traits. Item and regression parameters are estimated with sufficient accuracy when sample sizes are greater than approximately 1,000 and MCMC estimation with the Gibbs sampler is used. The empirical example uses the National Longitudinal Study of Adolescent to Adult Health’s sexual knowledge scale. Meaningful predictors associated with high levels of extreme response latent trait included being non-White, being male, and having high levels of parental support and relationships. Meaningful predictors associated with high levels of the midpoint response latent trait included having low levels of parental support and relationships. Item-level covariates indicate the response style pseudo-items were less easy to endorse for self-oriented items, whereas the trait of interest pseudo-items were easier to endorse for self-oriented items. 相似文献

12.

国际教育成效评价协会儿童认知发展状况测验项目功能差异分析 总被引：3，自引：0，他引：3

王蕾黄晓婷《考试研究》2006,(4)

本研究旨在从一维和多维的角度检测国际教育成效评价协会(IEA)儿童认知发展状况测验中中译英考题的项目功能差异(DIF)。我们分析的数据由871名中国儿童和557名美国儿童的测试数据组成。结果显示,有一半以上的题目存在实质的DIF,意味着这个测验对于中美儿童而言,并没有功能等值。使用者应谨慎使用该跨语言翻译的比较测试结果来比较中美两国考生的认知能力水平。所幸约有半数的DIF题目偏向中国,半数偏向美国,因此利用测验总分所建立的量尺,应该不至于有太大的偏误。此外,题目拟合度统计量并不能足够地检测到存在DIF的题目,还是应该进行特定的DIF分析。我们探讨了三种可能导致DIF的原因,尚需更多学科专业知识和实验来真正解释DIF的形成。相似文献

13.

关于心理测验理论模式的比较 总被引：1，自引：0，他引：1

赫云鹏王俊秀《内蒙古师范大学学报(哲学社会科学版)》1997,(4)

真分数理论和项目反应理论是心理测验的两大理论模式。真分数理论主要是估计真分数和实得分数之间关系的;项目反应理论是将被试对单个测验项目的某种反应概率与此项目的一定特征联系起来,项目反应理论可以说是在真分数理论基础上的一种发展,但绝不是真分数理论,两者所建立的理论的基本假设不同,并各有其优势与不足。今天的心理测验就是在这两大理论共存的情况下,互相促进、互相补充,并在此基础之上向更合理、更完善的方向发展相似文献

14.

高中英语词汇自适应学习系统的研制

陆宏赵艳平《现代教育技术》2014,24(11):47-52

伴随着个性化学习需求的增长,自适应学习系统的研制已经成为教育领域的研究热点。该文首先通过问卷调查了解了高中英语词汇自适应学习系统的需求;然后在文献综述的基础上,提出了本系统开发的重点内容和理论基础;接着介绍了系统功能模块的具体实现;最后,阐述了系统在教学实践中的适用对象和时间。相似文献

15.

Minimizing the Influence of Item Parameter Estimation Errors in Test Development: A Comparison of Three Selection Procedures

Mark J. Gierl Dianne Henderson Michael Jodoin Don Klinger 《Journal of Experimental Education》2013,81(3):261-279

In test development, item response theory (IRT) is a method to determine the amount of information that each item (i.e., item information function) and combination of items (i.e., test information function) provide in the estimation of an examinee's ability. Studies investigating the effects of item parameter estimation errors over a range of ability have demonstrated an overestimation of information when the most discriminating items are selected (i.e., item selection based on maximum information). In the present study, the authors examined the influence of item parameter estimation errors across 3 item selection methods—maximum no target, maximum target, and theta maximum—using the 2- and 3-parameter logistic IRT models. Tests created with the maximum no target and maximum target item selection procedures consistently overestimated the test information function. Conversely, tests created using the theta maximum item selection procedure yielded more consistent estimates of the test information function and, at times, underestimated the test information function. Implications for test development are discussed. 相似文献

16.

项目反应理论在中考命题质量评价中的应用

赵娟场建芹《大连教育学院学报》2014,(1):17-19

应用项目反应理论对中考命题质量进行分析,可以排除抽样干扰,准确评估试题的难度,客观精细地描述试题的区分度,评估整套试卷和各试题对学生能力估计的精度,查找赋分标准和阅卷过程中存在的问题。相似文献

17.

Pere J. Ferrando David Navarro-Gonzlez 《Educational and psychological measurement》2021,81(6):1029

Item response theory “dual” models (DMs) in which both items and individuals are viewed as sources of differential measurement error so far have been proposed only for unidimensional measures. This article proposes two multidimensional extensions of existing DMs: the M-DTCRM (dual Thurstonian continuous response model), intended for (approximately) continuous responses, and the M-DTGRM (dual Thurstonian graded response model), intended for ordered-categorical responses (including binary). A rationale for the extension to the multiple-content-dimensions case, which is based on the concept of the multidimensional location index, is first proposed and discussed. Then, the models are described using both the factor-analytic and the item response theory parameterizations. Procedures for (a) calibrating the items, (b) scoring individuals, (c) assessing model appropriateness, and (d) assessing measurement precision are finally discussed. The simulation results suggest that the proposal is quite feasible, and an illustrative example based on personality data is also provided. The proposals are submitted to be of particular interest for the case of multidimensional questionnaires in which the number of items per scale would not be enough for arriving at stable estimates if the existing unidimensional DMs were fitted on a separate-scale basis. 相似文献

18.

基于项目反应理论的测验编制方法研究 总被引：3，自引：0，他引：3

戴海琦《考试研究》2006,(4)

本文在简单介绍项目反应理论的基础上,从计量分析的角度,深入探讨了应用项目反应理论编制各种测验的一般步骤;探讨了项目反应理论题库建设方法及基于题库的测验编制方法;探讨了标准参照测验合格分数线的划分方法。相似文献

19.

An NCME Instructional Module on Booklet Designs in Large-Scale Assessments of Student Achievement: Theory and Practice 总被引：2，自引：0，他引：2

Andreas Frey Johannes Hartig André A. Rupp 《Educational Measurement》2009,28(3):39-53

In most large-scale assessments of student achievement, several broad content domains are tested. Because more items are needed to cover the content domains than can be presented in the limited testing time to each individual student, multiple test forms or booklets are utilized to distribute the items to the students. The construction of an appropriate booklet design is a complex and challenging endeavor that has far-reaching implications for data calibration and score reporting. This module describes the construction of booklet designs as the task of allocating items to booklets under context-specific constraints. Several types of experimental designs are presented that can be used as booklet designs. The theoretical properties and construction principles for each type of design are discussed and illustrated with examples. Finally, the evaluation of booklet designs is described and future directions for researching, teaching, and reporting on booklet designs for large-scale assessments of student achievement are identified. 相似文献

20.

Lun Mo Fang Yang Xiangen Hu 《Educational Research and Evaluation》2013,19(1):33-45

School climate surveys are widely applied in school districts across the nation to collect information about teacher efficacy, principal leadership, school safety, students' activities, and so forth. They enable school administrators to understand and address many issues on campus when used in conjunction with other student and staff data. However, these days each district develops the questionnaire according to its own needs and rarely provides supporting evidence for the reliability of items in the scale, that is, whether an individual item contributes significant information to the questionnaire. The Item Response Theory (IRT) is a useful tool that helps examine how much information each item and the whole scale can provide. Our study applied IRT to examine individual items in a school climate survey and assessed the efficiency of the survey after the removal of items that contributed little to the scale. The purpose of this study is to show how IRT can be applied to empirically validate school climate surveys. 相似文献