1.

Bias and Bias Correction Method for Nonproportional Abilities Requirement (NPAR) Tests

Edward H. Ip Tyler Strachan Yanyan Fu Alexandra Lay John T. Willse Shyh‐Huei Chen Leslie Rutkowski Terry Ackerman 《Journal of Educational Measurement》2019,56(1):147-168

Test items must often be broad in scope to be ecologically valid. It is therefore almost inevitable that secondary dimensions are introduced into a test during test development. A cognitive test may require one or more abilities besides the primary ability to correctly respond to an item, in which case a unidimensional test score overestimates the primary ability and creates interpretability problems. In this article, we demonstrate the nonproportional abilities requirement, a phenomenon in which secondary abilities are required more for difficult items. A novel and practical method for correcting bias in the primary ability is proposed and illustrated using a real data set from an international assessment. Simulation data are also used to evaluate the performance of the method.

2.

Retest effects in matrix test performance: Differential impact of predictors at different hierarchy levels in an educational setting

Philipp Alexander Freund Heinz Holling 《Learning and individual differences》2011,21(5):597-601

If tests of cognitive ability are repeatedly taken, test scores rise. Such retest effects have been observed for a long time and for a variety of tasks. This study investigates retest effects on figural matrix items in an educational context. A short term effect is assumed for the direct retest administration in the same test session, and a long term effect is assumed for a retest interval of six months. Using multilevel modeling, we analyze whether the magnitude of these effects is influenced not only by individual variation but also by the cluster structure of students grouped within classrooms. We also investigate whether the use of identical versus parallel tests affects the size of the retest effects. Our main results show a negligible short term retest effect, but a large long term retest effect. Using parallel tests does not contribute to understanding individual differences in retest effects. The variation in retest effects is larger between classrooms than between students. Reasoning ability, as measured with a different test, and school grades significantly influence retest effects at the individual level, but at the classroom level, only reasoning ability is a significant predictor.

3.

Evaluating the Consistency of Test Content Across Two Successive Administrations of a State-Mandated Science Assessment

Timothy O'Neil Stephen G. Sireci Kristen L. Huff 《Educational Assessment》2013,18(3-4):129-151

Educational tests used for accountability purposes must represent the content domains they purport to measure. When such tests are used to monitor progress over time, the consistency of the test content across years is important for ensuring that observed changes in test scores are due to student achievement rather than to changes in what the test is measuring. In this study, expert science teachers evaluated the content and cognitive characteristics of the items from 2 consecutive annual administrations of a 10th-grade science assessment. The results indicated the content area representation was fairly consistent across years and the proportion of items measuring the different cognitive skill areas was also consistent. However, the experts identified important cognitive distinctions among the test items that were not captured in the test specifications. The implications of this research for the design of science assessments and for appraising the content validity of state-mandated assessments are discussed.

4.

No g in education?

Martin Brunner 《Learning and individual differences》2008,18(2):152-165

This study investigates the relationships of domain-general cognitive abilities and domain-specific verbal and mathematical abilities to students' educational characteristics when two theoretically grounded, but competing structural models are applied. In the standard model, a single latent ability causes interindividual differences in the corresponding measures. In the nested-factor model, interindividual differences are caused by two independent cognitive abilities: general cognitive ability and domain-specific ability. The two models were examined using data from 29,386 ninth graders. The results show that findings on the relations between domain-specific abilities and students' socio-economic status, general school satisfaction, educational aspirations, domain-specific interests, and subject-specific grades may differ substantially depending on the structural model applied. Implications for educational research and measurement as well as for students' motivational and cognitive development are discussed.

5.

An Empirical Investigation Demonstrating the Multidimensional DIF Paradigm: A Cognitive Explanation for DIF

Cindy M. Walker S. Natasha Beretvas 《Journal of Educational Measurement》2001,38(2):147-163

Differential Item Functioning (DIF) is traditionally used to identify different item performance patterns between intact groups, most commonly involving race or sex comparisons. This study advocates expanding the utility of DIF as a step in construct validation. Rather than grouping examinees based on cultural differences, the reference and focal groups are chosen from two extremes along a distinct cognitive dimension that is hypothesized to supplement the dominant latent trait being measured. Specifically, this study investigates DIF between proficient and non-proficient fourth- and seventh-grade writers on open-ended mathematics test items that require students to communicate about mathematics. It is suggested that the occurrence of DIF in this situation actually enhances, rather than detracts from, the construct validity of the test because, according to the National Council of Teachers of Mathematics (NCTM), mathematical communication is an important component of mathematical ability, the dominant construct being assessed. However, the presence of DIF influences the validity of inferences that can be made from test scores and suggests that two scores should be reported, one for general mathematical ability and one for mathematical communication. The fact that currently only one test score is reported, a simple composite of scores on multiple-choice and open-ended items, may lead to incorrect decisions being made about examinees.

6.

Item Discrimination: When More Is Worse

Geofferey N. Masters 《Journal of Educational Measurement》1988,25(1):15-29

High item discrimination can be a symptom of a special kind of measurement disturbance introduced by an item that gives persons of high ability a special advantage over and above their higher abilities. This type of disturbance, which can be interpreted as a form of item "bias," can be encouraged by methods that routinely interpret highly discriminating items as the "best" items on a test and may be compounded by procedures that weight items by their discrimination. The type of measurement disturbance described and illustrated in this paper occurs when an item is sensitive to individual differences on a second, undesired dimension that is positively correlated with the variable intended to be measured. Possible secondary influences of this type include opportunity to learn, opportunity to answer, and test wiseness.

7.

Neuropsychological Assessment of First‐Year Architecture Students’ Visuospatial Abilities: Overview

Aktan Acar A. Şebnem Soysal Acar 《The International Journal of Art & Design Education》2020,39(1):211-226

First‐year architecture students are expected to utilise visuospatial abilities to generate/construct, retain, rotate and manipulate space mentally and physically through physical and digital representations. This study of 57 female and 23 male participants was conducted to investigate first‐year architecture students’ visuospatial abilities by means of the Beck Depression Inventory, Logical Reasoning Test and Judgment of Line Orientation (JLO) test. Participants’ sexes, cognitive development level, depression scale scores, university entrance exam results, vision disorders, physical competences, art training prior to university and error types were the study’s main parameters. The results showed that academic scores of the participants both to enrol in the program and complete the first‐year studio did not correlate with their JLO scores. Nondepressed participants performed better in JLO. Error analyses demonstrated that there is a concentration on certain items according to the test stimulus line positions, especially in females. Those who reported limited physical and visual competency made more mistakes in the same items. The study concludes that sex, depression, and individual differences in physical and visual competency, and art training, are significant variables for visuospatial performance. Judging visuospatial parameters through spatial design exercises is different from having proper methods and instruments to assess the achievements of the students regarding those abilities in architectural design education. It is important to map students’ visuospatial abilities individually from a developmental perspective. There is a strong need to develop a 4D psychometric instrument to assess visuospatial abilities.

8.

Improving Construct Validity With Cognitive Psychology Principles

Susan Embretson Joanna Gorin 《Journal of Educational Measurement》2001,38(4):343-368

Cognitive psychology principles have been heralded as possibly central to construct validity. In this paper, testing practices are examined in three stages: (a) the past, in which the traditional testing research paradigm left little role for cognitive psychology principles, (b) the present, in which testing research is enhanced by cognitive psychology principles, and (c) the future, for which we predict that cognitive psychology's potential will be fully realized through item design. An extended example of item design by cognitive theory is given to illustrate the principles. A spatial ability test that consists of an object assembly task highlights how cognitive design principles can lead to item generation.

9.

Spatial ability and achievement in introductory physics

George J. Pallrand Fred Seeber 《Journal of Research in Science Teaching》1984,21(5):507-516

This research was undertaken to clarify the nature of the relationship between visual-spatial abilities and achievement in science courses. A related purpose was to determine what influence visual-spatial abilities have on the high attrition rate characteristic of many introductory college-level science courses. Three sections of introductory college-level physics (S = 136) and one nonscience liberal arts section (S = 52) received pre- and postmeasures of visual-spatial ability in the areas of perception, orientation, and visualization. Increases in visual-spatial abilities were greatest with an experimental section that received a spatial intervention. These gains were related to test items that utilized graphical form and to laboratory work. Substantial gains in visual-spatial ability were also registered by a placebo and by control sections. These increases suggest that taking introductory physics improves visual-spatial abilities. Although students who withdrew from the course demonstrated mathematics skills comparable to those of students who completed the course, their scores on perception tests were appreciably lower. Visual-spatial scores of the liberal arts group were lower than those of the physics sections, suggesting that visual-spatial ability influences course selection.

10.

A Comparison of Item Selection Routines in Linear and Adaptive Tests

Deborah L. Schnipke Bert F. Green 《Journal of Educational Measurement》1995,32(3):227-242

Two item selection algorithms were compared in simulated linear and adaptive tests of cognitive ability. One algorithm selected items that maximally differentiated between examinees. The other used item response theory (IRT) to select items having maximum information for each examinee. Normally distributed populations of 1,000 cases were simulated, using test lengths of 4, 5, 6, and 7 items. Overall, adaptive tests based on maximum information provided the most information over the widest range of ability values and, in general, differentiated among examinees slightly better than the other tests. Although the maximum differentiation technique may be adequate in some circumstances, adaptive tests based on maximum information are clearly superior.
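As a rough illustration of the maximum-information idea mentioned in this abstract, the sketch below picks the unused item with the largest Fisher information at a given ability estimate. It assumes a two-parameter logistic (2PL) model, which the abstract does not specify; the function names and the item pool are hypothetical and are not taken from the article.

```python
import math

def prob_correct(theta, a, b):
    """2PL probability of a correct response (assumed model, not from the article)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_max_information(theta, items, used):
    """Return the index of the unused item with the greatest information at theta."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))

# Hypothetical item pool: (discrimination a, difficulty b)
pool = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0), (1.2, 2.0)]

# Pick two items in turn for an examinee whose current ability estimate is 0.0
first = select_max_information(0.0, pool, used=set())
second = select_max_information(0.0, pool, used={first})
```

In an adaptive test the ability estimate would be updated after each response before the next item is selected; that step is omitted here to keep the sketch short.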

11.

Psychometric Aspects of Maintaining Standards of Examinations

C. A. W. Glas 《Educational Psychology》1988,8(4):257-270

Through pilot studies and regular examination procedures, the National Institute for Educational Measurement (CITO) in The Netherlands has gathered experience with different methods of maintaining the standards of examinations. The present paper presents an overview of the psychometric aspects of the various approaches that can be chosen for the maintenance of standards. Generally speaking, the approaches to the problem can be divided into two classes. In the first approach the examinations are a fixed factor, i.e. the examination is already constructed and cannot be changed, and the link between the standards of both examinations is created by some test equating design. In the second approach the items of both examinations are selected from a pre‐tested pool of items, in such a way that two equivalent examinations are constructed. In both approaches the statistical problems of simultaneously modelling possible differences in the ability level of different groups of examinees and differences in the difficulty of the items are solved within the framework of item response theory. It is shown that applying the Rasch model for dichotomous and polytomous items results in a variety of possible test‐equating designs which adequately deal with the restrictions imposed by the practical conditions related to the fact that the equating involves examinations. In particular, the requirement of secrecy of the content of new examinations must be taken into account. Finally it is shown that, given a pool of pre‐tested items, optimisation techniques can be used to construct equivalent examinations.
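One property of the dichotomous Rasch model helps explain why it suits equating: the log-odds gap between two items equals the difference in their difficulties, regardless of the examinee's ability. The sketch below checks this numerically. It is an illustration under the standard Rasch formulation only; the function names and numbers are hypothetical and do not come from the paper.

```python
import math

def rasch_prob(theta, b):
    """Dichotomous Rasch model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(p):
    """Logit of a probability."""
    return math.log(p / (1.0 - p))

def logit_gap(theta, b_easy, b_hard):
    """Difference in log-odds of success between two items for one examinee."""
    return log_odds(rasch_prob(theta, b_easy)) - log_odds(rasch_prob(theta, b_hard))

# Hypothetical difficulties; the gap is b_hard - b_easy at every ability level
gap_low_ability = logit_gap(-1.0, -0.5, 1.0)
gap_high_ability = logit_gap(2.0, -0.5, 1.0)
```

Because the gap does not depend on ability, item difficulties estimated from groups of different ability can in principle be placed on a common scale, which is the kind of separation of ability and difficulty the abstract describes.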

12.

Testing Students with Special Needs: A Model for Understanding the Interaction Between Assessment and Student Characteristics in a Universally Designed Environment

Leanne R. Ketterlin‐Geller 《Educational Measurement》2008,27(3):3-16

This article presents a model of assessment development integrating student characteristics with the conceptualization, design, and implementation of standardized achievement tests. The model extends the assessment triangle proposed by the National Research Council (Pellegrino, Chudowsky, & Glaser, 2001) to consider the needs of students with disabilities and English learners on two dimensions: cognitive interaction and observation interaction. Specific steps in the test development cycle for including students with special needs are proposed following the guidelines provided by Downing (2006). Because this model of test development considers the range of student needs before test development commences, student characteristics are supported by applying the principles of universal design and appropriately aligning accommodations to address student needs. Specific guidelines for test development are presented.

13.

Principles of Categorization and Their Application

陈文荣 《福建教育学院学报》2004,5(1):94-95

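Among the many cognitive abilities, categorization is the most important. Categorization means finding commonality among differences. Cognitive economy and perceived world structure are the two principles of categorization. These principles are widely applied in taxonomy and have a particularly strong influence on basic-level categories.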

14.

A Brief Analysis of Test Method Effects in Reading Comprehension Tests

冯悦 《广东技术师范学院学报》2007,(6):89-93

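This study examines whether two different test methods, multiple choice and information transfer, produce a test method effect in reading comprehension tests. In addition to analyzing students' test scores, the study analyzes item difficulty values, which were estimated using Item Response Theory. The results show that the test method does affect both item difficulty and examinee performance; in terms of item difficulty, information transfer items are harder than multiple-choice items.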

15.

Parsing the notion of algebraic thinking within a cognitive perspective

Maria Chimoni Demetra Pitta-Pantazi 《Educational Psychology》2017,37(10):1186-1205

There is a growing consensus that algebra is an important aspect of mathematics teaching and learning and that several abilities are required for students to perform successfully in algebra. The present study uses insights from the domain of psychology to enrich what is currently known in the domain of mathematics education about the relationship of algebraic thinking with abilities involved in fundamental cognitive processes. In total, 190 students aged 13–17 years were tested through two tests. The first test addressed four types of cognitive systems which are responsible for the representation and processing of different types of relations in the environment: the spatial-imaginal, the causal-experimental, the qualitative-analytic and the verbal-propositional. The second test addressed algebraic thinking. The results support the key role of the four types of cognitive processes in students’ algebraic thinking. The results also suggest that abilities involved in the four types of cognitive processes predict algebraic thinking abilities, irrespective of the age of the students.

16.

Assessing the cognitive abilities of culturally and linguistically diverse students: Predictive validity of verbal, quantitative, and nonverbal tests

Joni M. Lakin 《Psychology in the schools》2012,49(8):756-768

Verbal and quantitative reasoning tests provide valuable information about cognitive abilities that are important to academic success. Information about these abilities may be particularly valuable to teachers of students who are English‐language learners (ELL), because leveraging reasoning skills to support comprehension is a critical aptitude for their academic success. However, due to concerns about cultural bias, many researchers advise exclusive use of nonverbal tests with ELL students despite a lack of evidence that nonverbal tests provide greater validity for these students. In this study, a test measuring verbal, quantitative, and nonverbal reasoning was administered to a culturally and linguistically diverse sample of students. The two‐year predictive relationship between ability and achievement scores revealed that nonverbal scores had weaker correlations with future achievement than did quantitative and verbal reasoning ability scores for ELL and non‐ELL students. Results do not indicate differential prediction and do not support the exclusive use of nonverbal tests for ELL students.

17.

Multidimensional IRT models for the assessment of competencies

Johannes Hartig Jana Höhler 《Studies in Educational Evaluation》2009,35(2-3):57-63

Multidimensional item response theory (MIRT) provides an ideal foundation for modeling performance in complex domains, taking into account multiple basic abilities simultaneously, and representing different mixtures of the abilities required for different test items. This article provides a brief overview of different MIRT models, and the substantive implications of their differences for educational assessment. To illustrate the flexibility and benefits of MIRT, three application scenarios are described: to account for unintended multidimensionality when measuring a unidimensional construct, to model latent covariance structures between ability dimensions, and to model interactions of multiple abilities required for solving specific test items. All of these scenarios are illustrated by empirical examples. Finally, the implications of using MIRT models on educational processes are discussed.

18.

The etiology of giftedness

Lee Anne Thompson Jeremy Oehlert 《Learning and individual differences》2010,20(4):298-307

Many theories of giftedness either explicitly or implicitly acknowledge the role of genetic influences; yet, empirical work has not been able to establish the impact that genes have specifically on gifted behavior. In contrast, a great deal of research has been targeted at understanding the etiology of individual differences in general and specific cognitive abilities across the entire range of ability and, to a lesser extent, high cognitive ability. This paper attempts to outline what we know and what we don't know about the etiology of giftedness, operationally defined as high g. We review studies selected to represent a variety of approaches that each address a different question about genetics and giftedness. These studies include quantitative genetic research that estimates heritability, shared and nonshared family environment – at the high and low ends of intelligence – as well as the heritability of group differences for general cognitive ability and specific cognitive abilities. We discuss the molecular genetic methods and mechanisms contributing to cognitive ability and suggest how epigenetic factors may operate. Quantitative and molecular genetic studies that include endophenotypes representing intelligence at a level closer to the genotype are also included. This last group of studies represents a relatively new area of work that builds on and extends the extensive groundwork established by classic quantitative genetic studies of behavior.

19.

A Study of Cognitive Flexibility in Students of Different Grades

李美华 沈德立 白学军 《中国特殊教育》2007,(8):80-86

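To examine how cognitive flexibility develops across grades, 80 participants each were selected from the third and fifth grades of primary school, the second year of junior high school, and the second year of senior high school, and their cognitive flexibility was measured. The results show that cognitive flexibility develops as grade level increases, and that students' scores on the two cognitive flexibility tests correlate positively or negatively, to varying degrees, with their Chinese and mathematics grades. Specifically, every measure of the Wisconsin Card Sorting Test correlated highly significantly with mathematics grades, whereas the correlations with Chinese grades did not reach significance except for a few measures; the correlations between Trail Making Test scores and Chinese and mathematics grades likewise did not reach significance except for a few measures. This indicates that cognitive flexibility is fairly closely related to students' Chinese and mathematics grades, and that cognitive flexibility can be regarded as a support for effective learning.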

20.

An Evaluation of the 2004 College Entrance Examination (Shanghai Paper) in Geography

雷新勇 《考试研究》2005,(1)

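The geography paper of the 2004 college entrance examination (Shanghai paper) consisted of two parts: multiple-choice questions and comprehensive analysis questions. The multiple-choice part had 20 questions worth 2 points each, for a total of 40 points. The comprehensive analysis part had 8 main questions comprising 34 sub-questions and 110 scoring points. The geography examination is evaluated from several angles: classical item analysis, the reliability of the examination results, content and structural evidence of validity, and the examination's impact on education and teaching. The conclusions are as follows: the ability objectives of the examination were set according to the curriculum standards, and item writing was based on those standards; difficulty was slightly on the easy side, with a fair degree of discrimination; the number of items was appropriate; the ratio of multiple-choice to non-multiple-choice items was appropriate; and the examination gave good guidance to school education and teaching. However, the comprehensive analysis part required reading a large amount of text and graphic information while calling for little written response, which made it difficult to assess candidates' independent geographical thinking systematically and is unfavorable as guidance for teaching.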