1.

Relationships between cognitive diagnosis, CTT, and IRT indices: an empirical investigation

Young-Sun Lee Jimmy de la Torre Yoon Soo Park 《Asia Pacific Education Review》2012,13(2):333-345

Cognitive diagnosis models (CDMs) continue to generate interest among researchers and practitioners because they can provide diagnostic information relevant to classroom instruction and student learning. However, the modeling component of cognitive diagnosis has outpaced its complementary component, test construction. Thus, most applications of cognitive diagnosis modeling involve retrofitting of CDMs to assessments constructed using classical test theory (CTT) or item response theory (IRT). This study explores the relationship between item statistics used in the CTT, IRT, and CDM frameworks using such an assessment, specifically a large-scale mathematics assessment. Furthermore, by highlighting differences between tests with varying levels of diagnosticity using a measure of item discrimination from a CDM approach, this study empirically uncovers some important CTT and IRT item characteristics. These results can be used to formulate practical guidelines in using IRT- or CTT-constructed assessments for cognitive diagnosis purposes.

2.

A COMPREHENSIVE MICROCOMPUTER SYSTEM FOR CLASSROOM TESTING

ANTHONY J. NITKO TSE-CHI HSU 《Journal of Educational Measurement》1984,21(4):377-390

3.

Dimensions of students’ views of classroom teaching and attitudes towards mathematics: A multi-group analysis between genders based on structural equation models

《Studies in Educational Evaluation》2023

The study intends to examine the dimensions of teaching perceived by students in mathematics and their relations with attitudes towards mathematics (ATM) across genders. In this study, 602 students (M_age = 13.55, 48.34% male) from Chinese middle schools participated in a questionnaire assessing their perceptions of general and content-focused instruction and ATM. The Rasch model demonstrated that the six-factor model of classroom teaching is superior to one-factor and two-factor (i.e., general and content-focused instruction) models. The links between dimensions of teaching and attitude compositions were shown to be different across genders. Further, dimensions of teaching were important for boys' enjoyment and value and girls' boredom and self-concept in mathematics. Lastly, compared to content-focused instruction, the explanation degree of the general instruction to ATM is higher in both boys and girls.

4.

Modeling Instructional Sensitivity Using a Longitudinal Multilevel Differential Item Functioning Approach

Alexander Naumann Jan Hochweber Johannes Hartig 《Journal of Educational Measurement》2014,51(4):381-399

Students’ performance in assessments is commonly attributed to more or less effective teaching. This implies that students’ responses are significantly affected by instruction. However, the assumption that outcome measures indeed are instructionally sensitive is scarcely investigated empirically. In the present study, we propose a longitudinal multilevel differential item functioning (DIF) model to combine two existing yet independent approaches to evaluate items’ instructional sensitivity. The model permits a more informative judgment of instructional sensitivity, allowing the distinction of global and differential sensitivity. By way of example, the model is applied to two empirical data sets, with classical indices (Pretest–Posttest Difference Index and posttest multilevel DIF) computed for comparison. Results suggest that the approach works well in the application to empirical data, and may provide important information to test developers.
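
As a rough illustration of the classical Pretest–Posttest Difference Index named in the abstract above, the sketch below takes the index for an item to be its proportion correct at posttest minus its proportion correct at pretest. The function name `ppdi` and the toy response data are assumptions for illustration and are not taken from the study; the multilevel DIF model itself is not reproduced here.

```python
def ppdi(pre, post):
    """Per-item Pretest-Posttest Difference Index.

    pre, post: lists of examinee rows, each row a list of 0/1 item scores.
    Returns one value per item: p_post - p_pre.
    """
    def p_values(rows):
        # Proportion of examinees answering each item correctly.
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    return [b - a for a, b in zip(p_values(pre), p_values(post))]


# Hypothetical data: 4 examinees, 2 items, scored before and after instruction.
pre_scores = [[0, 1], [0, 0], [1, 1], [0, 1]]
post_scores = [[1, 1], [1, 0], [1, 1], [0, 1]]
print(ppdi(pre_scores, post_scores))
```

Under this definition, an item whose index is near zero shows little change between pretest and posttest, while a larger positive value indicates more students answering correctly after instruction.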

5.

An Examination of the Instructional Sensitivity of the TIMSS Math Items: A Hierarchical Differential Item Functioning Approach

Hongli Li Qi Qin Pui-Wa Lei 《Educational Assessment》2017,22(1):1-17

In recent years, students’ test scores have been used to evaluate teachers’ performance. The assumption underlying this practice is that students’ test performance reflects teachers’ instruction. However, this assumption is generally not empirically tested. In this study, we examine the effect of teachers’ instruction on test performance at the item level using a hierarchical differential item functioning approach. The items are from the U.S. TIMSS 2011 4th-grade math test. Specifically, we tested whether students who had received instruction on a given item performed significantly better on that item compared with students who had not received such instruction when their overall math ability was controlled for, whether with or without controlling for student-level and class-level covariates. This study provides preliminary findings regarding why some items show instructional sensitivity and sheds light on how to develop instructionally sensitive items. Implications and directions for further research are also discussed.

6.

DIFFERENCES IN INSTRUCTIONAL SENSITIVITY BETWEEN ITEM FORMATS AND BETWEEN ACHIEVEMENT TEST ITEMS

RALPH A. HANSON ROBERT F. MCMORRIS JERRY D. BAILEY 《Journal of Educational Measurement》1986,23(1):1-12

Building achievement tests which are sensitive to the instructional effects of school programs concerns both practitioners and researchers in education. To produce such tests, empirical procedures to guide item selection are needed. In this paper, an operational framework and a set of empirical procedures for this task are presented. Within this framework, item sensitivity is linked to instructional implementation. A simple components of variance model has been used to provide actual estimates of instructional sensitivity. These procedures are illustrated using data from a comparative study of alternative item formats for a criterion-referenced test. Even when items were closely matched to instructional content specifications, important differences in instructional sensitivity emerged. These differences were found between the same items presented in different formats as well as between different items presented within the same format. Implications of these results for developing criterion-referenced achievement tests are discussed.

7.

The social conflict inventory (SCI): A measure of beliefs about classroom peer conflicts

Dora W. Chen Kenneth E. Smith 《Journal of Early Childhood Teacher Education》2013,34(4):299-313

This study describes the development of the Social Conflict Inventory (SCI), a self-report teacher belief scale for assessing beliefs about young children's classroom peer conflicts. Three phases were involved in the construction of the SCI: item development, initial testing with one sample (n = 218), and follow-up field test with a second sample (n = 395) that also addressed the convergent and concurrent validity of the instrument. Reliability and factor analyses conducted during the initial field test resulted in a reduction to 20 items (Cronbach's α = .87) with three subscales: General Orientation to Peer Conflict (α = .81), Cessation (α = .84), and Facilitation (α = .65). Similar patterns of factor loadings and reliabilities resulted from analyses of the follow-up field test data. Overall, the SCI proved to be a reliable instrument for assessing the beliefs concerning the role of classroom conflicts in children's development and for differentiating among groups of teachers. Further use of the SCI in conjunction with other measures of teacher beliefs will contribute to a better understanding of its concurrent validity. Finally, the potential for its use in future studies to clarify the relationship between beliefs and actual classroom practices and as an instrument for assessing the effectiveness of specific classroom management training programs is discussed.

8.

Developing and evaluating instructionally sensitive assessments in science

Maria Araceli Ruiz-Primo Min Li Kellie Wills Michael Giamellaro Ming-Chih Lan Hillary Mason Deanna Sands 《Journal of Research in Science Teaching》2012,49(6):691-712

The purpose of this article is to address a major gap in the instructional sensitivity literature on how to develop instructionally sensitive assessments. We propose an approach to developing and evaluating instructionally sensitive assessments in science and test this approach with one elementary life-science module. The assessment we developed was administered to 125 students in seven classrooms. The development approach considered three dimensions of instructional sensitivity; that is, assessment items should: represent the curriculum content, reflect the quality of instruction, and have formative value for teaching. Focusing solely on the first dimension, representation of the curriculum content, this study was guided by the following research questions: (1) What science module characteristics can be systematically manipulated to develop items that prove to be instructionally sensitive? and (2) Are the instructionally sensitive assessments developed sufficiently valid to make inferences about the impact of instruction on students' performance? In this article, we describe our item development approach and provide empirical evidence to support validity arguments about the developed instructionally sensitive items. Results indicated that: (1) manipulations of the items at different proximities to vary their sensitivity were aligned with the rules for item development and also corresponded with pre-to-post gains; and (2) the items developed at different distances from the science module showed a pattern of pre-to-post gain consistent with their instructional sensitivity, that is, the closer the items were to the science module, the larger the observed gains and effect sizes. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 691–712, 2012

9.

Exploring alternative conceptions from Newtonian dynamics and simple DC circuits: Links between item difficulty and item confidence

Maja Planinic William J. Boone Rudolf Krsnik Meredith L. Beilfuss 《Journal of Research in Science Teaching》2006,43(2):150-171

Croatian 1st-year and 3rd-year high-school students (N = 170) completed a conceptual physics test. Students were evaluated with regard to two physics topics: Newtonian dynamics and simple DC circuits. Students answered test items and also indicated their confidence in each answer. Rasch analysis facilitated the calculation of three linear measures: (a) an item-difficulty measure based upon all responses, (b) an item-confidence measure based upon correct student answers, and (c) an item-confidence measure based upon incorrect student answers. Comparisons were made with regard to item difficulty and item confidence. The results suggest that students hold stronger alternative conceptions in Newtonian dynamics than in DC circuits, a topic characterized by much lower student confidence on both correct and incorrect answers. A systematic and significant difference between mean student confidence on Newtonian dynamics and DC circuits items was found in both student groups. Findings suggest some steps for physics instruction in Croatia as well as areas of further research for those in science education interested in additional techniques of exploring alternative conceptions. © 2005 Wiley Periodicals, Inc. J Res Sci Teach 43: 150–171, 2006

10.

Identifying national cultures of mathematics education: Analysis of cognitive demands and differential item functioning in TIMSS

Eckhard Klieme Jürgen Baumert 《European Journal of Psychology of Education - EJPE》2001,16(3):385-402

Large-scale assessments of student competencies address rather broad constructs and use parsimonious, unidimensional measurement models. Differential item functioning (DIF) in certain subpopulations usually has been interpreted as error or bias. Recent work in educational measurement, however, assumes that DIF reflects the multidimensionality that is inherent in broad competency constructs and leads to differential achievement profiles. Thus, DIF parameters can be used to identify the relative strengths and weaknesses of certain student subpopulations. The present paper explores profiles of mathematical competencies in upper secondary students from six countries (Austria, France, Germany, Sweden, Switzerland, the US). DIF analyses are combined with analyses of the cognitive demands of test items based on psychological conceptualisations of mathematical problem solving. Experts judged the cognitive demands of TIMSS test items, and these demand ratings were correlated with DIF parameters. We expected that cultural framings and instructional traditions would lead to specific aspects of mathematical problem solving being fostered in classroom instruction, which should be reflected in differential item functioning in international comparative assessments. Results for the TIMSS mathematics test were in line with expectations about cultural and instructional traditions in mathematics education of the six countries.

11.

The Effect of Training in Test Item Writing on Test Performance of Junior High Students

Jeanne Tunks 《Educational studies》2001,27(2):129-142

High-stakes testing, a phenomenon born out of intense accountability across the United States, produces instructional settings that marginalize both curriculum and instruction. Teachers and other school personnel have minimized instruction to drill and practice in an effort to raise standardized and criterion-referenced test scores. This study presents an alternative to current practice that engages students in learning and increases their awareness of the internal aspects of standardized tests. The Test Item Construction Model (TICM) guides students through the process of studying test item stems and subsequently creating items, using a 12-week process that progresses from understanding test items to creating them. Students grew in their understanding of the test item stems and the generation of these. An ANOVA did not yield significant differences between random groups of trained and untrained test writers. However, students in the experimental group demonstrated gains in understanding of test items.

12.

The relative responsiveness of concrete operational seventh grade and college students to science instruction

Anton E. Lawson 《Journal of Research in Science Teaching》1982,19(1):63-77

Numerous persons have suggested that instruction should match the developmental level of the learner. Are “concrete operational” college students developmentally the same as “concrete operational” seventh grade students, and thus in need of identical instruction? Matched concrete operational seventh grade and college students were given identical classroom instruction in probabilistic and correlational reasoning. The college students performed significantly better on posttest measures which appeared to require greater processing of information, while significant differences did not exist on less difficult items. Level of cognitive development, field independence, and fluid intelligence correlated moderately with posttest performance for the seventh grade students. Field independence and fluid intelligence correlated moderately with posttest performance for the college students, but pretest knowledge of specific biological concepts and cognitive level did not. It was concluded that college students are more responsive to instruction due either to (1) greater amount of experience or (2) greater information processing capacity. Implications for science teaching are discussed.

13.

Uncovering everyday dynamics in students’ perceptions of instructional quality with experience sampling

《Learning and Instruction》2022

Within-student dynamics in perceptions of instructional quality have been neglected, although student states constitute a major share of these perceptions. The present study examined the structure and correlates of student state perceptions of the three basic dimensions, teacher support, cognitive activation, and classroom management. We conducted a three-week experience sampling study using state measures in four subjects (observations: n_mathematics = 2,681, n_physics = 1,555, n_German = 2,026, n_English = 1,835) and analyzed data from 372 German secondary school students (M_age = 15.3 years), conducting two-level confirmatory factor analyses. Against more parsimonious solutions, the postulated three-factor structure was confirmed within- and between-students across subjects, entailing 51% within-student variance on average. Similar to trait-like perceptions, state perceptions were positively related to grades and academic interest. Our results support the factorial and convergent validity of state student perceptions of instructional quality, expanding upon between-person-based literature and uncovering opportunities to enhance teaching effectiveness.

14.

Effectiveness of Short-Term Group Guidance with a Group of Transfer Students Admitted on Academic Probation

《The Journal of educational research》2012,105(10):463-465

To combat problems of cheating arising from testing under crowded classroom conditions, instructors frequently use multiple arrangements of a set of test items. These different arrangements or forms should be nearly equivalent relative to mean total scores. This study reports data from comparisons involving eleven pairs of equivalent tests. There were no significant linear relationships between equivalent test forms on the ordering of item difficulties. Reliabilities differed little within pairs of equivalent tests. Nine of eleven t-tests comparing mean total test scores were nonsignificant. The bulk of these data supported the assumption that one may construct equivalent power tests by rearranging items, when the ordering of item difficulty is non-systematic on both arrangements.

15.

Control–value appraisals and academic emotions: An intensive longitudinal examination of reciprocal effects

Xin Chen Frederick K. S. Leung 《Child development》2024,95(3):972-987

This study examined the reciprocal relation between lesson-specific perceived cognitive appraisals and academic emotions on an intra-individual level. A daily diary study was conducted using a sample of 266 Chinese Han students (Grades 7–8; 56.8% boys; M_age = 13.70, SD_age = 0.52) during 10 mathematics lessons in 2022. Standardized questionnaires were also administered to these students before the daily diary study. The results of the dynamic structural equation modeling revealed significant reciprocal relations between cognitive appraisals and academic emotions within early adolescents and highlighted the role of emotions in guiding cognitive appraisals. Additionally, the study identified similarities and differences in the inter-individual relation between appraisals and emotions across self-reported questionnaires and daily diary measures.

16.

Improved application of the control-of-variables strategy as a collateral benefit of inquiry-based physics education in elementary school

《Learning and Instruction》2019

In a quasi-experimental classroom study, we longitudinally investigated whether inquiry-based, content-focused physics instruction improves students’ ability to apply the control-of-variables strategy, a domain-general experimentation skill. Twelve third grade elementary school classes (Mdn_age = 9 years, N = 189) were randomly assigned to receive either four different physics curriculum units (intervention) or traditional instruction (control). Experiments were frequent elements in the physics units; however, there was no explicit instruction of the control-of-variables strategy or other experimentation skills. As intended, students in the intervention classes strongly increased their conceptual physics knowledge. More importantly, students in the intervention classes also showed stronger gains in their ability to apply the control-of-variables strategy correctly in novel situations compared to students in the control classes. Thus, a high dose of experimentation had the collateral benefit of improving the transfer of the control-of-variables strategy. The study complements lab-based studies with convergent findings obtained in real classrooms.

17.

Reversing the Downward Spiral of Science Instruction in K-2 Classrooms

Judith Haymore Sandholtz Cathy Ringstaff 《Journal of Science Teacher Education》2011,22(6):513-533

This study investigated the extent to which teacher professional development led to changes in science instruction in K-2 classrooms in rural school districts. The research specifically examined changes in (a) teachers’ content knowledge in science; (b) teachers’ self-efficacy related to teaching science; (c) classroom instructional time allotted to science; and (d) instructional strategies used in science. The study also investigated contextual factors contributing to or hindering changes in science instruction. Data sources included a teacher survey, a self-efficacy assessment, content knowledge tests, interviews, and classroom observations. After one year in the program, teachers showed increased content knowledge and self-efficacy in teaching science; they spent more instructional time on science and began using different instructional strategies. Key contextual factors included curricular demands, resources, administrative support, and support from other teachers.

18.

Effect of Two Selected Item-Writing Practices on Test Difficulty, Discrimination, and Reliability

Cynthia B. Schmeiser Douglas R. Whitney 《Journal of Experimental Education》2013,81(3):30-34

In order to investigate the effect of two item-writing practices on test characteristics, examinations were chosen for study in two undergraduate courses (N = 71 and 210). About one-fourth of the items on each examination included a practice generally regarded as undesirable in measurement textbooks and alleged to make test items more difficult. Alternate forms which eliminated the undesirable practice were developed and administered at the same time as the original form. Rewriting item stems so that they formed a complete sentence or question resulted in about 6 percent more students answering items correctly. Eliminating unnecessary material in item stems, however, had little effect on difficulty. KR₂₀ values were not appreciably different for the two versions of either test. Neither flaw was found to affect item discrimination indices noticeably. The absence of any substantial practice-by-achievement level interactions suggested little effect of the practices on the validity of the tests.
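
For readers unfamiliar with the KR₂₀ coefficient reported above, here is a minimal sketch of the standard Kuder-Richardson formula 20 for dichotomously scored items, KR20 = k/(k-1) · (1 − Σ p_i q_i / σ²_X), where k is the number of items, p_i is the proportion correct on item i, q_i = 1 − p_i, and σ²_X is the variance of total scores. The function name `kr20`, the choice of population variance, and the toy data are assumptions for illustration; the abstract does not describe how the study computed its values.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses: list of examinee rows, each row a list of 0/1 item scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    # Proportion of examinees answering each item correctly.
    p = [sum(col) / n_people for col in zip(*responses)]
    # Sum of item variances p_i * q_i.
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Population variance of examinees' total scores (an assumption;
    # some texts use the sample variance instead).
    total_var = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Hypothetical data: 4 examinees, 3 items.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(scores))
```

The function assumes at least two items and non-zero variance in total scores; it does no input validation.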

19.

On the Fit Between Online Education and China's Current Educational Environment

陈澜 傅强 《广西师范大学学报(哲学社会科学版)》2002,38(4):75-78

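Viewed from three aspects of online education (resources, learners, and management), online education does not yet fit well with China's current educational environment. The following measures should therefore be taken: (1) for hardware infrastructure, combine offline and online development; (2) for teaching resources, combine the guiding principles of cognitivist and constructivist teaching; (3) for modes of teaching, combine distance and traditional teaching; (4) for the organization of teaching, combine self-directed and centralized learning; (5) for teaching management, combine autonomous and centralized management; and (6) for academic assessment, combine assessment of professional competence with course examinations.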

20.

The Development and Validation of a Formula for Measuring Single-Sentence Test Item Readability

Susan Homan Margaret Hewitt Jean Linder 《Journal of Educational Measurement》1994,31(4):349-358

This study describes the development and validation of the Homan-Hewitt Readability Formula. This formula estimates the readability level of single-sentence test items. Its initial development is based on the assumption that differences in readability level will affect item difficulty. The validation of the formula is achieved by (a) estimating the readability levels of sets of test items predicted to be written at 2nd- through 8th-grade levels; (b) administering the tests to 782 students in grades 2 through 5; (c) using the class means as the unit of analysis and subjecting the data to a two-factor repeated measures ANOVA. Significant differences were found on class mean performance scores across the levels of readability. These results indicated that a relationship exists between students' reading grade levels and their responses to test items written at higher readability levels.