Similar literature
 Found 20 similar articles; search took 984 ms
1.
Cognitive diagnosis models (CDMs) continue to generate interest among researchers and practitioners because they can provide diagnostic information relevant to classroom instruction and student learning. However, its modeling component has outpaced its complementary component, test construction. Thus, most applications of cognitive diagnosis modeling involve retrofitting of CDMs to assessments constructed using classical test theory (CTT) or item response theory (IRT). This study explores the relationship between item statistics used in the CTT, IRT, and CDM frameworks using such an assessment, specifically a large-scale mathematics assessment. Furthermore, by highlighting differences between tests with varying levels of diagnosticity using a measure of item discrimination from a CDM approach, this study empirically uncovers some important CTT and IRT item characteristics. These results can be used to formulate practical guidelines in using IRT- or CTT-constructed assessments for cognitive diagnosis purposes.

2.
Instructional sensitivity is the psychometric capacity of tests or single items to capture effects of classroom instruction. Yet the relationship of current item sensitivity measures to (a) actual instruction and (b) overall test sensitivity is rather unclear. The present study aims at closing these gaps by investigating test and item sensitivity to teaching quality, reanalyzing data from a quasi-experimental intervention study in primary school science education (1026 students, 53 classes, mean age = 8.79 years, SD = 0.49, 50% female). We examine (a) the correlation of item sensitivity measures and the potential for cognitive activation in class and (b) consequences for test score interpretation when assembling tests from items varying in their degree of sensitivity to cognitive activation. Our study (a) provides validity evidence that item sensitivity measures may be related to actual classroom instruction and (b) points out that inferences on teaching drawn from test scores may vary due to test composition.

3.
APPLICATION OF COMPUTERIZED ADAPTIVE TESTING TO EDUCATIONAL PROBLEMS   Total citations: 1, self-citations: 0, citations by others: 1
Three applications of computerized adaptive testing (CAT) to help solve problems encountered in educational settings are described and discussed. Each of these applications makes use of item response theory to select test questions from an item pool to estimate a student's achievement level and its precision. These estimates may then be used in conjunction with certain testing strategies to facilitate certain educational decisions. The three applications considered are (a) adaptive mastery testing for determining whether or not a student has mastered a particular content area, (b) adaptive grading for assigning grades to students, and (c) adaptive self-referenced testing for estimating change in a student's achievement level. Differences between currently used classroom procedures and these CAT procedures are discussed. For the adaptive mastery testing procedure, evidence from a series of studies comparing conventional and adaptive testing procedures is presented showing that the adaptive procedure results in more accurate mastery classifications than do conventional mastery tests, while using fewer test questions.

4.
Large-scale assessments of student competencies address rather broad constructs and use parsimonious, unidimensional measurement models. Differential item functioning (DIF) in certain subpopulations usually has been interpreted as error or bias. Recent work in educational measurement, however, assumes that DIF reflects the multidimensionality that is inherent in broad competency constructs and leads to differential achievement profiles. Thus, DIF parameters can be used to identify the relative strengths and weaknesses of certain student subpopulations. The present paper explores profiles of mathematical competencies in upper secondary students from six countries (Austria, France, Germany, Sweden, Switzerland, the US). DIF analyses are combined with analyses of the cognitive demands of test items based on psychological conceptualisations of mathematical problem solving. Experts judged the cognitive demands of TIMSS test items, and these demand ratings were correlated with DIF parameters. We expected that cultural framings and instructional traditions would lead to specific aspects of mathematical problem solving being fostered in classroom instruction, which should be reflected in differential item functioning in international comparative assessments. Results for the TIMSS mathematics test were in line with expectations about cultural and instructional traditions in mathematics education of the six countries.

5.
The No Child Left Behind (NCLB) legislation has created pressure for districts to improve their students’ proficiency levels on state tests. Districts that fail to meet their academic targets for 3 years must use their Title I funds to pay for supplemental education services (SES) that provide tutoring or other academic instruction. Many districts, including the Pittsburgh Public Schools (PPS), have also adopted additional tutoring programs designed to help students reach proficiency goals. This paper examines student participation and achievement in two PPS tutoring programs: the NCLB-mandated SES program and a state-developed tutoring program. We examine the characteristics of students participating in each program, the effects of participation on student achievement, and the program features that are associated with improved achievement.

6.
This study investigated classroom practices of 38 teachers enrolled in university master's degree programs in educational technology and in other areas of education. The classroom practices related to five key concepts associated with educational technology: (a) learner-centered instruction, (b) instructional design, (c) media and technology, (d) assessment, and (e) instructional alignment. Teachers rated their frequency of use of desirable practices in these five areas on a 30-item Likert-type survey. In addition, one class of students per teacher rated its own teacher's frequency of use of the practices on 20 items parallel to items on the teacher survey. The mean overall rating across all teachers for the classroom practice items was very close to Often, or 4.0, on the 5-point scale. There were few reported differences between the teachers enrolled in educational technology programs and those enrolled in other education programs. Student ratings indicated less frequent teacher use of the desirable practices on 16 of the 20 common items, with significantly lower student ratings on 8 of these items. However, there was strong teacher-student agreement on several other comparisons. The study reported in this article was conducted as a doctoral dissertation at Arizona State University.

7.
The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely limited. This study presents scoring properties and item analysis for 26 AUC university course examinations, administered using a commercial scratch-card response system. Here, we show that beyond the traditional pedagogical advantages of AUC, the availability of partial credit adds psychometric advantages by boosting both the mean item discrimination and overall test-score reliability, when compared to tests scored dichotomously upon initial response. Furthermore, we find a strong correlation between students’ initial-response successes and the likelihood that they would obtain partial credit when they make incorrect initial responses. Thus, partial credit is being granted based on partial knowledge that remains latent in traditional MC tests. The fact that these advantages are realized in real-life classroom tests may motivate further expansion of the use of AUC MC tests in higher education.

8.
Students tend to comprehend little and lose focus of classroom instruction when their teachers fail to use instructional strategies that match students’ learning styles. Differentiated instruction can alleviate or eliminate this disengagement. This article describes a case involving a child having difficulty learning and shows how differentiated instruction was used to help this student learn. The author describes the theories on which differentiated instruction is based and provides practical strategies teachers can use to implement this method of teaching.

9.
Under a grant from the Education Research and Development Committee, researchers at Macquarie University, Sydney, Australia, developed a set of instructional materials aimed at the inservice education of teachers on the topic of student assessment. The Student Assessment Project (SAP) now comprises seven modules in slide-tape format covering the topics of test design, item writing, analysis of norm-referenced and criterion-referenced tests, combining scores from different components, moderation of test results, and grading and reporting. The kit also contains appropriate computer software (for an Apple II microcomputer), manuals and supplementary materials. This article gives some details of the project and its development and describes the widespread use of the first 20 kits from which evaluation data are being sought. Although SAP originally focussed on inservice education of secondary teachers, the present applicability to higher education and the further potential for modification and use at this level are discussed.

10.
In this paper, the authors draw on recent international experience to assess the success of five propositions for how high-stakes national testing can improve classroom instruction and, ultimately, raise student achievement. Findings indicate that testing can be an effective mechanism for improving instructional practice, but its success is not ensured. It has failed as often as it has succeeded, usually because those implementing the strategy failed to understand the intermediate conditions that had to be met for changes in test content, format, or use to have the desired impact on teachers' classroom practice.

11.
This paper aims to examine current nationwide youth fitness test programs, address problems embedded in the programs, and propose possible solutions. The current Fitnessgram, President's Challenge, and YMCA youth fitness test programs were selected to represent nationwide youth fitness test programs. Sponsors of the nationwide youth fitness test programs need to (a) carefully examine the efficacy of youth fitness test batteries in promoting student health-related fitness, (b) increase the accountability of youth fitness testing, (c) add a written test on student fitness knowledge to the fitness test programs, and (d) select and develop more efficient test items in each test component.

12.
In this study, the relationship between differentiated instruction, as an element of data-based decision making, and student achievement was examined. Classroom observations (n = 144) were used to measure teachers’ differentiated instruction practices and to predict the mathematical achievement of 2nd- and 5th-grade students (n = 953). The analysis of classroom observation data was based on a combination of generalizability theory and item response theory, and student achievement effects were determined by means of multilevel analysis. No significant positive effects were found for differentiated instruction practices. Furthermore, findings showed that students in low-ability groups profited less from differentiated instruction than students in average or high-ability groups. Nevertheless, the findings, data collection, and data-analysis procedures of this study contribute to the study of classroom observation and the measurement of differentiated instruction.

13.
Promoting self-determination has been suggested as a means for students with disabilities to access the general curriculum. We surveyed 407 elementary educators to examine (a) the effects of classroom setting and teaching self-regulation strategies on the perceived importance and frequency of teaching self-determination; and (b) the effects of the severity level of student disability, teacher primary assignment, teaching experience, and classroom and school setting on self-regulation instruction. Teaching experience and classroom setting predicted the use of self-regulation strategies, but primary assignment, school setting, and student disability did not. Self-regulation instruction predicted the frequency of teaching self-determination, but neither it nor classroom setting predicted the perceived importance of teaching self-determination. Limitations and implications of this study are discussed, and suggestions for future research are offered.

14.
Multiple threats to validity and reliability exist when value-added models (VAMs) rely wholly on standardized assessments to measure the relationship between teachers and their K–12 students’ learning gains. Research on a curriculum-based VAM, built on evidence-based practices, continues to establish an explicit link between teacher candidates’ instruction and their K–12 students’ learning gains. Statistical tests of association were used to analyze measures of student learning and university supervisors’ ratings during classroom observations with a department instrument, the Narrative Observation Scale. Results from a sample of 23 teacher candidates revealed that (a) two measures of student learning were related and attributed to candidates’ instruction, and (b) 67.6% of the variance in the percentage of K–12 students meeting their specific learning objectives was accounted for by the teacher candidates’ mastery of specific classroom management behaviors. Limitations and directions for future research are discussed regarding continued efforts to refine a rational, curriculum-based VAM.

15.
Teacher preparation programs have both a desire and a responsibility to demonstrate, with affirmative evidence, that teacher education makes a difference in PreK–12 student learning. Program faculty need good data to make decisions about the progress of students, whom to recommend for state licensure, and how to improve teacher education. This article describes an American Psychological Association task force report that discusses 3 measures of program effectiveness that have potential for both informing the public and providing useful data for programs to continuously improve: (a) outcome data from PreK–12 student academic growth as assessed by standardized tests; (b) teacher performance as evaluated by valid and reliable observational instruments; and (c) judgments of graduates, their PreK–12 students, and those who hire teachers as gauged by surveys. Although no technique of data collection and analysis is perfect, this report provides directions for teacher educators who seek to continuously improve their programs.

16.
This paper describes the process for creating and validating an assessment test that measures the effectiveness of instruction by probing how well that instruction causes students in a class to think like experts about specific areas of science. The design principles and process are laid out and it is shown how these align with professional standards that have been established for educational and psychological testing and the elements of assessment called for in a recent National Research Council study on assessment. The importance of student interviews for creating and validating the test is emphasized, and the appropriate interview procedures are presented. The relevance and use of standard psychometric statistical tests are discussed. Additionally, techniques for effective test administration are presented.

17.
Classroom teachers are in the front line of introducing students to formal learning, including assessments, which can be assumed to continue for students should they extend their schooling past the expected mandatory 12 years. The purpose of the present investigation was to survey secondary teachers’ beliefs about classroom and large-scale tests for (a) providing information about students’ learning processes, (b) influencing meaningful student learning, and (c) eliciting learning or test-taking strategies for successful test performance. Secondary teachers were surveyed because a majority of large-scale tests are developed for secondary students (e.g., PISA, TIMSS). Results suggested that, in comparison to large-scale tests, teachers believe classroom tests provide more information about student learning processes, are more likely to influence meaningful student learning, and are more likely to require learning over test-taking strategies. The implications of these results for assessment literacy are explored.

18.
The paper describes the development and validation of a group test of integrated process skills. The test assesses student performance on a set of twelve objectives related to the generic objective: planning and conducting an investigation. Evidence of content validity, construct validity, and reliability is presented in the paper. A range of generalizability coefficients from 0.77 to 0.98 is reported for specific uses of the 24-item test. Since the items measure performance on objectives that can be readily translated into classroom activity, the test has direct applicability to classroom-based research and evaluation of instruction. In addition to sound psychometric properties, the Test of Integrated Science Processes is distinctive because it includes a set of interrelated, cumulative objectives which reflect autonomous problem solving.

19.
20.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号