Similar literature
 20 similar documents found
1.
A central concern surrounding test-based accountability is that teachers may narrow teaching practices to improve test performance on a curriculum-based specific knowledge test rather than student learning more broadly. Two of the most common teaching practices that “teach to the test” are providing test-specific classwork and increasing the frequency with which students take practice tests. Whether such teaching practices improve student learning, both in terms of learning the content associated with a specific knowledge test and in terms of more general learning, is a largely unanswered question. To approach this question, this paper uses a student fixed effects approach to analyze the impact of these kinds of narrow teaching practices on student performance on a specific test as well as a general knowledge test. We find that test-specific classwork and practice tests with specific test items tend to have little or negative impacts on curriculum-specific or general knowledge test performance, except for male students, and that subject practice tests (without emphasizing test-specific items) have positive effects on student outcomes on both kinds of tests, but larger on the curriculum-specific than on the general test, and much larger on the curriculum-specific test for male students. We discuss the logic for these results and what they tell us about the effectiveness of test-focused teaching practices more generally.

2.
Memory for everyday information in students with learning disabilities   (total citations: 6; self-citations: 0; citations by others: 6)
This study compared students with and without learning disabilities (LD) on their recall of academic information and information encountered in the students' everyday lives. The academic recall measures included a sentence listening span test, a rhyming words working memory test, and a visual matrix working memory task. Students' cued recall of all the tasks was also measured. The everyday working memory tasks included a dance episode event recall test; a library procedure recall test; and recall tests of commonly found objects, such as a coin, a telephone, and a McDonald's sign. Compared to students without LD, students with LD performed poorly on both the academic recall tasks and the everyday recall tasks. These results support the notion that some students with LD may have working memory problems that affect their performance on tasks other than reading. The results of the cued recall showed that the availability of cues significantly decreased the ability group differences on many of the academic and everyday tasks. This result replicates prior research findings that students with LD do not use retrieval strategies effectively and that some students with LD may have a production deficiency that affects their retrieval of previously encoded information.

3.
Although test scores from similar tests in multiple choice and constructed response formats are highly correlated, equivalence in rankings may mask differences in substantive strategy use. The author used an experimental design and participant think-alouds to explore cognitive processes in mathematical problem solving among undergraduate examinees (N = 64). The study examined the effect of format on mathematics performance and strategy use for male and female examinees given stem-equivalent items. A statistically significant main effect of format on performance was found, with constructed-response items more difficult. The multiple-choice format was associated with more varied strategies, backward strategies, and guessing. Format was found to moderate the effect of problem conceptualization on performance. Results suggest that while for purposes of ranking students on performance, the multiple-choice format may be adequate, for many contemporary educational purposes that seek to provide nuanced information about student cognition, the constructed response format should be preferred.

4.
Interest in measuring and evaluating student learning in higher education is growing. There are many tools available to assess student learning. However, the use of such tools may be more or less appropriate under various conditions. This study provides some evidence related to the appropriate use of pre/post‐tests. The question of whether graded tests elicit a higher level of performance (better representation of actual learning gains) than ungraded post‐tests is examined. We examine whether the difficulty level of the questions asked (knowledge/comprehension vs. analysis/application) affects this difference. We test whether the student’s level in the degree programme affects this difference. Results indicate that post‐tests may not demonstrate the full level of student mastery of learning objectives and that both the difficulty level of the questions asked and the level of students in their degree programme affect the difference between graded and ungraded assessments. Some of these differences may be due to causes other than grades on the assessments. Students may have benefited from the post‐test, as a review of the material, or from additional studying between the post‐test and the final examination. Results also indicate that pre‐tests can be useful in identifying appropriate changes in course materials over time.

5.
Linguistic complexity of test items is one test format element that has been studied in the context of struggling readers and their participation in paper-and-pencil tests. The present article presents findings from an exploratory study on the potential relationship between linguistic complexity and test performance for deaf readers. A total of 64 students completed 52 multiple-choice items, 32 in mathematics and 20 in reading. These items were coded for linguistic complexity components of vocabulary, syntax, and discourse. Mathematics items had higher linguistic complexity ratings than reading items, but there were no significant relationships between item linguistic complexity scores and student performance on the test items. The discussion addresses issues related to the subject area, student proficiency levels in the test content, factors to look for in determining a "linguistic complexity effect," and areas for further research in test item development and deaf students.

6.
Students of six classes who had previously participated in a larger study of teaching styles were tested a year after completion of the course. The purpose of the follow-up was to determine whether or not the prior battery of tests, including a student evaluation of instructor form, the Introductory Psychology Criteria Test, an Attitude Toward Psychology Scale, and a knowledge test, administered in a large group setting independent of those used for grades by instructors, would be positively related to student performance on comparable tests given a year later. The follow-up measures included items from the above measures plus questions regarding experiences and readings related to psychology in the year since the students' introductory course and two brief experiments which the students were to critique. Results indicated that the end of the semester measures of teaching effectiveness in terms of student performance and attitudes were positively related to similar responses obtained a year later.

7.
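To obtain large amounts of linguistics research information, mastering online literature retrieval modes is essential, and it sometimes bears directly on the effectiveness of linguistics literature retrieval. The purpose of linguistics literature retrieval is to obtain the linguistics research information one needs, whereas the purpose of those who edit retrieval systems is to provide many kinds of information rather than linguistics research literature alone. This informational link between receiving and giving depends on the cognitive coordination between searchers and editors. In keyword searches for linguistics, divergence between the knowledge frameworks of searcher and editor readily leads to retrieval failures such as missed and false hits. Such retrieval failures have deeper cognitive causes, which undoubtedly lead to differences in how the two parties cognitively parse prototypical linguistic information. This study argues that editors, as the more powerful party, should attend to the cognitive orientation of searchers, the weaker party, by becoming familiar with conventional methods, striving to unify retrieval modes, and developing intelligent tools, so as to achieve cognitive coordination between the two sides and thereby the best results in linguistics literature retrieval.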

8.
Federal policy on alternate assessment based on modified academic achievement standards (AA-MAS) inspired this research. Specifically, an experimental study was conducted to determine whether tests composed of modified items would have the same level of reliability as tests composed of original items, and whether these modified items helped reduce the performance gap between AA-MAS eligible and ineligible students. Three groups of eighth-grade students (N = 755) defined by eligibility and disability status took original and modified versions of reading and mathematics tests. In a third condition, the students were provided limited reading support along with the modified items. Changes in reliability across groups and conditions for both the reading and mathematics tests were determined to be minimal. Mean item difficulties within the Rasch model were shown to decrease more for students who would be eligible for the AA-MAS than for non-eligible groups, revealing evidence of differential boost. Exploratory analyses indicated that shortening the question stem may be a highly effective modification, and that adding graphics to reading items may be a poor modification.

9.
This study investigated whether changes in the working memory (WM) performance of readers with learning disabilities (LD) are related to a general or domain-specific system. The study compared readers with LD, chronologically age-matched (CA-M), and reading level-matched (RL-M) children's WM performance for phonological, visual-spatial, and semantic information under initial (no probes or cues), gain (cues that bring performance to an asymptotic level), and maintenance (asymptotic conditions without cues) conditions. The main findings indicated that (a) CA-M children were superior in performance to readers with LD across initial, gain, and maintenance conditions, (b) readers with LD showed less change (as reflected in effect size scores, slopes for the quadratic curve) on both visual-spatial and verbal (phonological and semantic) WM tasks across gain and maintenance conditions than the CA-M children, and (c) the performance of readers with LD was superior to the RL-M children's performance on initial conditions, but inferior on gain and maintenance conditions. Taken together, the results suggest that a general system moderated the changes in retrieval of phonological, visual-spatial, and semantic information in readers with LD.

10.
A series of 8 tests was administered to university students over 4 weeks for program assessment purposes. The stakes of these tests were low for students; they received course points based on test completion, not test performance. Tests were administered in a counterbalanced order across 2 administrations. Response time effort, a measure of the proportion of items on which solution behavior rather than rapid-guessing behavior was used, was higher when a test was administered in the 1st week. Test scores were also higher. Differences between Week 1 and Week 4 test scores decreased when the test was scored with an effort-moderated model that took into account whether the student used solution or rapid-guessing behavior. Differences further decreased when students who used rapid-guessing on 5 or more of the 30 items were filtered from the data set.
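
The effort measure and the filtering rule described in this abstract can be illustrated with a minimal sketch. It assumes that a response counts as a rapid guess when its response time falls below a per-item threshold; the abstract does not say how rapid guessing was identified, so the thresholds, function names, and sample data below are hypothetical and not taken from the study.

```python
from typing import Sequence


def rapid_guess_count(times: Sequence[float], thresholds: Sequence[float]) -> int:
    """Count items answered faster than the (assumed) per-item threshold."""
    if len(times) != len(thresholds) or not times:
        raise ValueError("times and thresholds must be non-empty and equal in length")
    return sum(t < th for t, th in zip(times, thresholds))


def response_time_effort(times: Sequence[float], thresholds: Sequence[float]) -> float:
    """Proportion of items on which solution behavior (not rapid guessing) was used."""
    return 1 - rapid_guess_count(times, thresholds) / len(times)


def keep_examinee(times: Sequence[float], thresholds: Sequence[float],
                  max_rapid_guesses: int = 4) -> bool:
    """Keep examinees with fewer than 5 rapid guesses, mirroring the abstract's filter."""
    return rapid_guess_count(times, thresholds) <= max_rapid_guesses


if __name__ == "__main__":
    # Hypothetical 30-item test: 3 very fast responses, 27 slower ones.
    thresholds = [5.0] * 30                # assumed threshold, in seconds
    times = [2.0] * 3 + [20.0] * 27
    print(response_time_effort(times, thresholds))
    print(keep_examinee(times, thresholds))
```

An effort-moderated scoring model would go one step further and exclude the rapid-guess responses when estimating proficiency; that step is not sketched here because the abstract gives no details of the model.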

11.
Examined in this study were the effects of reducing anchor test length on student proficiency rates for 12 multiple‐choice tests administered in an annual, large‐scale, high‐stakes assessment. The anchor tests contained 15 items, 10 items, or five items. Five content representative samples of items were drawn at each anchor test length from a small universe of items in order to investigate the stability of equating results over anchor test samples. The operational tests were calibrated using the one‐parameter model and equated using the mean b‐value method. The findings indicated that student proficiency rates could display important variability over anchor test samples when 15 anchor items were used. Notable increases in this variability were found for some tests when shorter anchor tests were used. For these tests, some of the anchor items had parameters that changed somewhat in relative difficulty from one year to the next. It is recommended that anchor sets with more than 15 items be used to mitigate the instability in equating results due to anchor item sampling. Also, the optimal allocation method of stratified sampling should be evaluated as one means of improving the stability and precision of equating results.

12.
As an alternative to adaptation, tests may also be developed simultaneously in multiple languages. Although the items on such tests could vary substantially, scores from these tests may be used to make the same types of decisions about different groups of examinees. The ability to make such decisions is contingent upon setting performance standards for each exam that allow for comparable interpretations of test results. This article describes a standard setting process used for a multilingual high school literacy assessment constructed under these conditions. This methodology was designed to address the specific challenges presented by this testing program including maintaining equivalent expectations for performance across different student populations. The validity evidence collected to support the methodology and results is discussed along with recommendations for future practice.

13.
This study examined whether practice testing with short-answer (SA) items benefits learning over time compared to practice testing with multiple-choice (MC) items, and rereading the material. More specifically, the aim was to test the hypotheses of retrieval effort and transfer appropriate processing by comparing retention tests with respect to practice testing format. To adequately compare SA and MC items, the MC items were corrected for random guessing. With a within-group design, 54 students (mean age = 16 years) first read a short text, and took four practice tests containing all three formats (SA, MC and statements to read) with feedback provided after each part. The results showed that both MC and SA formats improved short- and long-term memory compared to rereading. More importantly, practice testing with SA items is more beneficial for learning and long-term retention, providing support for the retrieval effort hypothesis. The use of corrections for guessing and educational implications are discussed.

14.
Applied Measurement in Education (《教育实用测度》), 2013, 26(3): 185-207
With increasing interest in educational accountability, test results are now expected to meet a diverse set of informational needs. But a norm-referenced test (NRT) cannot be expected to meet the simultaneous demands for both norm-referenced and curriculum-specific information. One possible solution, which is the focus of this article, is to customize the NRT. Customized tests may appear in any form. They may (a) add a few curriculum-specific items to the end of the NRT, (b) substitute locally constructed items for a few NRT items, (c) substitute a curriculum-specific test (CST) for the NRT, or (d) use equating methods to obtain predicted NRT scores from the CST scores. In this article, we describe the four main approaches to customized testing, address the validity of the uses and interpretations of customized test scores obtained from the four main approaches, and offer recommendations regarding the use of customized tests and the need for further research. Results indicate that customized testing can yield both valid normative and curriculum-specific information, when special conditions exist. But, there are also many threats to the validity of normative interpretations. Cautious application of customized testing is needed in order to avoid misleading inferences about student achievement.

15.
The retention performance following partial training (15 trials) of a brightness-discrimination avoidance task has been shown to fluctuate over time, with a drop in performance 1 h after training (Kamin effect), a long-term spontaneous improvement (LTSI) after 3 days, and long-term spontaneous forgetting after 21 days. The purpose of this paper was to determine if these time-dependent modulations of retention performance reflect time-dependent modifications in the organization of the attributes that constitute the memory trace. We studied the relative effectiveness of several pretest cuings on retention performance when they were delivered just before a retention test occurring 1 h, 3 days, or 21 days following initial training. Some cuing treatments were related to a particular training event (conditioned or unconditioned stimulus, experimental context). Compound cuings, composed of two or more training events, were also studied to test a possible additive effect between retrieval cues. To demonstrate time-dependent modifications in the memory trace, a differential effectiveness over time was expected for at least some retrieval cues. The results show that cuing may compensate for performance deficits (1 h or 21 days), but does not further enhance performance when spontaneously improved (3 days). No detectable additive effects of retrieval cues were obtained. However, these results provide information that suggests a time-dependent effectiveness for some cuing treatments. In the last experiment, we investigated this possibility by studying the effects of each of these treatments when delivered after either a 1-h or a 21-day retention interval. The results confirm a time-dependent decrease in effectiveness of a pretest exposure to the CS and an increase in effectiveness over time of a pretest exposure to the experimental context or to a well-ordered sequence of events. Such differential retrievability of a partially learned episode according to both the nature of the cues and the length of the retention interval suggests a time-dependent reorganization of memory attributes.

16.
Fifth-grade students studied a map of a fictitious island while twice listening to a related narrative containing target feature and nonfeature items. The students were cued by varying iconic and verbal stimuli in four map cue conditions; they received immediate and delayed tests to recall text items, map features, and feature locations. The students were also required to rate their confidence in each response. Students remembered more text features and were more confident of their responses when cued by icons plus labels and by icons only. Students in these groups also recalled more map features and their locations on a map reconstruction task. Memory for feature information and pictorial retrieval cues appeared to activate memory for nonfeature information contained in the text.

17.
The impact of retrieval practice on analogical-problem-solving performance was investigated using a complex, educationally relevant task. Participants studied a statistical hypothesis testing scenario and practiced recalling the material or repeatedly studied it. Participants then completed a final test either 5 minutes or 1 week later involving a novel hypothesis-testing scenario that shared an intermediate procedural strategy and superficial and structural similarity with the study scenario but that differed at a specific procedure level. When the final test was given after 5 minutes, no differences in performance were observed across conditions (d = 0.01). Crucially, on the delayed test, retrieval practice produced better performance than did repeated studying (d = 0.81), whereby participants were better at applying learned knowledge to solve a novel problem.

18.
Assessment items are commonly field tested prior to operational use to observe statistical item properties such as difficulty. Item parameter estimates from field testing may be used to assign scores via pre-equating or computer adaptive designs. This study examined differences between item difficulty estimates based on field test and operational data and the relationship of such differences to item position changes and student proficiency estimates. Item position effects were observed for 20 assessments, with items in later positions tending to be more difficult. Moreover, field test estimates of item difficulty were biased slightly upward, which may indicate examinee knowledge of which items were being field tested. Nevertheless, errors in field test item difficulty estimates had negligible impacts on student proficiency estimates for most assessments. Caution is still warranted when using field test statistics for scoring, and testing programs should conduct investigations to determine whether the effects on scoring are inconsequential.

19.
The goal of this study was to investigate whether single executive function (EF) tests were predictive of learning performance in mainly young and middle‐aged adults. The tests measured shifting and updating. Processing speed was also measured. In an observational study, cognitive performance and learning performance were measured objectively in 851 adult students and analyzed using multiple linear regression. EFs and processing speed were measured via cognitive tests. Learning performance was evaluated after 14 months. The results show that updating performance is predictive of learning performance, with a small effect size, while shifting performance is not. This means that a single updating test has predictive value for learning performance acquired over a longer period of time. However, as the effect size is rather small, the test on its own does not serve as a proper selection tool for determining whether a student will be successful or not.

20.
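Literature retrieval is a highly practice-oriented methods course. This paper analyzes the relationship between information literacy education for university students and the literature retrieval course in the information age, along with the current state of both, and offers several suggestions for strengthening literature retrieval courses in colleges and universities.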
