Similar articles
1.
Two experiments were conducted to determine if a relationship exists between test item arrangements and student performance on power tests. The primary hypotheses were: item arrangements based upon item difficulty, similarity of content, or order of class presentation do not influence test score or required testing time. In the first experiment 122 subjects were randomly assigned to three item difficulty arrangements of 139 test items with a 0–100% difficulty range, and in the second experiment 156 subjects were randomly assigned to three item content arrangements of 103 items. Results of analyses of variance with test anxiety used as a classification factor supported the hypotheses.

2.
Ten rhesus monkeys were trained on five tasks, each of which consisted of eight concurrently presented object discrimination problems. Sequences of presentation were devised to allow one, two, or three new tasks to intervene between acquisition and retention tests or to provide a 30-day period of no testing. Equivalent and proficient performances were obtained in all retention tests, and no relationship was observed between retention and the initial preference characteristics of various objects. Object preferences did produce significant influences upon acquisition, but these effects were not as pronounced in early tasks as in later ones. An additional retention test provided support for the contention that monkeys do not necessarily process information about specific object pair discriminations. Rather, they appeared to retain a list of previously rewarded objects even when object pairings were different from those provided during acquisition. Concurrent discriminations involving many distinct objects were resistant to interference and independent of preference characteristics over long retention periods.

3.
Applied Measurement in Education, 2013, 26(4): 393-406
Two models are presented in this article for estimating the proportion of students who would pass all of three or more content area tests given that none have actually been tested in more than two of the content areas. The first model allows one to estimate the proportion of students who would pass all of three or more content area tests from the test results of a study in which no student took more than two of the tests; the second model (which requires an outside estimate of the correlations between the different content area tests) allows one to estimate the proportion of students who would pass all of three or more content area tests from the test results of a study (or field test results) in which students took only one content area test. The models were tested on the Texas End-of-Course test battery (which consists of four content area tests) results of students who took all four content area tests prior to or in the spring of 2001, with at least one of the end-of-course content area tests taken in the spring of 2001. The model test results may have particular application to state assessment programs that must perform standard setting on high-stakes exams before the first live administration of the exams.

4.
Accountability in higher education has increased, with more institutions requiring standardized tests. These tests are high stakes for institutions but low stakes for students, who seldom experience consequences for their performance. This study describes how one psychology department improved students' scores on the Psychology Area Concentration Achievement Test. Results were compared between three motivation conditions: no incentive, a monetary incentive, and a motivational Microsoft PowerPoint presentation. The presentation gave students information about the assessment, encouraged them to do well, and informed them that faculty would discuss scores while evaluating the psychology program. Results showed that test scores were significantly higher and correlated significantly with grade point average for students exposed to the motivational presentation. The motivational PowerPoint presentation seemed to have reduced the number of underachieving students and provided more accurate assessment data, with minimal investment in time and effort on the part of faculty.

5.
Using a sample of 908 eleventh grade science stream male and female students from similar socioeconomic area schools, variance based psychometric properties of three paper-and-pencil tests of logical thinking (Longeot test, Lawson's test TOFR, and Tobin and Capie's test TOLT) are investigated. A sub-sample of 212 students took the three tests in randomly allocated different sequential orders of presentation, while 696 students took only two tests. Alpha coefficients for each test separately and for the three tests combined, concurrent validity coefficients, measures of item difficulty, item discrimination, item-criterion correlation, and 30-day stability coefficients are calculated. Considering the relative homogeneity of the sample, the reliability coefficients of the tests are judged satisfactory, but concurrent validity coefficients are quite low, which implies incongruency in decisions made on the basis of the three tests. The need for estimating various psychometric parameters of alternative tests of logical thinking over different grade populations is emphasized.
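The alpha coefficient mentioned in this abstract is commonly computed as Cronbach's alpha, k/(k-1) × (1 − Σ item variances / variance of total scores). The abstract does not state which formula or software the authors used, so the following is only a minimal sketch under that assumption; the function name `cronbach_alpha` and the small score matrix are hypothetical and not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (hypothetical helper).

    scores: list of rows, one row per examinee, one score per item.
    Uses sample variances throughout; the ratio is the same as long as
    item and total variances use the same estimator.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two examinees")
    # Variance of each item (column) across examinees.
    item_vars = [variance(col) for col in zip(*scores)]
    # Variance of each examinee's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Illustrative data only: four examinees, three dichotomously scored items.
example_scores = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
]
print(round(cronbach_alpha(example_scores), 3))
```

Alpha depends on the spread of total scores, so a relatively homogeneous sample such as the one described here tends to yield lower values than a more heterogeneous one.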

6.
In order to assess the abilities of two California sea lions to generalize an identity concept, both animals were taught a two-choice, visual matching-to-sample task. We hypothesized that initial identity-matching problems would be learned as conditional (if...then) discriminations but that an identity concept would emerge after training numerous exemplars of identity matching. After training with 15 two-stimulus identity matching-to-sample problems, transfer tests consisting of 15 novel problems were given to the animals. Pass-fail criteria were defined in terms of performance on Trial 1 of each test problem, performance on test trials compared with baseline trials, and performance on four-trial problem blocks. One sea lion passed on the second transfer test and the other passed on the third; both demonstrated successful generalization of an identity concept by all criteria used. A second experiment consisted of presentation of stimuli previously learned in a different context (arbitrary matching-to-sample). Both subjects immediately applied an identity concept to accurately solve these new problems. These tests conclusively demonstrate transfer of an identity matching rule in California sea lions.

7.
This paper reports the results of cloze tests in the reading and listening modes together with a computer analysis of responses to the tests. The subjects were groups of Scottish school children at the ages of 8–9, 11–12 and 13–14 years sampled over the whole country as part of a national survey of English language; the cloze tests were only a small part of the whole testing programme which also contained three other major reading tests. Approximately 400 subjects took cloze tests in each mode at each age. The test material was the same throughout for all stages tested. Two tests, each containing one narrative and one expository text, were used. The mode of presentation did not significantly affect the types of cloze responses offered nor the total scores of the tests at any stage. However, results indicated better performance for older subjects when they read, and for the youngest group when they listened to, expository though not narrative passages. The comparisons of the results for the three different school stages showed continuing interdependence of reading and listening ability through the ages tested. The different cloze response patterns for the two types of text (in either mode) as well as the only moderate correlation between the texts, indicated that success in comprehending narratives may not necessarily transfer to comprehending information.

8.
Research on expertise suggests that a critical aspect of expert understanding is knowledge of the relations between domain principles and problem features. We investigated two instructional pathways hypothesized to facilitate students’ learning of these relations when studying worked examples. The first path is through self-explaining how worked examples instantiate domain principles and the second is through analogical comparison of worked examples. We compared both of these pathways to a third instructional path where students read worked examples and solved practice problems. Students in an introductory physics class were randomly assigned to one of three worked example conditions (reading, self-explanation, or analogy) when learning about rotational kinematics and then completed a set of problem solving and conceptual tests that measured near, intermediate, and far transfer. Students in the reading and self-explanation groups performed better than the analogy group on near transfer problems solved during the learning activities. However, this problem solving advantage was short lived as all three groups performed similarly on two intermediate transfer problems given at test. On the far transfer test, the self-explanation and analogy groups performed better than the reading group. These results are consistent with the idea that self-explanation and analogical comparison can facilitate conceptual learning without decrements to problem solving skills relative to a more traditional type of instruction in a classroom setting.

9.
Taking a test on a passage one has just studied is known to enhance later retention of the passage contents. This study examined the effects of three types of initial test on later retention: a short-answer test, a multiple-choice test, and a full free-recall test. Questions on the first two of these tests covered only half of the passage contents. Later retention was compared for both initially tested content and untested content with that of a control group not initially tested on the passage at all. The subjects were 57 secondary school students who studied a brief history text before taking one of the initial tests. All were given retention tests 2 weeks later. The classical testing effect (enhanced retention due to initial testing) was shown to be influenced by the type of initial test used. Thus, a testing effect was evident in the case of the initial short-answer test, but not in the case of either of the other two tests. A depth-of-processing view is advanced in interpreting this finding. The testing effect was found not to generalize to untested content and in one condition (the initial multiple-choice test), retention of untested content was depressed.

10.
This study investigated two procedures for estimating the population standard deviation of nonnormed tests. Two normed tests, both of whose population standard deviations were known, were administered to 272 students in grades 3–6. One of the normed tests was treated as a criterion-referenced test; the two variance estimation procedures were applied to the scores from this test. Substantial differences were found between both estimated statistics and the actual standard deviation. The first procedure systematically overestimated the standard deviation, whereas the second systematically underestimated it. These results are discussed in terms of using such procedures for program evaluation.

11.
Hypothesized cognitive strengths and weaknesses of three dyslexic subgroups (Boder and Jarrico 1982) were examined in two reading related experiments. The first experiment tested the prediction that auditorily presented letter sets should be processed better by dyseidetic than by dysphonetic readers. The prediction was not confirmed. The results did not show any modality of presentation-specific recall differences between the three dyslexic subgroups. Overall, dyslexic children’s scores were significantly lower than those of age-matched control groups. The second experiment tested predictions of differential performance of dyseidetic and dysphonetic readers in a task in which the name identity of letters in pairs had to be indicated. Predicted patterns were not confirmed. Compared to the control groups all three dyslexic subgroups (whose means did not differ significantly) made significantly more errors in the condition in which it was essential to activate phonetic representations of the letters. The experimental results of this study suggest a greater similarity in the nature of letter processing problems in dyslexic children than is assumed in Boder and Jarrico’s (1982) subtyping test. This research was supported by grant 634301 from the Department of Special Education, State University of Groningen, and a travel grant from the Netherlands Organization for the Advancement of Pure Research (Z.W.O.). Based on a presentation at the 32nd Annual Conference of The Orton Dyslexia Society, Baltimore, Maryland, November 1982.

12.
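This is the first study to examine the structural similarity between the original Stanford Achievement Test reading test (10th edition) and its customized versions. Analyses were carried out across grades on several kinds of observed variables (individual items, testlets, and item parcels). The methods consisted mainly of linear and nonlinear exploratory and confirmatory factor analysis. The results showed varying degrees of testlet effects among the items within every passage. Of all the models, those using individual items as observed variables fit worst, those using testlets fit next best, and those using item parcels fit best. Of the three levels of structural equivalence (congeneric, tau-equivalent, and parallel), the original Stanford reading test and its customized versions were structurally congeneric.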

13.
Applied Measurement in Education, 2013, 26(1): 21-36
The purpose of this study was to describe the nature and quality of the chapter-end tests that accompany textbook series used in the elementary and middle school grades. In the first phase of the study, three to five tests from each of Grades 3, 5, and 7 and from each of five social studies series were evaluated. In the second phase, three to four tests from each of Grades 2, 4, and 6 and from each of five science series were evaluated. For the typical test in both subject areas, there were 21 items, mostly multiple choice and/or matching. About half of the chapter objectives were measured by its items, and about half of its items matched one of the chapter objectives. In addition, about two thirds of the items in a test were phrase matches with the textbook sentences or phrases, and about 90% of the items were classified at the knowledge level. We recommend that teachers not use these tests intact for classroom assessment.

14.
The psychometric test results of a sample of 100 LD students with severe achievement problems were cluster analyzed. The variables included in this analysis were the subtests of the WISC-R, the Bender Gestalt, the Benton Visual Retention, the Purdue Perceptual-Motor, and the Lindamood Auditory Conceptualization tests. Using K-means iterative clustering procedures, three clusters were obtained. The first cluster was defined by low scores on attention and concentration subtests; the second was defined by low scores on subtests of verbal-associative intelligence; the third was defined by low scores on visual-spatial and motoric subtests. Limitations of the study, in the scope of the psychometric testing and the lack of pediatric and neurologic diagnoses, are discussed.

15.
In three experiments, we assessed the role of signals for changes in the consequences of cues as a potential account of the renewal effect. Experiment 1 showed recovery of responding following extinction when acquisition, extinction, and test phases occurred in different contexts. In addition, extinction treatment in multiple contexts attenuated context-induced response recovery. In Experiment 2, we used presentations of an extraneous stimulus (ES), instead of context shifts, and found that responding recovered from extinction only when the ES was presented both between acquisition and extinction and between extinction and test. In Experiment 3, we used a reversal learning design in which, during training, two cues were first paired with different outcomes, then paired with the alternative outcomes, and finally paired again with the original outcomes. In this experiment, presentation, just prior to testing, of an ES that had previously been presented between the different phases produced an expectation of reversal in the meaning of the cues.

16.
This study was conducted to determine which skills and concepts students have that are prerequisites for solving moles problems through the use of analog tasks. Two analogous tests with four forms of each were prepared that corresponded to a conventional moles test. The analogs used were oranges and granules of sugar. Slight variations between test items on various forms permitted comparisons that would indicate specific conceptual and mathematical difficulties that students might have in solving moles problems. Different forms of the two tests were randomly assigned to 332 high school chemistry students of five teachers in four schools in central Indiana. Comparisons of total test score, subtest scores, and the number of students answering an item correctly using appropriate t-test and chi square tests resulted in the following conclusions: (1) the size of the object makes no difference in the problem difficulty; (2) students understand the concepts of mass, volume, and particles equally well; (3) problems requiring two steps are harder than those requiring one step; (4) problems involving scientific notation are more difficult than those that do not; (5) problems involving the multiplication concept are easier than those involving the division concept; (6) problems involving the collective word “bag” are easier to solve than those using the word “billion”; (7) the use of the word “a(n)” makes the problem more difficult than using the number “1”.

17.
Test-based accountability often produces score inflation. Most studies have evaluated inflation by comparing trends on a high-stakes test and a lower stakes audit test. However, Koretz and Beguin (2010) noted weaknesses of audit tests and suggested self-monitoring assessments (SMAs), which incorporate audit items into high-stakes tests. This article reports the first three trials of SMAs, evaluating whether SMAs can detect inflation that had already been documented. The studies were conducted with mathematics tests in three grades. Despite severe conservative biases, the audit component functioned as expected in many of the trials. The difference in performance between nonaudit and audit items was associated with factors that earlier research showed to be related to test preparation and score inflation, such as scoring just below the Proficient cut in the previous year and school poverty. However, a number of null findings underscore the need for additional research into the design of audit items.

18.
19.
This study tested the hypothesis that teaching concepts in high school economics first in a familiar mode or symbol system and then elaborating on them in a second or less familiar mode facilitates classroom learning. In an experimental design, 83 high school seniors were individually assigned at random to three classes, which in turn were randomly assigned to three different classroom instructional treatments, each having a duration of 10 hours and taught by the students' regular economics teachers. It was predicted and found that comprehension of economics is facilitated by a teaching strategy that initially presents the concepts in a familiar verbal mode and then presents them in a more abstract mode using graphs or other instructive imagery. This strategy compared favorably with two alternative procedures, one presenting the same content first in graphs and then verbally (p<0.001), and the other using only one mode of presentation (p<0.01). These results imply that the type and order of presentation of symbol systems influence the learning of concepts in high school economics classes by facilitating or interfering with the generation of relations between prior knowledge and new information. The results imply that presenting economics concepts in two symbol systems, rather than one, facilitates learning, provided, contrary to customary teaching methods, that the teacher uses the familiar verbal presentation first and follows it with an integrative but less familiar graphic presentation.

20.
Learners are usually provided with support devices because they find it difficult to learn from multimedia presentations. A key question, with no clear answer so far, is how best to present these support devices. One possibility is to insert them into the multimedia presentation (canned support), while another is to have a human agent provide them (human tutoring). Human tutoring offers potential advantages: it uses spoken modality, displays non-verbal cues and implies social interaction. However, there is mixed evidence regarding these supposed advantages, and prior research comparing human and computer support presents problems. Our goal was to explore whether the advantages of human tutoring actually exist while avoiding the problems of prior research. In one experiment, participants learned Geology from a multimedia presentation including one of three forms of support: human tutoring, canned support or no support. After viewing the presentation, participants solved retention and transfer tests. Results revealed that participants in the human tutoring condition outperformed those in the other two conditions, who did not differ from each other. This means that human tutoring is advantageous, a fact that has implications for the design of support devices in multimedia learning.
