Similar articles
1.
Two experiments were conducted to determine if a relationship exists between test item arrangements and student performance on power tests. The primary hypotheses were: item arrangements based upon item difficulty, similarity of content, or order of class presentation do not influence test score or required testing time. In the first experiment 122 subjects were randomly assigned to three item difficulty arrangements of 139 test items with a 0–100% difficulty range, and in the second experiment 156 subjects were randomly assigned to three item content arrangements of 103 items. Results of analyses of variance with test anxiety used as a classification factor supported the hypotheses.

2.
3.
Although many have rejected classical test construction and analysis procedures for criterion-referenced tests, the present study was concerned with the possibility that classical procedures are both applicable and appropriate when samples of both mastery and nonmastery examinees are employed. A rationale for using these samples was presented, and empirical evidence was gathered which supported the practice of combining samples to increase the variance of test scores and thereby permit the proper estimate of reliability and item validities.

4.
5.
6.
7.
Described are the effects of four sets of instructions on the observed item intercorrelations of current events and subtraction items. The four conditions were: (a) general objective, (b) behavioral objective, (c) behavioral objective plus test item, and (d) behavioral objective plus item-form. Two tests, one in each subject matter, constructed by selecting four items generated from each of the experimental conditions, were administered to 51 seventh grade children. Not found were the expected tendencies toward greater homogeneity among items produced under the three conditions employing behavioral objectives.

8.
9.
One of the major assumptions of item response theory (IRT) models is that performance on a set of items is unidimensional, that is, the probability of successful performance by examinees on a set of items can be modeled by a mathematical model that has only one ability parameter. In practice, this strong assumption is likely to be violated. An important pragmatic question to consider is: What are the consequences of these violations? In this research, evidence is provided of violations of unidimensionality on the verbal scale of the GRE Aptitude Test, and the impact of these violations on IRT equating is examined. Previous factor analytic research on the GRE Aptitude Test suggested that two verbal dimensions, discrete verbal (analogies, antonyms, and sentence completions) and reading comprehension, existed. Consequently, the present research involved two separate calibrations (homogeneous) of discrete verbal items and reading comprehension items as well as a single calibration (heterogeneous) of all verbal item types. Thus, each verbal item was calibrated twice and each examinee obtained three ability estimates: reading comprehension, discrete verbal, and all verbal. The comparability of ability estimates based on homogeneous calibrations (reading comprehension or discrete verbal) to each other and to the all-verbal ability estimates was examined. The effects of homogeneity of item calibration pool on estimates of item discrimination were also examined. Then the comparability of IRT equatings based on homogeneous and heterogeneous calibrations was assessed. The effects of calibration homogeneity on ability parameter estimates and discrimination parameter estimates are consistent with the existence of two highly correlated verbal dimensions. IRT equating results indicate that although violations of unidimensionality may have an impact on equating, the effect may not be substantial.
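As an illustration of what "only one ability parameter" means, a two-parameter logistic form is one common unidimensional IRT model. The abstract does not state which model the study calibrated, so the form and the symbols below (ability θ_j, discrimination a_i, difficulty b_i) are assumptions chosen for illustration only.

```latex
% Illustrative unidimensional model (assumed, not taken from the abstract):
% probability that examinee j answers item i correctly.
P(u_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\bigl(-a_i(\theta_j - b_i)\bigr)}
```

Under such a model each examinee is described by the single scalar θ_j; a violation of unidimensionality would mean that more than one such parameter per examinee is needed to account for the responses.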

10.
This study examines the relationship between race and performance on two nationally standardized reading tests. The appropriate reading tests of the Iowa Test of Basic Skills and Metropolitan Achievement Battery were administered to all fourth- and sixth-grade students in all elementary schools of an urban school district near New York City. Although white pupils earned higher scores than nonwhite pupils on both tests, the Metropolitan produced significantly greater differences between the races than the Iowa, at both grade levels. Factorial analysis of variance confirmed the statistical significance of these differences. Implications of Race × Test (suggesting S.E.S. × Test) interaction effects for program evaluation and instruction are briefly discussed.

11.
This study investigated the influence of age, sex, and socioeconomic status on the perception of participants in adult education that their participation is useful. Two forms of utility were postulated: instrumental and expressive. An instrument containing scales of perceived utility, needs, goals, time orientation, and enjoyment was administered to selected classes at various educational institutions in the Chicago metropolitan area and, for comparison, a class in Florida. The results permitted inferences that needs, goals, and time orientation partially determine perception that participation is instrumentally useful and that age, status, and femaleness tend to favor perception of expressive utility. The findings supported previous research indicating that adult educational participation is complex behavior involving more than subject matter interests and motivational orientations and opened a new line of attack on the problem.

12.
13.
The standardization method for assessing unexpected differential item performance or differential item functioning (DIF) is introduced. The principal findings of the first five studies that have used this approach on the Scholastic Aptitude Test are presented.
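The abstract gives no formula, so the following is only a minimal sketch of one common formulation of a standardized p-difference: within each matched total-score level, the proportion correct in the focal group is compared with that in the base (reference) group, and the differences are averaged with weights equal to the focal-group counts. The function name `std_p_dif`, its parameters, and the choice of weights are assumptions for illustration, not details taken from the studies summarized above.

```python
from typing import Sequence


def std_p_dif(
    focal_n: Sequence[int],
    focal_correct: Sequence[int],
    base_n: Sequence[int],
    base_correct: Sequence[int],
) -> float:
    """Weighted mean of (focal - base) proportion-correct differences.

    Each position in the sequences is one matched total-score level.
    Weights are the focal-group counts (an assumed, commonly used choice).
    """
    numerator = 0.0
    total_weight = 0.0
    for nf, cf, nb, cb in zip(focal_n, focal_correct, base_n, base_correct):
        # Skip score levels where either group has no examinees,
        # since no proportion can be computed there.
        if nf == 0 or nb == 0:
            continue
        p_focal = cf / nf
        p_base = cb / nb
        numerator += nf * (p_focal - p_base)
        total_weight += nf
    if total_weight == 0:
        raise ValueError("no score level has examinees in both groups")
    return numerator / total_weight
```

A value near zero suggests the item behaves similarly for matched examinees in both groups; a negative value indicates the focal group answers correctly less often than matched base-group examinees.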

14.
The Raven Progressive Matrices (RPM) were administered to 408 individuals in 100 family groups. Subjects’ ages ranged from 8 to 60. Scores on all five subtests were highest in the 18‐26 age group, decreasing with age. Males scored higher on each subtest in each age group. Performance on the RPM increased with additional years of education. Within each educational level, performance declined with age. Although decline with age appears to be invariant with education, changes in schools and educational methods may be factors operating in addition to aging.

15.
16.
17.
18.
19.
20.
Noting the wide differences in verbal abilities of middle and lower class children, the investigators proposed that two groups of children, one from the lower class, one from the middle class, who achieve comparable total scores on a group intelligence test, would get their scores by successfully completing different sets of items. In the first study children were placed in social classes based on their fathers' occupations, following guidelines from the Warner scale. Middle class children were matched with lower class children on total Otis scores. No item-social class interaction was found. The study was repeated using the occupational categories of the Dictionary of Occupational Titles as a guide to social class standing. Again no item-social class interaction appeared. If two social class groups are equated on total intelligence scores, one social class sample appears to succeed on essentially the same test items as does the other social class sample. A given score on an intelligence test appears to represent the same skills for one social class as it does for another social class.
