1.

THE EFFECT OF DIFFERENTIAL WEIGHTING OF INDIVIDUAL ITEM RESPONSES ON THE PREDICTIVE VALIDITY AND RELIABILITY OF AN APTITUDE TEST

DARRELL L. SABERS GORDON W. WHITE 《Journal of Educational Measurement》1969,6(2):93-96

An empirical investigation of the effect of choice weight scoring on predictive validity and reliability. Choice weight scoring refers to the procedure whereby different weights may be assigned to all the options of an item. Four groups of subjects were included in the experiment. Weights derived from each group were used to score tests for another group in order to assess the cross-validity of the weighted scoring. In no case did the increments in reliability and validity due to the weighted scoring exceed .03.

2.

THE USE OF "NONE-OF-THESE" VERSUS HOMOGENEOUS ALTERNATIVES ON MULTIPLE-CHOICE TESTS: EXPERIMENTAL RELIABILITY AND VALIDITY COMPARISONS

Malcom L. Williamson Kenneth D. Hopkins 《Journal of Educational Measurement》1967,4(2):53-58

3.

ON THE VALIDITY OF ESSAY TESTS OF ACHIEVEMENT

William E. Coffman 《Journal of Educational Measurement》1966,3(2):151-156

4.

INCREMENTAL RELIABILITY AND VALIDITY OF MULTIPLE-CHOICE TESTS WITH AN ANSWER-UNTIL-CORRECT PROCEDURE

GERALD S. HANNA 《Journal of Educational Measurement》1975,12(3):175-178

5.

EFFECTS OF EMPIRICAL OPTION WEIGHTING ON RELIABILITY AND VALIDITY OF AN ACADEMIC APTITUDE TEST

RICHARD R. REILLY REX JACKSON 《Journal of Educational Measurement》1973,10(3):185-193

Item options of shortened forms of the GRE Verbal and Quantitative tests were empirically weighted by two variants of a method originally attributed to Guttman (1941). Tests scored with the empirical weights were more reliable than formula-scored tests but less valid when correlated with undergraduate GPA. A factor analysis revealed large increases in variance accounted for by the first factor. It was suggested that the weighting procedures used tended to capitalize on omitting behavior which, although a highly reliable tendency, may be invalid.

6.

IV. RELIABILITY AND VALIDITY OF THE CDI INVENTORIES

《Monographs of the Society for Research in Child Development》1994,59(5):25-31

7.

THE EFFECT OF SELECTED POOR ITEM-WRITING PRACTICES ON TEST DIFFICULTY, RELIABILITY AND VALIDITY

CYNTHIA BOARD DOUGLAS R. WHITNEY 《Journal of Educational Measurement》1972,9(3):225-233

Violations of four selected principles of writing multiple choice items were introduced into an undergraduate political science examination. Three of the four poor practices had no overall effect on test difficulty. A significant (α = .05) interaction effect between the poor practices and course achievement occurred for one of the four practices, with the poorer students generally gaining most from the poorly written items. KR 20 values were significantly lower for sets of items with the same flaws than for "good" versions of the items in three of four comparisons. The reductions in reliability were equivalent to those expected to result from shortening the test by 13 to 56 percent. Concurrent validity (correlation of experimental test scores with final examination scores) was significantly lower in two of four cases. The reductions in validity were equivalent to those expected to result from shortening the test by 56 to 83 percent.

8.

IV. EVALUATING THE BREADTH AND DEPTH OF TRAINING EFFECTS WHEN CENTRAL CONCEPTUAL STRUCTURES ARE TAUGHT

《Monographs of the Society for Research in Child Development》1996,61(1-2):83-102

9.

THE EFFECT OF DIFFERENTIAL OPTION WEIGHTING ON MULTIPLE-CHOICE OBJECTIVE TESTS

GERRY F. HENDRICKSON 《Journal of Educational Measurement》1971,8(4):291-296

The purpose of this study was to determine in what way Guttman weighting affected the internal consistency and intercorrelation of the subtests of the Scholastic Aptitude Test. The tests were first scored with Guttman weights and then with conventional correction-for-guessing weights. The internal consistency of the tests increased markedly when Guttman weights were used. The correlation of the two verbal subtests increased somewhat when Guttman weights were used, but the correlation of the two mathematics subtests as well as the intercorrelation of all verbal and mathematics subtests decreased. Differences in the factor structure of the Guttman- and conventionally-weighted subtests were used to explain the result.

10.

A STUDY OF RELIABILITY AND VALIDITY EFFECTS OF TOTAL AND PARTIAL IMMEDIATE FEEDBACK IN MULTIPLE-CHOICE TESTING

GERALD S. HANNA 《Journal of Educational Measurement》1977,14(1):1-7

11.

MULTIPLE PROCESSING STRATEGIES AND THE CONSTRUCT VALIDITY OF VERBAL REASONING TESTS

SUSAN EMBRETSON LISA M. SCHNEIDER DAVID L. ROTH 《Journal of Educational Measurement》1986,23(1):13-32

This study examines the influence of processing strategies, and the associated metacomponents that determine when to apply them, on the construct validity of a verbal reasoning test. Three strategies for solving verbal analogy items were examined: a rule-oriented strategy, an association strategy, and a partial rule strategy. Construct validity was studied in two separate stages: construct representation and nomothetic span. For construct representation, evidence was obtained that all three strategies, and their related metacomponents, are associated with performance on analogy items. For nomothetic span, the current study found that all three strategies contribute to individual differences in verbal reasoning and to the predictive validity of the test. The results of this study also point to the utility of metacomponents as constructs for describing and understanding test performance. Implications of the results for test development and theories of aptitude are elaborated.

12.

ESTIMATING THE RELIABILITY, VALIDITY, AND INVALIDITY OF ESSAY RATINGS

H. BLOK 《Journal of Educational Measurement》1985,22(1):41-52

In an essay rating study multiple ratings may be obtained by having different raters judge essays or by having the same rater(s) repeat the judging of essays. An important question in the analysis of essay ratings is whether multiple ratings, however obtained, may be assumed to represent the same true scores. When different raters judge the same essays only once, it is impossible to answer this question. In this study 16 raters judged 105 essays on two occasions; hence, it was possible to test assumptions about true scores within the framework of linear structural equation models. It emerged that the ratings of a given rater on the two occasions represented the same true scores. However, the ratings of different raters did not represent the same true scores. The estimated intercorrelations of the true scores of different raters ranged from .415 to .910. Parameters of the best fitting model were used to compute coefficients of reliability, validity, and invalidity. The implications of these coefficients are discussed.

13.

AN EMPIRICAL STUDY OF THE EFFECT OF THE CORRECTION FOR CHANCE SUCCESS ON THE RELIABILITY AND VALIDITY OF AN APTITUDE TEST

Darrell L. Sabers Leonard S. Feldt 《Journal of Educational Measurement》1968,5(3):251-258

14.

THE LONG AND SHORT TERM PREDICTIVE EFFICIENCY OF TWO TESTS OF READING POTENTIAL

J. Miles P.J. Foreman J. Anderson 《International Journal of Disability, Development & Education》1973,20(3):131-141

A comparison was made of the predictive efficiency of each of two tests in the diagnosis of reading failure over a period of one to four years. A direct test of reading potential in the form of a word recognition test was shown generally to be more efficient than an indirect test based on neurophysiological indicants. The finding that self concept measures were not consistently related to reading performance was interpreted in terms of the biassing effect of a particular response style.

15.

THE INFLUENCE OF DIFFERENT STYLES OF TEXTBOOK USE ON INSTRUCTIONAL VALIDITY OF STANDARDIZED TESTS

DONALD J. FREEMAN GABRIELLA M. BELLI ANDREW C. PORTER ROBERT E. FLODEN WILLIAM H. SCHMIDT JOHN R. SCHWILLE 《Journal of Educational Measurement》1983,20(3):259-270

16.

RELIABILITY AND STRUCTURAL VALIDITY OF THE TEACHER RATING SCALES OF EARLY ACADEMIC COMPETENCE

Erin E. Reid James C. Diperna Kristen Missall Robert J. Volpe 《Psychology in the schools》2014,51(6):535-553

Currently, there are few strengths‐based preschool rating scales that sample a wide array of behaviors believed to be essential for early academic success. The purpose of this study was to assess the factor structure of a new measure of early academic competence for at‐risk preschool populations. The Teacher Rating Scales of Early Academic Competence (TRS‐EAC) includes two broad scales (Early Academic Skills and Early Academic Enablers) and was completed by 60 teachers for 440 children enrolled in Head Start and public preschool classrooms. Evidence from two exploratory factor analyses supported a five‐factor solution for the Early Academic Skills Scale (Creative Thinking, Critical Thinking Skills, Numeracy, Early Literacy, and Comprehension) and a five‐factor solution for the Early Academic Enablers Scale (Approaches to Learning, Social and Emotional Competence, Fine Motor Skills, Gross Motor Skills, and Communication). TRS‐EAC scores also demonstrated good to excellent reliability and were related to children's performance on direct measures of early academic skills.

17.

AN EMPIRICAL COMPARISON OF THE EFFECTS OF RECALL AND MULTIPLE-CHOICE TESTS ON STUDENT ACHIEVEMENT

Gilbert Sax LeVerne S. Collet 《Journal of Educational Measurement》1968,5(2):169-173

18.

A COMPARISON OF THE RELIABILITY AND VALIDITY OF TWO METHODS FOR ASSESSING PARTIAL KNOWLEDGE ON A MULTIPLE-CHOICE TEST

RONALD K. HAMBLETON DENNIS M. ROBERTS ROSS E. TRAUB 《Journal of Educational Measurement》1970,7(2):75-82

Differential weighting of response alternatives and confidence testing have been proposed as ways to assess partial knowledge on multiple-choice tests. A total of 211 students in an educational measurement course took their midterm examination under one of three procedures. Results from those students administered the test under conventional directions provided a baseline for comparing, in terms of reliability and validity, the results from students who took the test under the differential weighting of response alternatives or the confidence testing instructions. Reliability was estimated by the split-half technique. Validity was estimated by correlating midterm test scores with scores on a final examination. This investigation provides some support for the contention that validity can be improved using more sophisticated testing techniques. Suggestions for the conduct of more definitive studies were offered.

19.

THE COMPARATIVE EFFECTS OF MULTIPLE-CHOICE VERSUS SHORT-ANSWER TESTS ON RETENTION

LORRAINE R. GAY 《Journal of Educational Measurement》1980,17(1):45-50

20.

THE VALIDITY AND RELIABILITY OF ORAL EXAMINATIONS IN ASSESSING COGNITIVE SKILLS IN MEDICINE

HAROLD G. LEVINE CHRISTINE H. McGUIRE 《Journal of Educational Measurement》1970,7(2):63-74

To assess aspects of clinical competence not adequately assessed by other means, the Center for the Study of Medical Education, University of Illinois College of Medicine, together with the American Board of Orthopaedic Surgery, developed oral examinations in formats specifically designed to yield information on high level cognitive functioning. The examinations were administered to 784 candidates for certification in January 1968. Reliability of the oral problem-solving component score pooled from four examiners was approximately .50. Assessment of content, construct, and concurrent validity made by questionnaire and factor analytic studies indicated that the oral tests identified factors not measured by multiple-choice tests and, therefore, significantly improved the relationship between supervisory evaluations and test scores.