Similar literature
 20 similar documents found (search time: 31 ms)
1.
Abstract

The authors lament the fact that there does not seem to be much agreement as to the proper method of scoring tests. The use of the scoring formula is advocated by some and criticized by others. Literature is reviewed showing that the basic assumptions behind the scoring formula (namely that all wrong answers are due to chance guessing) are false. Arguments are presented for and against the continued use of the formula, with the conclusion that its use cannot be justified. A new aspect of this question, that use of the formula may create behavior patterns detrimental to ingenuity and creativity, is also presented.

2.
Formula scoring is a procedure designed to reduce multiple-choice test score irregularities due to guessing. Typically, a formula score is obtained by subtracting a proportion of the number of wrong responses from the number correct. Examinees are instructed to omit items when their answers would be sheer guesses among all choices but otherwise to guess when unsure of an answer. Thus, formula scoring is not intended to discourage guessing when an examinee can rule out one or more of the options within a multiple-choice item. Examinees who, contrary to the instructions, do guess blindly among all choices are not penalized by formula scoring on the average; depending on luck, they may obtain better or worse scores than if they had refrained from this guessing. In contrast, examinees with partial information who refrain from answering tend to obtain lower formula scores than if they had guessed among the remaining choices. (Examinees with misinformation may be exceptions.) Formula scoring is viewed as inappropriate for most classroom testing but may be desirable for speeded tests and for difficult tests with low passing scores. Formula scores do not approximate scores from comparable fill-in-the-blank tests, nor can formula scoring preclude unrealistically high scores for examinees who are very lucky.
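
The arithmetic behind these claims can be sketched in a few lines. The abstract does not state which proportion is subtracted; the sketch below assumes the classical correction-for-guessing form, right minus wrong divided by (number of choices minus 1). The function names are illustrative, not taken from the source.

```python
from fractions import Fraction


def formula_score(num_right, num_wrong, num_choices):
    """Formula score R - W / (k - 1); omitted items add nothing.

    Assumes the classical correction-for-guessing penalty of 1 / (k - 1)
    per wrong answer, which the abstract does not spell out.
    """
    if num_choices < 2:
        raise ValueError("an item needs at least two choices")
    return Fraction(num_right) - Fraction(num_wrong, num_choices - 1)


def expected_gain_from_guessing(num_remaining, num_choices):
    """Expected score change from guessing at random among the options
    the examinee has not ruled out (exactly one of them is correct)."""
    p_right = Fraction(1, num_remaining)
    penalty = Fraction(1, num_choices - 1)
    return p_right - (1 - p_right) * penalty


# Blind guessing on a five-choice item is neutral on average.
print(expected_gain_from_guessing(5, 5))
# Ruling out two options first makes guessing pay off on average.
print(expected_gain_from_guessing(3, 5))
```

Under this assumption, an examinee with partial information who omits the item gives up a positive expected gain, which is the abstract's point about refraining from answering.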

3.
High school seniors (84 males, 77 females) were randomly assigned to one of two treatment groups. One group received a programmed text designed to teach Ss to answer every item on an examination, whether or not the directions included a penalty for incorrect answers. The other group was administered a programmed text to teach certain selected aspects of test-wiseness. Each group served as the control group for the other. The following day all Ss were administered a measure of willingness to guess and a measure of test-wiseness. Two weeks later, all Ss received additional measures of willingness to guess and test-wiseness. Analysis of the data indicated the group that received the guessing program answered significantly more items than its control group (on both the immediate and delayed tests), even though there was a penalty for incorrect answers. In similar fashion, the group exposed to the test-wiseness program achieved significantly higher mean test-wiseness scores than its control group.

4.
In this paper we study the scaling behavior of nucleotide clusters in the 11 chromosomes of the Encephalitozoon cuniculi genome. The statistical distribution of nucleotide clusters for the 11 chromosomes is characterized by the scaling behavior $P(S) \propto e^{-\alpha S}$, where $S$ represents nucleotide cluster size. The cluster-size distribution $P(S_1+S_2)$, where $S_1+S_2$ is the total size of a sequential C-G cluster and A-T cluster, was also studied; $P(S_1+S_2)$ follows exponential decay. No case exists of a large C-G cluster following a large A-T cluster, or of a large A-T cluster following a large C-G cluster. We also discuss the relatively random walk length function $L(n)$ and the local compositional complexity of nucleotide sequences based on a new model. These investigations may provide some insight into the nucleotide clusters of DNA sequences.
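
As a rough illustration of how a cluster-size distribution might be tabulated, the sketch below counts maximal runs of consecutive bases from the same class. The abstract does not define a cluster, so treating it as a run of C/G or of A/T bases is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from itertools import groupby


def cluster_size_counts(sequence):
    """Count maximal runs of C/G bases and of A/T bases by run length.

    Assumption: a "cluster" is a maximal run of consecutive bases from
    the same class. Symbols other than A, C, G, T raise KeyError.
    """
    base_class = {"A": "AT", "T": "AT", "C": "CG", "G": "CG"}
    counts = {"AT": Counter(), "CG": Counter()}
    labels = (base_class[base] for base in sequence.upper())
    for label, run in groupby(labels):
        counts[label][sum(1 for _ in run)] += 1
    return counts


# Runs here are AAT (A-T, size 3), CGGG (C-G, 4), AT (A-T, 2), C (C-G, 1).
print(cluster_size_counts("AATCGGGATC"))
```

Dividing each count by the total number of clusters in its class gives an empirical $P(S)$, whose logarithm could then be plotted against $S$ to check for the exponential decay the abstract reports.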

5.
The indices of item difficulty and discrimination, the coefficients of effective length, and the average item information for both single and multiple-answer items using six different scoring formulas were computed and compared. These formulas vary in terms of the assignment of partial credit and the correction for guessing. Results show that items with multiple answers are substantially more discriminating and reliable when partial credit is given. The formulas without correction for guessing seem to perform at least as well as the formulas with correction.

6.
This simulation study assesses the statistical performance of two mathematically equivalent parameterizations for multitrait–multimethod data with interchangeable raters: a multilevel confirmatory factor analysis (CFA) and a classical CFA parameterization. The sample sizes of targets and raters, the factorial structure of the trait factors, and rater missingness are varied. The classical CFA approach yields a high proportion of improper solutions under conditions with small sample sizes and indicator-specific trait factors. In general, trait factor related parameters are more sensitive to bias than other types of parameters. For multilevel CFAs, there is a drastic bias in fit statistics under conditions with unidimensional trait factors on the between level, where root mean square error of approximation (RMSEA) and χ² distributions reveal a downward bias, whereas the between standardized root mean square residual is biased upwards. In contrast, RMSEA and χ² for classical CFA models are severely upwardly biased in conditions with a high number of raters and a small number of targets.

7.
The rise of computer‐based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple‐choice items. In particular, very short response time, termed rapid guessing, has been shown to indicate disengaged test taking, regardless of whether it occurs in high‐stakes or low‐stakes testing contexts. This article examines rapid‐guessing behavior: its theoretical conceptualization and underlying assumptions, methods for identifying it, misconceptions regarding its dynamics, and the contextual requirements for its proper interpretation. It is argued that because it does not reflect what a test taker knows and can do, a rapid guess to an item represents a choice by the test taker to momentarily opt out of being measured. As a result, rapid guessing tends to negatively distort scores and thereby diminish validity. Therefore, because rapid guesses do not contribute to measurement, it makes little sense to include them in scoring.
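
One simple way to act on this argument is to drop responses faster than a time threshold before scoring. The sketch below assumes a single fixed threshold for all items; the abstract names no identification method or threshold, so the function, its parameters, and the example data are hypothetical.

```python
def score_excluding_rapid_guesses(responses, threshold_seconds):
    """Score only responses at or above a response-time threshold.

    responses: list of (is_correct, response_time_seconds) pairs.
    threshold_seconds: assumed cutoff below which a response is treated
        as a rapid guess (a single fixed value is a simplification).
    Returns (proportion_correct, engaged_fraction); proportion_correct
    is None when no response counts as engaged.
    """
    engaged = [correct for correct, seconds in responses
               if seconds >= threshold_seconds]
    if not engaged:
        return None, 0.0
    proportion_correct = sum(engaged) / len(engaged)
    engaged_fraction = len(engaged) / len(responses)
    return proportion_correct, engaged_fraction


# Hypothetical test event: two of five responses are faster than 5 s.
responses = [(True, 30.0), (False, 2.0), (True, 25.0),
             (False, 40.0), (False, 1.5)]
print(score_excluding_rapid_guesses(responses, threshold_seconds=5.0))
```

Reporting the engaged fraction alongside the score keeps it visible how much of the test event actually contributed to measurement.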

8.
The use of sample covariance matrices constructed with pairwise deletion for data missing completely at random (SPW) is addressed in a simulation study based on 3 sample sizes (n = 200, 500, 1,000) and 5 levels of missing data (%miss = 0, 1, 10, 25, and 50). Parameter estimates were unbiased, parameter variability was largely explicable in terms of the number of nonmissing cases, and no sample covariance matrices were nonpositive definite except when %miss was 50 and the sample size was 200. However, nominal χ² test statistics (and, thus, fit indices based on χ²s) were substantially biased by %miss and its interaction with N. Corrected χ²s based on the minimum, mean, and maximum number of nonmissing cases per measured variables and cases per covariance term (NPC) reduced but did not eliminate the bias. Empirically derived power functions did substantially better but may not generalize to other situations. Whereas the minimum NPC (the default in the SPSS version of LISREL) is probably better than most simple alternatives in many applications, the problem of how to assess fit for models fit to SPWs has no simple solution; caution is recommended, and there is need for further research with more suitable methods for this problem.

9.
The purpose of this study was to examine how different scoring procedures affect interpretation of maze curriculum‐based measurements. Fall and spring data were collected from 199 students receiving supplemental reading instruction. Maze probes were scored first by counting all correct maze choices, followed by four scoring variations designed to reduce the effect of random guessing. Pearson's r correlation coefficients were calculated among scoring procedures and between maze scores and a standardized measure of reading. In addition, t tests were conducted to compare fall to spring growth for each scoring procedure. Results indicated that scores derived from the different procedures are highly correlated, demonstrate criterion‐related validity, and show fall‐to‐spring growth. Educators working with struggling readers may use any of the five scoring procedures to obtain technically sound scores.

10.
11.
Objective: To study the stereoselective glucuronidation of carvedilol (CARV) by three Chinese liver microsomes. Methods: The metabolites of CARV were identified by a hydrolysis reaction with β-glucuronidase and HPLC-MS/MS. The enzyme kinetics of the glucuronidation of CARV enantiomers was determined by a reversed-phase high-pressure liquid chromatography (RP-HPLC) assay using (S)-propafenone as internal standard after precolumn derivatization with 2,3,4,6-tetra-O-acetyl-β-D-glucopyranosylisothiocyanate. Results: Two CARV glucuronides were found in three Chinese liver microsomes incubated with CARV. The non-linear regression analysis showed that the values of Km and Vmax for (S)-CARV and (R)-CARV enantiomers were (118±44) μmol/L, (2 500±833) pmol/(min·mg protein) and (24±7) μmol/L, (953±399) pmol/(min·mg protein), respectively. Conclusion: These results suggested that there was a significant (P<0.05) stereoselective glucuronidation of CARV enantiomers in three Chinese liver microsomes, which might partly explain the enantioselective pharmacokinetics of CARV.

12.
A study was undertaken for the purpose of evaluating the degree to which the scoring of vocabulary items might be influenced by the scorer's expectancies. 72 Ss were given the task of scoring 20 responses to items on the Vocabulary subtest of the Wechsler Intelligence Scale for Children (WISC). In the bright condition (n = 37) Ss were informed that the responses were those of a child reading far above his age level, while in the dull condition (n = 35) Ss were informed that the responses were those of a child reading far below his age level. It was found that the mean score assigned to the protocol by the Ss in the bright condition was significantly higher than that assigned by the Ss in the dull condition (p < .05). Some factors regarding generalizability were advanced and discussed.

13.
The F-distribution approximation suggested by Dixon was investigated at various combinations of alpha and degrees of freedom. Tabled values were compared with values computed utilizing the suggested formula. The results indicate that while the formula is not useful in all cases, the adequacy of the approximation generally increases as alpha, v1, and v2 increase. It is suggested that the approximation may be utilized when certain restrictions regarding alpha, v1, and v2 are met.

14.
15.
Let $M^n$ be a closed submanifold isometrically immersed in a unit sphere $S^{n+p}$. Denote by $R$, $H$ and $S$ the normalized scalar curvature, the mean curvature, and the square of the length of the second fundamental form of $M^n$, respectively. Suppose $R$ is constant and $R \ge 1$. We study the pinching problem on $S$ and prove a rigidity theorem for $M^n$ immersed in $S^{n+p}$ with parallel normalized mean curvature vector field. When $n \ge 8$, or $n = 7$ and $p \le 2$, the pinching constant is best.

16.
The effectiveness of various analytical formulas for estimating R² shrinkage in multiple regression analysis was investigated. Two categories of formulas were identified: estimators of the squared population multiple correlation coefficient (ρ²) and those of the squared population cross-validity coefficient (ρc²). The authors conducted a Monte Carlo experiment to investigate the effectiveness of the analytical formulas for estimating R² shrinkage, with 4 fully crossed factors (squared population multiple correlation coefficient, number of predictors, sample size, and degree of multicollinearity) and 500 replications in each cell. The results indicated that the most widely used Wherry formula (in both SAS and SPSS) is probably not the most effective analytical formula for estimating ρ². Instead, the Pratt formula and the Browne formula outperformed other analytical formulas in estimating ρ² and ρc², respectively.
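
To make the idea of shrinkage concrete, here is a minimal sketch of the familiar adjusted R² that statistical packages commonly report, 1 − (1 − R²)(n − 1)/(n − p − 1). Treating this as the Wherry formula the abstract refers to is an assumption, since the abstract does not write the formula out; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, sample_size, num_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Assumed here to be the widely used "Wherry" adjustment mentioned
    in the abstract; the source does not give the formula itself.
    """
    residual_df = sample_size - num_predictors - 1
    if residual_df <= 0:
        raise ValueError("need sample_size > num_predictors + 1")
    return 1 - (1 - r_squared) * (sample_size - 1) / residual_df


# With few cases per predictor the sample R^2 shrinks noticeably.
print(adjusted_r_squared(0.5, sample_size=21, num_predictors=5))
# With many cases per predictor the adjustment is small.
print(adjusted_r_squared(0.5, sample_size=501, num_predictors=5))
```

The adjustment grows as the number of predictors approaches the sample size, which is why sample size and number of predictors appear among the crossed factors in the experiment.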

17.
Sap flow and environmental conditions were monitored at two Eucalyptus (Eucalyptus urophylla S.T. Blake) plantations at Hetou and Jijia, located in Leizhou, Zhanjiang, Guangdong Province. It was found that daily sap flux density (SFD) of Eucalyptus was closely related to daily atmospheric vapor pressure deficit (VPD) at both sites (R² = 0.76, P = 0.01 at Hetou and R² = 0.7021, P = 0.01 at Jijia). No significant relationship existed between daily SFD and mean daily air temperature at either site. Daily SFD varied with wind speed as Y = −17585X³ + 15147X² − 1250.7X + 2278.4 (R² = 0.68, P = 0.01) at Hetou and Y = −101.67X³ − 1.65X² − 376.4X + 1914.8 (R² = 0.40, P = 0.05) at Jijia, where Y was daily SFD and X was daily wind speed. Experimental observations yielded the following data: (1) the critical lower and upper daily VPD thresholds were 0 and 2 kPa, within which daily SFD varied from 540±70 L/(m²·d) to 4739±115 L/(m²·d) at the Hetou site and from 397±26 L/(m²·d) to 3414±191 L/(m²·d) at the Jijia site; (2) diurnal SFDs at the Hetou site were much higher under low relative humidity (<30%) and slightly lower under high relative humidity (>80%) compared with those at the Jijia site; (3) the upper and lower thresholds of daily and diurnal RAD for the optimal water use of E. urophylla plantations were 18±2.7 and 2±1 MJ/(m²·d), and 669 and 0 J/(cm²·h), during the observation period. Project supported by Knowledge Innovation Funds from Chinese Academy of Sciences (Nos. KSCX2-SW-120, KZCX1-SW-01-01A3) and the Key Project of Guangdong Natural Science Foundation (No. 010567), China.

18.
In matching-to-sample, comparison choice should be controlled by the identity of the sample and, when the sample is not available, by the overall probability of reinforcement associated with each of the comparisons. In the present research, pigeons were trained to match a frequent sample (appearing on 80% of the trials) to one comparison (C_fr) and an infrequent sample (appearing on 20% of the trials) to the other (C_inf), with the number of reinforcements associated with each sample equated. In Experiment 1, the task was identity matching; in Experiments 2 and 3, it was symbolic matching. We asked whether, when control of comparison choice by the sample was reduced (by inserting a delay between the sample and the comparisons), pigeons would choose comparisons on the basis of (1) the number of reinforcements per comparison (and thus show no comparison bias), (2) the comparison associated with the more frequent sample during training (and show a preference for C_fr), (3) the probability of reinforcement given a correct response (and show a preference for C_inf), or (4) inhibition produced by nonreinforced choice of the more frequently correct comparison (and show a preference for C_inf). Pigeons showed a significant tendency to choose C_fr. In Experiment 3, we showed that this bias did not result from the effects of intertrial facilitation or interference. Thus, it appears that when control of comparison choice by the sample is reduced, pigeons’ choice is controlled not merely by the probability of reinforcement but also by overall sample frequency.

19.
Let $\Gamma^{d}_{2n}$ be the set of trees with a given diameter $d$ having a perfect matching, where $2n$ is the number of vertices. For a tree $T$ in $\Gamma^{d}_{2n}$, let $P_{d+1}$ be a diameter of $T$ and $q = d - m$, where $m$ is the number of edges of the perfect matching in $P_{d+1}$. The trees with minimal energy in $\Gamma^{d}_{2n}$ are found for the four cases $q = d-2$, $d-3$, $d-4$, $[d/2]$, and two remarks are given about the trees with minimal energy in $\Gamma^{d}_{2n}$ for $\frac{2d-3}{3} \le q \le d-5$ and $[d/2]+1 \le q \le \frac{2d-3}{3}-1$.

20.
Infant social–emotional development may be impacted by the COVID-19 pandemic. This study investigated associations between maternal pre- and postnatal pandemic-related concerns and social–emotional developmental risk. Data, collected in 2020–2021, came from 220 mothers (87% white, 6% Hispanic, 1% Black, 3% Asian, 1% American Indian, M_age = 32.46 years) and infants (53.18% male, M_age = 12.98 months) in the United States. Maternal postnatal pandemic-related concerns were associated with total risk scores (B = 6.09, p < .001) and offspring risk of scoring positive for problems related to inflexibility (B = 4.07, p = .006). The total score association was moderated by self-reported social support. Infants may be detrimentally impacted by the pandemic via maternal pandemic-related concerns. Maternal social support may buffer infants.
