Found 20 similar documents; search time: 0 ms
1.
2.
3.
Stephan F. Gohmann 《Journal of Educational Measurement》1988,25(2):137-148
Comparing SAT scores among states using regression analysis leads to biased results because states differ in the proportion of students taking the exam. When the proportion of students taking the exam is included in the regression equation, the results can still be biased because of misspecification. A method intended to correct for selection bias is presented, and empirical results suggest that sample selection bias is present in SAT score regressions. Regression equations and state rankings are compared between the selection-corrected equation and equations for which the selection problem is not addressed. The proposed method is one of many available as possible solutions to the selection problem. Alternative methods may produce different results.
4.
5.
Grades and Test Scores: Accounting for Observed Differences (cited by: 1; self-citations: 0; citations by others: 1)
Warren W. Willingham Judith M. Pollack Charles Lewis 《Journal of Educational Measurement》2002,39(1):1-37
Why do grades and test scores often differ? A framework of possible differences is proposed in this article. An approximation of the framework was tested with data on 8,454 high school seniors from the National Education Longitudinal Study. Individual and group differences in grade versus test performance were substantially reduced by focusing the two measures on similar academic subjects, correcting for grading variations and unreliability, and adding teacher ratings and other information about students. Concurrent prediction of high school average was thus increased from 0.62 to 0.90; differential prediction in eight subgroups was reduced to 0.02 letter‐grades. Grading variation was a major source of discrepancy between grades and test scores. Other major sources were teacher ratings and Scholastic Engagement, a promising organizing principle for understanding student achievement. Engagement was defined by three types of observable behavior: employing school skills, demonstrating initiative, and avoiding competing activities. While groups varied in average achievement, group performance was generally similar on grades and tests. Major factors in achievement were similarly constituted and similarly related from group to group. Differences between grades and tests give these measures complementary strengths in high‐stakes assessment. If artifactual differences between the two measures are not corrected, common statistical estimates of validity and fairness are unduly conservative.
6.
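The SAT (Scholastic Assessment Test), currently the widely accepted college entrance examination in the United States, has long faced questions about its fairness, especially in sensitive areas such as gender and race. Using SAT data from students at one U.S. high school, the authors build a single-equation linear regression model of SAT scores estimated by ordinary least squares. The regression results indicate that, holding the other factors in the model constant, the SAT does exhibit gender and racial discrimination, and that the effect of gender on scores is larger than the effect of race. Finally, in light of the 2016 fairness reform of the SAT, the paper explores the future direction of the SAT and the lessons it offers for China's new gaokao reform.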
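A single-equation OLS model with 0/1 indicator variables, as described in the abstract above, can be sketched as follows. This is a minimal illustration only: the variable names (`hours`, `group_a`, `group_b`), the coefficients, and the data are all hypothetical and synthetic, and none of it reproduces the study's data, specification, or findings. The data are generated without noise so that the fit recovers the assumed coefficients exactly.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares coefficients for y = X @ beta + error.

    X must already contain an intercept column of ones.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic, noise-free data for illustration only (NOT the study's data).
rng = np.random.default_rng(0)
n = 200
hours = rng.uniform(0, 10, n)      # hypothetical continuous control variable
group_a = rng.integers(0, 2, n)    # hypothetical 0/1 indicator variable
group_b = rng.integers(0, 2, n)    # hypothetical 0/1 indicator variable

# Assumed coefficients: intercept, hours, group_a, group_b.
true_beta = np.array([1000.0, 20.0, -30.0, -10.0])

X = np.column_stack([np.ones(n), hours, group_a, group_b])
y = X @ true_beta

beta_hat = fit_ols(X, y)
names = ["intercept", "hours", "group_a", "group_b"]
print({name: round(float(b), 2) for name, b in zip(names, beta_hat)})
```

Each indicator's coefficient is the expected difference in score between the two categories with the other regressors held fixed, which is the "holding other factors constant" comparison the abstract refers to. A coefficient of this kind describes an association in the fitted model; interpreting it as discrimination requires further assumptions about omitted variables.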
7.
The SAT: A Mirror Worth Consulting for Gaokao System Reform (cited by: 3; self-citations: 2; citations by others: 3)
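Since 1999, the focus of reform of China's gaokao (national college entrance examination) system has shifted toward the set of subjects tested and the form and content of the examination. Jiangsu, Zhejiang, Jilin, and Shanxi provinces each introduced a new "3 + comprehensive" examination model, and Guangdong province also actively explored a new "3 + X" model, which is to be extended gradually to the other provinces, municipalities, and autonomous regions. This new round of reform of the admissions examination system for regular higher education institutions has broadly abandoned the traditional model in which a pure test of knowledge was the sole basis for admitting students, and instead highlights and emphasizes the assessment of students' overall competence. This reflects the demands of the trend toward "mass" higher education and shows how widely the idea of quality-oriented education has been accepted. I. Implementing quality-oriented education requires that we change traditional education…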
8.
9.
10.
11.
Are variations in test-preparation practices from school to school undermining the meaningfulness of achievement test results? Is there pressure to raise achievement test scores by the use of educationally unsound practices? What uses of achievement test scores are most common? Do teachers and administrators have reasonably accurate views of test score uses?
12.
《Applied Measurement in Education》2013,26(2):103-118
Assessment instruments of the future will probably be composed of a combination of different types of questions. Even though different kinds of questions require different scoring procedures, there may be a need to have those different scores combined as a composite. In this article, we describe how mixtures of such scores may be efficaciously combined. Also, if no post hoc adjustment is desired, we provide two characterizations of measurement effectiveness to aid in making unadjusted score combinations efficient. In addition, we explore the implications for test construction of some typical findings.
13.
In studies of the SAT, correlations of SAT scores, high school grades, and socioeconomic factors (SES) are usually obtained using a university as the unit of analysis. This approach obscures an important structural aspect of the data: The high school grades received by a given institution come from a large number of high schools, all of which have potentially different grading standards. SAT scores, on the other hand, can be assumed to have the same meaning across high schools. Our analyses of a large national sample show that, when pooled within-high-school analyses are applied, high school grades and class rank have larger correlations with family income and education than is evident in the results of typical analyses, and SAT scores have smaller associations with socioeconomic factors. SAT scores and high school grades, therefore, have more similar associations with SES than they do when only the usual across-high-school correlations are considered.
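The contrast between an across-high-school correlation and a pooled within-high-school correlation can be illustrated with a small sketch. It assumes one common definition of the pooled within-group correlation (center each variable on its own group mean, then correlate the deviations), which may differ from the authors' exact procedure. The function name `pooled_within_corr` and the six data points are hypothetical and chosen only to make the effect visible.

```python
import numpy as np

def pooled_within_corr(x, y, groups):
    """Correlate x and y after removing each group's own mean from both."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    xc, yc = x.copy(), y.copy()
    for g in np.unique(groups):
        mask = groups == g
        xc[mask] -= x[mask].mean()   # deviation from the group's mean of x
        yc[mask] -= y[mask].mean()   # deviation from the group's mean of y
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical data: two schools, three students each.
# School "B" has higher incomes but a stricter grading standard.
income = [1, 2, 3, 4, 5, 6]
grades = [3.0, 3.5, 4.0, 2.0, 2.5, 3.0]
school = ["A", "A", "A", "B", "B", "B"]

across = float(np.corrcoef(income, grades)[0, 1])
within = pooled_within_corr(income, grades, school)
print(f"across-school r = {across:.2f}, pooled within-school r = {within:.2f}")
```

In this toy data, grades rise with income inside each school, but the stricter standard at the higher-income school pulls the across-school correlation down. Centering on school means removes that difference in grading standards, which is the structural point the abstract makes about grades versus SAT scores.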
14.
The use of assessment results to inform school accountability relies on the assumption that the test design appropriately represents the content and cognitive emphasis reflected in the state's standards. Since the passage of the Every Student Succeeds Act and the certification of accountability assessments through federal peer review practices, the content validity arguments supporting accountability have relied almost exclusively on the alignment of statewide assessments to state standards. It is assumed that if alignment does not hold, the scores will not provide valid inferences regarding the degree to which test takers have performed. Although alignment results are commonly used as evidence of test appropriateness, Polikoff (this issue) would argue that given the importance of alignment in policy decisions, research related to alignment is surprisingly limited. Few studies have addressed the adequacy of alignment methodologies and results as support for the inferences to be made (i.e., proficient on state standards). This paper uses an example of test taker performance (and common performance indicators) to investigate to what extent the degree of alignment impacts inferences made about performance (i.e., classification into performance levels, estimates of student ability, and student rank order).
15.
Sorel Cahan 《Educational Measurement》2000,19(3):26-32
How does schooling affect the development of intelligence in children? How should the amount of schooling be considered when developing norms for turning intelligence test performance into IQ scores?
16.
《The Journal of educational research》2012,105(6):440-446
Previous studies have shown that several key variables influence student achievement in geometry, but no research has been conducted to determine how these variables interact. A model of achievement in geometry was tested on a sample of 102 high school students. Structural equation modeling was used to test hypothesized relationships among variables linked to successful problem solving in geometry. These variables, including motivation, achievement emotions, pictorial representation, and categorization skills, were examined for their influence on geometry achievement. Results indicated that the model fit well. Achievement emotions, specifically boredom and enjoyment, had a significant influence on student motivation. Student motivation influenced students’ use of pictorial representations and achievement. Pictorial representation also directly influenced achievement. Categorization skills had a significant influence on pictorial representations and student achievement. The implications of these findings for geometry instruction and for future research are discussed.
17.
Andrew C. Dwyer 《Journal of Educational Measurement》2016,53(1):3-22
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common‐item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common‐item equating methodology to standard setting ratings to account for systematic differences between standard setting panels) has received almost no attention in the literature. Identity equating was also examined to provide context. Data from a standard setting form of a large national certification test (N examinees = 4,397; N panelists = 13) were split into content‐equivalent subforms with common items, and resampling methodology was used to investigate the error introduced by each approach. Common‐item equating (circle‐arc and nominal weights mean) was evaluated at samples of size 10, 25, 50, and 100. The standard setting approaches (resetting and rescaling the standard) were evaluated by resampling (N = 8) and by simulating panelists (N = 8, 13, and 20). Results were inconclusive regarding the relative effectiveness of resetting and rescaling the standard. Small‐sample equating, however, consistently produced new form cut scores that were less biased and less prone to random error than new form cut scores based on resetting or rescaling the standard.
18.
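In the context of exam-oriented education, the role of test scores has been exaggerated without limit. This emphasis on test scores has narrowed the scope of evaluation and oversimplified curriculum goals, which in turn has distorted basic education. Distorted education then pushes the inflated value of test scores to a further extreme, ultimately forming a vicious circle in education. Drawing on case analyses of the improper use of test scores, this paper proposes several pairs of relationships that must be correctly understood and handled in evaluation, in the hope of offering some insight into how to escape this vicious circle.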
19.