Similar literature
1.
In 2018, 26 states administered a college admissions test to all public school juniors. Nearly half of those states proposed to use those scores as their academic achievement indicators for federal accountability under the Every Student Succeeds Act (ESSA); many others are planning to use those scores for other accountability purposes. Accountability encompasses a number of different uses and subsumes a variety of claims. For states proposing to use summative tests for accountability, a validity argument needs to be developed, which entails delineating each specific use of test scores associated with accountability, identifying appropriate evidence, and offering a rebuttal to counterclaims. The aim of this article is to support states in developing a validity argument for use of college admission test scores for accountability by identifying claims that are applicable across states, along with summarizing existing evidence as it relates to each of these claims. As outlined by The Standards for Educational and Psychological Testing, multiple sources of evidence are used to address each claim. A series of threats to the validity argument, including weaker alignment with content standards and potential influences in narrowing teaching, are reviewed. Finally, the article contrasts validity evidence, primarily from research on the ACT, with regulatory requirements from ESSA. The Standards and guidance addressing the use of a “nationally recognized high school academic assessment” (Elementary and Secondary Education Act (ESEA), Negotiated Rulemaking Committee; Department of Education) are the primary sources for the organization of validity evidence.

2.
This discussion provides a response to Gregory Cizek's "More Unintended Consequences of High-Stakes Testing." The current policy debate is characterized by extreme positions both for and against testing, and Cizek's article balances positive consequences against antitesting critics. However, there is no evidence that high-stakes testing per se has substantial positive consequences, although there is optimism that aligned educational systems, in which testing is a component, may lead to higher levels of student attainment.

3.
Most states have adopted assessment and accountability systems that involve common measures of student performance. A state assessment system that allows school districts to choose the specific strategies they use to measure student performance on state-adopted content standards presents a unique state accountability challenge. The authors propose an accountability model that addresses this challenge using a combination of student performance, technical quality, and noncognitive indicators of performance. They also describe a study that evaluated the proposed model using data from all school districts in a southern state.

4.
This article reviews the intended uses of these college‐ and career‐readiness assessments with the goal of articulating an appropriate validity argument to support such uses. These assessments differ fundamentally from today's state assessments employed for state accountability. Current assessments are used to determine if students have mastered the knowledge and skills articulated in state standards; content standards, performance levels, and student impact often differ across states. College‐ and career‐readiness assessments will be used to determine if students are prepared to succeed in postsecondary education. Do students have a high probability of academic success in college or career‐training programs? As with admissions, placement, and selection tests, the primary interpretations that will be made from test scores concern future performance. Statistical evidence between test scores and performance in postsecondary education will become an important form of evidence. A validation argument should first define the construct (college and career readiness) and then define appropriate criterion measures. This article reviews alternative definitions and measures of college and career readiness and contrasts traditional standard‐setting methods with empirically based approaches to support a validation argument.

5.
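This paper briefly reviews the three stages in the development of the concept of validity in psychometrics and focuses on its most recent development, construct validity theory. The concept of validity is shown to be continually evolving: as the content of research has grown richer, research methods have also become increasingly diverse. Construct validity at its current stage is broad enough to accommodate all evidence that might support the interpretation of scores. A complete understanding of the concept of validity helps us appreciate the power and substance of tests from a broader perspective.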

6.
The article presents a framework for combining multiple measures to reach high-stakes decisions. Criteria are identified for the employment of conjunctive, compensatory, and complementary approaches to combining measures. The framework is illustrated through the documentation of the School District of Philadelphia's initiative to employ multiple measures, including standardized test scores, to determine promotion decisions. The author demonstrates that the use of multiple measures itself does not necessarily improve the reliability and validity of the decisions. It is the logic by which the measures are combined that determines the accuracy and appropriateness of the decisions reached.
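As a rough illustration of why the combination logic matters, the three approaches named in the abstract above can be sketched as simple decision rules. This is a minimal sketch and not the article's framework: it assumes the common reading in which a conjunctive rule requires meeting the cutoff on every measure, a complementary rule requires meeting it on at least one, and a compensatory rule compares a weighted composite to a single cutoff. All function names, weights, and cutoffs are hypothetical.

```python
from typing import Sequence


def conjunctive(scores: Sequence[float], cutoffs: Sequence[float]) -> bool:
    """Pass only if every measure meets its own cutoff (assumed definition)."""
    return all(s >= c for s, c in zip(scores, cutoffs))


def complementary(scores: Sequence[float], cutoffs: Sequence[float]) -> bool:
    """Pass if at least one measure meets its cutoff (assumed definition)."""
    return any(s >= c for s, c in zip(scores, cutoffs))


def compensatory(scores: Sequence[float], weights: Sequence[float],
                 composite_cutoff: float) -> bool:
    """Pass if the weighted average meets one composite cutoff, so a high
    score on one measure can offset a low score on another."""
    composite = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return composite >= composite_cutoff


# Hypothetical student: strong on the first measure, weak on the second.
scores = [80.0, 55.0]
cutoffs = [60.0, 60.0]
print(conjunctive(scores, cutoffs))            # fails the second cutoff
print(complementary(scores, cutoffs))          # passes on the first measure
print(compensatory(scores, [1.0, 1.0], 65.0))  # composite is 67.5
```

The same two scores yield different decisions under each rule, which is consistent with the abstract's point that the logic of combination, rather than the mere presence of multiple measures, drives the decision.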

7.
8.
9.
Total citations: 1 (self-citations: 0; citations by others: 1)
In response to heightened levels of assessment activity at the K-12 level to meet requirements of the No Child Left Behind Act of 2001, measurement professionals are called to focus greater attention on four fundamental areas of measurement research and practice: (a) improving the research infrastructure for validation methods involving judgments of test content; (b) expanding the psychometric definition of fairness in achievement testing; (c) developing guidelines for validation studies of test use consequences; and (d) preparing teachers for new roles in instruction and assessment practice. Illustrative strategies for accomplishing these goals are outlined.

10.
To a surprising degree, how you communicate determines your effectiveness as a teacher. Relationships are built on communication and easily destroyed by it. (Charles 2000, 48–49)

11.
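The scientific attributes of large-scale educational examinations should include at least two levels: the constraints imposed by educational and psychological measurement, and the sociological constraints on examinations. The reliability of test scores, the validity of the interpretation and use of test results, the fairness of the examination, and the examination's impact on society and on teaching and learning in schools are the basic elements of an examination's scientific attributes. These elements are determined by the purpose of the examination and the use of its results, and each has its own distinct scientific meaning. In examination practice, setting measurement objectives, determining the structure of the test paper, sampling test content, selecting and framing stimulus materials for items, setting difficulty, and evaluating the examination afterwards should all be directed at improving the scientific rigor and quality of the examination.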

12.
In this paper, the author discusses reading testing and its validity, the demands that reading places on test takers, and the principles that determine the validity of reading tests. By analyzing the reading sections of two GET-4 model tests, he shows how to construct a highly valid test so as to ensure its accuracy and objectivity.

13.
A misconception exists that validity may refer only to the interpretation of test scores and not to the uses of those scores. The development and evolution of validity theory illustrate that test score interpretation was a primary focus in the earliest days of modern testing, and that validating interpretations derived from test scores remains essential today. However, test scores are not interpreted and then ignored; rather, their interpretations lead to actions. Thus, a modern definition of validity needs to describe the validation of test score interpretations as a necessary, but insufficient, step en route to validating the uses of test scores for their intended purposes. To ignore test use in defining validity is tantamount to defining validity for ‘useless’ tests. The current definition of validity stipulated in the 2014 version of the Standards for Educational and Psychological Testing properly describes validity in terms of both interpretations and uses, and provides a sufficient starting point for validation.

14.
Implications of the multiple‐use of accountability assessments for the process of validation are examined. Multiple‐use refers to the simultaneous use of results from a single administration of an assessment for its intended use and for one or more additional uses. A theoretical discussion of the issues for validation which emerge from multiple‐use is provided focusing on the increased stakes that result from multiple‐use and the need to consider the interactions that may take place between multiple‐uses. To further explore this practice, an empirical study of the multiple‐use of the Education Quality and Accountability Office Grade 9 Assessment of Mathematics, a mandatory assessment administered in Ontario, Canada, is presented. Drawing on data gathered in an in‐depth case study, practices associated with two of the multiple‐uses of this assessment are considered and evidence of ways these two uses interact is presented. Given these interactions, the limitations of an argument‐based approach to validation for this instance of multiple‐use are demonstrated. Some ways that the process of validation might better address the practice of multiple‐use are suggested and areas for further investigation of this frequently occurring practice are discussed.

15.
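Language testing research is a branch of applied linguistics, and reliability and validity are two key concepts in the field. Reliability refers to the dependability of test results; validity refers to the degree to which a test achieves its intended purpose. This paper introduces the definitions of reliability and validity, the methods used to measure them, and the factors that affect them, and points out that in language testing the two are both interdependent and mutually exclusive.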

16.
The attractiveness of computer-based tests (CBTs) is due largely to their capability to expand the ways we conduct testing. A relatively unexplored application, however, is actively using the computer to reduce construct-irrelevant variance while a test is being administered. This investigation introduces the effort-monitoring CBT, in which the computer monitors examinee effort (based on item response time) in a low-stakes test and displays warning messages to those exhibiting rapid-guessing behavior. The results of an experimental study are presented, which showed that an effort-monitoring CBT increased examinee effort and yielded more valid test scores than a conventional CBT. Thus, unlike previous research that has focused on identifying rapid-guessing behavior after it has occurred, the effort-monitoring CBT proactively attempts to suppress rapid-guessing behavior. This innovative testing procedure extends the capabilities of measurement practitioners to manage the psychometric challenges posed by unmotivated examinees.
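The monitoring idea described in the abstract above, flagging rapid guesses from item response time and warning the examinee during the test, might be sketched as follows. This is an illustrative sketch only, not the procedure used in the study: the per-item time thresholds, the rule of warning after a run of consecutive rapid guesses, and all names are assumptions.

```python
from typing import List, Optional, Sequence


def rapid_guess_flags(response_times_s: Sequence[float],
                      thresholds_s: Sequence[float]) -> List[bool]:
    """Flag an item as a rapid guess when its response time falls below
    that item's threshold (thresholds here are hypothetical)."""
    return [t < th for t, th in zip(response_times_s, thresholds_s)]


def first_warning_index(flags: Sequence[bool],
                        run_length: int = 3) -> Optional[int]:
    """Return the index of the item at which a warning would be shown,
    i.e. the item completing `run_length` consecutive rapid guesses,
    or None if no such run occurs (assumed warning rule)."""
    streak = 0
    for i, flagged in enumerate(flags):
        streak = streak + 1 if flagged else 0
        if streak >= run_length:
            return i
    return None


# Hypothetical response times in seconds for five items,
# each with an assumed 3-second rapid-guess threshold.
times = [12.0, 1.5, 2.0, 1.0, 15.0]
flags = rapid_guess_flags(times, [3.0] * len(times))
print(flags)
print(first_warning_index(flags))
```

Because the check runs as each response arrives, a warning can be issued while the test is still in progress, which is the contrast the abstract draws with approaches that identify rapid guessing only after administration.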

17.
Current high-stakes, standardized testing policy is discussed through historical analogy with Chairman Mao's famine in China and the Maginot Line in France. Both of these national, high-stakes policies resulted in catastrophic failure. If the accountability movement's goals are to improve our ability to compete economically with other nations, we may also be heading for failure. Assessment should be aligned with the skills and knowledge crucial to our success in the future such as collaboration, experimentation, and comfort with ambiguity. Funds currently allocated to standardized testing should be reallocated to the development of measures for such skills and knowledge.

18.
The present study examined the use of student test performance for merit pay and teacher evaluation as predictive of both educator stress and counterproductive teaching practices, and the moderating role of perceived test value. Structural equation modelling of data from a sample of 7281 educators in a South-eastern state in the United States supported the hypothesis that educators who perceived the test as an invalid measure of teaching effectiveness were more likely to report high levels of test stress and to use counterproductive teaching practices, including fear appeals, in an attempt to motivate students for test-taking. This study provides initial evidence for the hypothesised relationships of test-based accountability policy with teacher mental health and instructional practices. Implications for research and practice are discussed.

19.
Recent test‐based accountability policy in the U.S. has involved annually assessing all students in core subjects and holding schools accountable for adequate progress of all students by implementing sanctions when adequate progress is not met. Despite its potential benefits, basing educational policy on assessments developed for a student population of White, middle‐ and upper‐class, and native speakers of English opens the door for numerous pitfalls when the assessments are applied to minority populations including students of color, low SES, and learning English as a new language. There exists a paradox; while minority students are a primary intended beneficiary of the test‐based accountability policy, the assessments used in the policy have been shown to have many shortcomings when applied to these students. This article weighs the benefits and pitfalls that test‐based accountability brings for minority students. Resolutions to the pitfalls are discussed, and areas for future research are recommended. © 2009 Wiley Periodicals, Inc. J Res Sci Teach 47: 6–24, 2010

20.
This article examines multiple measures of performance in school accountability systems from two perspectives: laterally (different indicators of different domains) and vertically (indicators that are at different levels of depth of the same domain). Organizational responsibility and instructional sensitivity are examined. In particular, alternative procedures are explored for integrating into the multiple measures concept external, uniform top-down measures and responsive, locally adaptive bottom-up measures.
