This study explores the use of multiple measures to enhance the validity and reliability of inferences about school and district effectiveness. Using data from the state of Ohio, a framework for combining measures is applied to examine the individual and collective impact of multiple measures on both the federal AYP designations and state ratings. Implications for the use of multiple measures to broaden the construct of district and school effectiveness, improve the acceptance and legitimacy of accountability programs, and promote desired outcomes are discussed.

The use of multiple measures is emphasized by legislation regulating the distribution of Title I funding to states, as well as by professional and industry standards regarding the use of test scores in high-stakes decisions. There is a wide variety of methods for designing and analyzing multiple measures, and these methods have different implications for the conclusions reached. Recognizing the complexities associated with the implementation of a multiple measures approach to system evaluation, this article provides an overview and discussion of alternative models that may be considered in an accountability system and their applicability relative to the goals of the system evaluation. The article concludes with an example of the use of multiple measures with regard to No Child Left Behind legislation.

Using Multiple Measures to Address Perverse Incentives and Score Inflation   Cited by: 1 (self-citations: 0, citations by others: 1)
The principle that important decisions should not be based on a single measure is axiomatic, if widely ignored in practice. The traditional rationale is the risk of incorrect decisions from incomplete and error-prone data. The current high-stakes uses of test scores increase the need for multiple measures for two distinct reasons: the risk of score inflation and the potential for perverse incentives for educators and students. Addressing these two issues may require focusing accountability on measures of schooling as well as a much wider range of measures of student outcomes. The difficulties of pursuing this approach are described, and some possible directions for research and development are noted.

Several meanings of the term multiple measures exist. One of these is the use of assessments from different sources, such as an external test, along with a state-developed test. The use of multiple sources is increasing, especially due to increased federal Title I requirements for state accountability programs and associated increases in the amount and costs of mandated testing. Several issues seem pertinent for states considering combining assessments from internal sources (usually criterion-referenced tests) and external sources (usually norm-referenced tests) into their accountability programs. These are explored from the standpoint of the impact of federally required decision making for schools based on test data. Other possible uses are mentioned briefly.

Nonprofit organizations' constituents, trustees and donors find financial reports confusing and too detailed. Advocates of Service Efforts and Accomplishments and assessment measures suggest that disclosing both financial and nonfinancial indicators will make the reports more useful. With hundreds of possible measures, ratios and data available, the nonprofit manager must select measures that portray the institution's condition and performance. This article presents a process and rationale for determining the measures. Also presented is a set of measures developed from interviews with presidents and administrators and selected by a quantitative process that models complex decisions. The information selected for display both describes and measures the organization's condition and performance.

It is widely accepted dogma that consequential decisions are better made with multiple measures, because using but a single one is thought more likely to be laden with biases and errors that can be better controlled with a wider source of evidence for making judgments. Unfortunately, advocates of using multiple measures too rarely provide detailed directions of exactly how to use them. In this article, we describe one source of problems with multiple measures and lay out a strategy for finding solutions.

This article examines multiple measures of performance in school accountability systems from two perspectives: laterally (different indicators of different domains) and vertically (indicators that are at different levels of depth of the same domain). Organizational responsibility and instructional sensitivity are examined. In particular, alternative procedures are explored for integrating into the multiple measures concept external, uniform top-down measures and responsive, locally adaptive bottom-up measures.

Multiple measures, such as multiple content domains or multiple types of performance, are used in various testing programs to classify examinees for screening or selection. Despite the widespread use of multiple measures, there is little research on their classification consistency and accuracy. Accordingly, this study introduces an approach to estimate classification consistency and accuracy indices for multiple measures under four possible decision rules: (1) complementary, (2) conjunctive, (3) compensatory, and (4) pairwise combinations of the three. The current study uses the IRT-recursive-based approach with the simple-structure multidimensional IRT model (SS-MIRT) to estimate the classification consistency and accuracy for multiple measures. Theoretical formulations of the four decision rules with a binary decision (Pass/Fail) are presented. The estimation procedures are illustrated using an empirical data example based on SS-MIRT. In addition, this study applies the estimation procedures to the unidimensional IRT (UIRT) context, given that UIRT is more widely used in practice. This application shows that the proposed procedure for estimating classification consistency and accuracy could be used with a UIRT model for individual measures as an alternative to SS-MIRT.
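
The four decision rules named in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the cut scores, and the reading of "complementary" as "pass on at least one measure" and "compensatory" as "pass on the summed score" are assumptions made for illustration only, and the sketch does not cover the IRT-based estimation of consistency and accuracy.

```python
from typing import Sequence


def complementary(scores: Sequence[float], cuts: Sequence[float]) -> bool:
    """Pass if at least one measure meets its own cut score (assumed reading)."""
    return any(s >= c for s, c in zip(scores, cuts))


def conjunctive(scores: Sequence[float], cuts: Sequence[float]) -> bool:
    """Pass only if every measure meets its own cut score."""
    return all(s >= c for s, c in zip(scores, cuts))


def compensatory(scores: Sequence[float], composite_cut: float) -> bool:
    """Pass if the summed score meets a composite cut, so a high score
    on one measure can offset a low score on another."""
    return sum(scores) >= composite_cut


def conjunctive_and_compensatory(
    scores: Sequence[float], cuts: Sequence[float], composite_cut: float
) -> bool:
    """One hypothetical pairwise combination: both rules must be satisfied."""
    return conjunctive(scores, cuts) and compensatory(scores, composite_cut)


# Hypothetical examinee: two measures, each with a cut of 50,
# and a composite cut of 100 on the summed score.
scores = (62.0, 48.0)
cuts = (50.0, 50.0)
decisions = {
    "complementary": complementary(scores, cuts),
    "conjunctive": conjunctive(scores, cuts),
    "compensatory": compensatory(scores, 100.0),
    "conjunctive+compensatory": conjunctive_and_compensatory(scores, cuts, 100.0),
}
```

With these made-up numbers the same examinee passes under the complementary and compensatory rules but fails under the conjunctive rule and the combined rule, which is why the choice of rule matters for classification outcomes.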

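The draft amendment to the Criminal Procedure Law substantially revises the provisions on compulsory measures, which account for roughly one quarter of the amended articles and show the importance legislators attach to them. The draft touches on the principle of judicial review, which is a step forward. However, it does not provide for judicial organs to issue arrest warrants, detention warrants, and the like, nor does it address review of the legality and necessity of custody. China should, as appropriate, introduce the judicial control of criminal compulsory measures practiced in other jurisdictions into its own judicial control system and refine it.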

The authors attempt to bridge the gap between the research literature on supervision and its application to evaluating individual supervisor effectiveness. A conceptual framework is presented for making decisions about evaluation. In this framework the decision maker is directed to consider three issues (the purpose of evaluation, the developmental stage of the counselor, and the focus of evaluation) in selecting measures of effectiveness. Within this context, some promising measures of supervision effectiveness are discussed, and methods for linking changes in supervisee functioning to supervisor interventions are considered.

This paper proposes measures to improve adolescent reproductive health programs through a sociocultural perspective. The measures include 1) identifying the problem and understanding its nature within its cultural context; 2) assessing the sociocultural context to obtain a clear understanding of the cultural aspects that affect adolescent sexuality and pregnancy; 3) handling cultural biases and ensuring accuracy of information; 4) identifying the specific needs of adolescents and viewing their problems from their own perspective; 5) incorporating a true gender approach; 6) reaching out to young men; 7) involving adolescents in all stages of the programs; 8) communicating effectively with adolescents in all stages of the programs; 9) developing skills to avoid risks; 10) generating capacity to make informed decisions; 11) developing services that are accessible to adolescents; 12) sensitizing health personnel; 13) developing a multidisciplinary approach; and 14) creating an appropriate environment for the program. A brief explanation of each measure is presented.

Teachers make a difference in student academic growth. Students from low-income, minority communities attend schools with fewer resources and less-qualified teachers than students in wealthier communities. The Race to the Top (RTTT) policy by the U.S. Department of Education has attempted to address the achievement gap based on SES and the disparity in the quality of teachers between communities. The policy stipulates that teacher effectiveness be determined, in significant part, by student growth measures and supplemented with multiple observation-based assessments. The emphasis placed on student outcomes to indicate teacher effects has served to link teacher evaluations with teacher effectiveness. This review article examines the reported benefits and critical responses to the use of a prominent student growth measure, the Education Value-Added Assessment System (EVAAS), in terms of its implementation as an evaluation tool of teacher effectiveness in low-income, minority schools. Models of observational teacher evaluations, taking into consideration common attributes of effective teachers in low-income schools, are presented as supplemental measures to provide more in-depth information to interpret value-added analyses and to minimize possible misinterpretation of student growth data or the misclassification of teachers’ effectiveness for teachers in low-income schools. Information obtained from a combination of evaluation measures can be used to identify both effective and ineffective teachers, to target areas in need of improvement to increase teacher effectiveness, and to make decisions concerning the equitable distribution of effective teachers, especially for students who are most in need.

School psychologists are tasked with ensuring treatment integrity because the level of intervention implementation affects decisions about student progress. Treatment integrity includes multiple dimensions that may impact the effectiveness of an intervention including adherence, dosage, quality, and engagement. Unfortunately, treatment integrity is not routinely monitored in consultation. A systematic framework is needed to better prepare practitioners to assess, analyze, and intervene when there are treatment integrity failures. A framework for monitoring and improving multiple dimensions of treatment integrity in natural settings is proposed to provide guidance to practitioners through two phases. The first phase focuses on improving initial treatment integrity and the second phase outlines a problem-solving process for improving treatment integrity.

A student’s motivational orientation is considered to be a predictor of a range of related education decisions, from attending classes to choosing a particular course or a profession. This survey study conducted with student volunteers (males = 519; females = 904) enrolled in secondary school science-math academic stream in Thailand investigated the relationship between measures of motivation (achievement goal orientation and physics and biology classroom anxiety) and aspirations for high earning science and math related careers. Results of multiple discriminant analyses showed gender differences in the motivational factors that influence career aspirations. Our interpretation of the findings highlights the significance of cultural beliefs about gender in decision making for careers.

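Building on Watson's three-tier framework of statistical literacy, this study develops a three-level framework for understanding measures of central tendency. A test based on the framework was administered to 155 students from a key senior high school and 176 students from an ordinary senior high school in Shanghai, and 11 of these students were interviewed. Most students could compute measures of central tendency correctly and could choose an appropriate measure for a given real-world context, but they had considerable difficulty interpreting the measure they used. The students' grasp of the mode and the mean was better than their grasp of the median. Students at the key high school understood measures of central tendency significantly better, in the statistical sense, than students at the ordinary high school.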

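Deepening reform of the cultural system and developing the cultural industry is a major decision of overall significance made by the Party Central Committee, and an urgent and important task for Hebei Province as it pursues faster and better development and builds a harmonious Hebei. The reform of Hebei's cultural system currently faces many problems that constrain its progress. By analyzing the current state of the reform, this article examines in depth the practical difficulties and problems the province faces and proposes measures to further strengthen cultural system reform in Hebei.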

This real-data-guided simulation study systematically evaluated the decision accuracy of complex decision rules combining multiple tests within different realistic curricula. Specifically, complex decision rules combining conjunctive aspects and compensatory aspects were evaluated. A conjunctive aspect requires a minimum level of performance, whereas a compensatory aspect requires an average level of performance. Simulations were performed to obtain students' true and observed score distributions and to manipulate several factors relevant to a higher education curriculum in practice. The results showed that the decision accuracy depends on the conjunctive (required minimum grade) and compensatory (required grade point average) aspects and their combination. Overall, within a complex compensatory decision rule the false negative rate is lower and the false positive rate higher compared to a conjunctive decision rule. For a conjunctive decision rule the reverse is true. Which rule is more accurate also depends on the average test reliability, average test correlation, and the number of reexaminations. This comparison highlights the importance of evaluating decision accuracy in high-stakes decisions, considering both the specific rule as well as the selected measures.

The current widespread availability of software packages with estimation features for testing structural equation models with binary indicators makes it possible to investigate many hypotheses about differences in proportions over time that are typically only tested with conventional categorical data analyses for matched pairs or repeated measures, such as McNemar’s chi-square. The connection between these conventional tests and simple longitudinal structural equation models is described. The equivalence of several conventional analyses and structural equation models reveals some foundational concepts underlying common longitudinal modeling strategies and brings to light a number of possible modeling extensions that will allow investigators to pursue more complex research questions involving multiple repeated proportion contrasts, mixed between-subjects × within-subjects interactions, and comparisons of estimated membership proportions using latent class factors with multiple indicators. Several models are illustrated, and the implications for using structural equation models for comparing binary repeated measures or matched pairs are discussed.
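
For readers unfamiliar with the conventional test mentioned in the abstract above, the following is a minimal Python sketch of McNemar's chi-square for matched pairs, without continuity correction. It is not drawn from the article: the function name and the example counts are hypothetical, and the sketch covers only the conventional test, not the structural equation models the article relates it to. With b and c denoting the two discordant cell counts, the statistic is (b - c)^2 / (b + c) on one degree of freedom.

```python
import math


def mcnemar_chi_square(b: int, c: int) -> tuple[float, float]:
    """McNemar's chi-square (no continuity correction) for paired binary data.

    b: pairs scored 1 at time 1 and 0 at time 2 (hypothetical labeling)
    c: pairs scored 0 at time 1 and 1 at time 2
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("discordant counts must be non-negative")
    if b + c == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")
    statistic = (b - c) ** 2 / (b + c)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Made-up example: 15 pairs changed from 1 to 0, and 5 changed from 0 to 1.
statistic, p_value = mcnemar_chi_square(15, 5)
```

Only the discordant pairs enter the statistic; pairs with the same response at both time points carry no information about a change in proportions.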

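Several Measures to Ensure the Quality of Graduation Projects   Cited by: 5 (self-citations: 0, citations by others: 5)
The graduation project is an indispensable teaching component of engineering programs and an important process for developing students' practical engineering skills and their ability to apply professional knowledge in an integrated way. Addressing the problems found in graduation projects and drawing on many years of practice, this article discusses measures for ensuring graduation project quality in terms of topic selection, supervisor competence, project management, computer application, and grading, and proposes some practical approaches.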

Standardized tests have been increasingly controversial over recent years in high-stakes admission decisions. Their role in operationalizing definitions of merit and qualification is widely contested, and in law schools the challenge has become particularly intense. Law schools have relied on the Law School Admission Test (LSAT) and an INDEX (which includes grade point average [GPA]) since the 1940s. The LSAT measures analytic and logical reasoning and reading. Research has focused on the validity of the LSAT as a predictor of 1st-year GPA in law school, with almost no research on predicting lawyering effectiveness. This article compares the potential of the LSAT with that of noncognitive (e.g., personality, situational judgment, and biographical information) predictors of lawyering effectiveness. Theoretical links between 26 lawyering effectiveness factors and potential predictors are discussed and evaluated. Implications for broadening the criterion space, diversity in admissions, and the practice of law are discussed.
