Similar Literature
 20 similar articles found; search took 31 ms
1.
There are few empirical investigations of the consequences of using widely recommended data collection procedures in conjunction with a specific standard-setting method such as the Angoff (1971) procedure. Such recommendations include the use of several types of judges, the provision of normative information on examinees' test performance, and the opportunity to discuss and reconsider initial recommendations in an iterative standard-setting procedure. This study of 236 expert judges investigated the effects of using these recommended procedures on (a) average recommended test standards, (b) the variability of recommended test standards, and (c) the reliability of recommended standards for seven subtests of the National Teacher Examinations Communication Skills and General Knowledge Tests. Small, but sometimes statistically significant, changes in mean recommended test standards were observed when judges were allowed to reconsider their initial recommendations following review of normative information and discussion. Means for public school judges changed more than did those for college or university judges. In addition, there was a significant reduction in the within-group variability of standards recommended for several subtests. Methods for estimating the reliability of recommended test standards proposed by Kane and Wilson (1984) were applied, and their hypothesis of positive covariation between empirical item difficulties and mean recommended standards was confirmed. The data collection procedures examined in this study resulted in substantial increases in the reliability of recommended test standards.

2.
The K‐5 reading standards within the English Language Arts Common Core State Standards provide guidance to teachers about grade level expectations for students. Though the authors of the standards acknowledge that some students may experience difficulty reaching the rigorous expectations, they explain that the standards outline a pathway to proficiency for all students, including those who struggle with literacy. Students with learning disabilities, who often have significant literacy difficulties, may face particular challenges when their instruction is framed by these standards. This article unpacks the complex K‐5 reading standards and provides a discussion of the implications for students with learning disabilities and their general and special education teachers. Examples from K‐5 lessons and recommendations for teachers and researchers are provided.

3.
The purpose of the present study was to extend past work with the Angoff method for setting standards by examining judgments at the judge level rather than the panel level. The focus was on investigating the relationship between observed Angoff standard setting judgments and empirical conditional probabilities. This relationship has been used as a measure of internal consistency by previous researchers. Results indicated that judges varied in the degree to which they were able to produce internally consistent ratings; some judges produced ratings that were highly correlated with empirical conditional probabilities and other judges’ ratings had essentially no correlation with the conditional probabilities. The results also showed that weighting procedures applied to individual judgments both increased panel-level internal consistency and produced convergence across panels.

4.
The Northern Ireland Curriculum, like the English National Curriculum, records pupil achievement on a 10‐level scale. The level to which a pupil is ‘assigned’ at the end of a Key Stage is based upon two sources of assessment information: classroom‐based measures provided by the teacher and summative information from Common Assessment Instruments (CAIs), which are pen‐and‐paper tests taken at the end of the Key Stage. CAIs play a central role in confirming the accuracy with which teachers judge the level at which a pupil is working. While the teacher might judge a pupil to have mastered level 7 in Algebra, for example, based upon observation in class, test data and homework, the CAI will only confirm this level if the pupil scores above the level 7 cutscore on the CAI. If this cutscore does not accord with a reliable measure of what constitutes level 7 performance in Algebra in the classroom, there is likely to be misclassification of pupils with attendant difficulties for the efficient planning of teaching and learning. Misclassifications can be minimised when examiners and teachers interpret level 7 achievement in Algebra similarly. The Angoff standard‐setting procedure was used to establish level 5 cutscores in the Number and Handling Data tests of the mathematics CAI so that comparisons might be made between the published level 5 cutscores and those which result from a judgemental standard‐setting procedure. The 21 teachers involved in the procedure were offered the opportunity to recommend a level 5 ‘standard’ using the Angoff methodology, and to review their recommendations in the light of test data from the February 1993 CAI administration. A further opportunity was offered following a discussion during which individual teachers articulated their reasons for the standards they recommended. The results confirm that the reliability of recommended standards increases both as a consequence of receiving normative data and of discussion. All statistical measures reported in this article indicate that the procedure could command the confidence of examiners, teachers and the public. While the recommended cutscore for Number is in close accord with that published by the examiners, the extent of the mismatch in the Handling Data test is such as might give rise to some misclassification of pupils. It is important to stress that this mismatch had no real consequences since 1993 was a pilot year and no test outcomes were reported. The article concludes with an outline of the contribution which the Angoff methodology can make to the resolution of some of the difficulties faced by English national assessment, as identified in Sir Ron Dearing's interim report “The National Curriculum and its Assessment”.

5.
Who should make judgments about test standards? Who is an expert? How many judges should be used in a standard-setting study? What is the relationship between the number of judges and the standard error of the test?

6.
7.
The Survey of Assessment Beliefs (SAB) was developed to measure teacher candidates' perceptions about grading practices. After piloting, the SAB was administered to 222 teacher candidates at a large northeastern urban university, along with a measure of their beliefs about teaching. Candidates were found to support many grading practices not recommended by professional standards. Support for grading practices that deviate from professional recommendations was positively associated with support for constructivist approaches. Significant differences were found in grading and teaching attitudes between elementary and secondary education teacher candidates. Teacher candidates became more moderate in endorsing nonstandard grading practices following coursework in classroom assessment but on average maintained a tendency to approve academically enabling grading practices. This study provides empirical evidence about possible areas of tension between constructivist learning theory and principles of educational measurement, and it helps classroom assessment teachers understand the needs of their target audiences.

8.
This paper provides a detailed analysis of the inclusion of aspects of nature of science (NOS) in the Next Generation Science Standards (NGSS). In this new standards document, NOS elements in eight categories are discussed in Appendix H along with illustrative statements (called exemplars). Many, but not all, of these exemplars are linked to the standards by their association with either the “practices of science” or “crosscutting concepts,” but curiously not with the recommendations for science content. The study investigated all aspects of NOS in NGSS including the accuracy and inclusion of the supporting exemplar statements and the relationship of NOS in NGSS to other aspects of NOS to support teaching and learning science. We found that while 92% of these exemplars are acceptable, only 78% of those written actually appear with the standards. “Science as a way of knowing” is a recommended NOS category in NGSS but is not included with the standards. Also, several other NOS elements fail to be included at all grade levels thus limiting their impact. Finally, NGSS fails to include or insufficiently emphasize several frequently recommended NOS elements such as creativity and subjectivity. The paper concludes with a list of concerns and solutions to the challenges of NOS in NGSS.

9.
The purpose of this study was to determine if a linear procedure, typically applied to an entire examination when equating scores and rescaling judges' standards, could be used with individual item data gathered through Angoff's standard-setting method (1971). Specifically, experts' estimates of borderline group performance on one form of a test were transformed to be on the same scale as experts' estimates of borderline group performance on another form of the test. The transformations were based on examinees' responses to the items and on judges' estimates of borderline group performance. The transformed values were compared to the actual estimates provided by a group of judges. The equated and rescaled values were reasonably close to those actually assigned by the experts. Bias in the estimates was also relatively small. In general, the rescaling procedure was more accurate than the equating procedure, especially when the examinee sample size for equating was small.

10.
This study was designed to investigate the extent to which a policy regarding graduate admission standards exists among selected graduate faculty members at Colorado State College. Twenty judges utilized Normative Judgment Analysis techniques to generate a projected criterion of graduate school success on profile data for thirty randomly selected doctoral graduates. The results of the study indicate that essentially one policy was being expressed by the judges.

11.
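Reflections on accounting education in higher education institutions under the new accounting standards   Total citations: 2, self-citations: 0, citations by others: 2
Compared with the former accounting standards, the new accounting standards introduce major changes in areas such as fair value, inventories, and intangible assets. By outlining the main changes in the new standards relative to the old ones, this paper briefly analyzes the impact that implementing the new standards has on educational work in higher education, including the accounting curriculum, teachers' professional knowledge, and the training and evaluation of accounting professionals, and proposes ways to respond to these impacts.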

12.
OBJECTIVES: Our goal was to examine how professionals assess children at risk and their parents, and decide on particular interventions. Specifically, we explored whether their assessments and decision-making are influenced by (1) the mother's degree of cooperativeness and/or (2) the country in which the worker lives (Canada or Israel). METHOD: Workers working in the child welfare field (N = 181) were presented with a case vignette and asked to assess the child and parents, and the degree of risk to the child, and make an intervention recommendation. The measures used in this study were based on previous work and field-tested in both countries. RESULTS: Significant differences were found between the two countries regarding workers' age and level of experience, with Canadians being older and more experienced than Israelis. Significant differences were also found between the two countries regarding the assessments of the child and parents and also of risk to the child, with Canadians assessing significantly more stringently than Israelis. The difference in levels of experience between the two countries did not explain these differences; however, it did influence intervention recommendations, only for those with 3 years or more of experience. Within this group, significantly more Canadians than Israelis recommended removing the child from the home. Regarding maternal cooperativeness, this factor did affect workers' assessments of the mother, but not of the father or child, or the worker's recommended intervention. Israelis' assessments were significantly more influenced by the mother's cooperativeness than Canadians'. CONCLUSIONS: Significant differences were found between the Canadian and Israeli professionals in this study in both their assessments and their intervention recommendations. These appear to reflect the different social, cultural, and political contexts in which these professionals work, and underscore the value of cross-national comparative studies in child welfare.

13.
An increasingly regulated higher education sector is renewing its attention to those activities referred to as ‘moderation’ in its efforts to ensure that judgements of student achievement are based on appropriate standards. Moderation practices conducted throughout the assessment process can result in purposes identified as equity, justification, accountability and community building. This paper draws on the limited studies of moderation and wider relevant research on judgement, standards and professional learning to test commonly used moderation practices against these identified purposes. The paper concludes with recommendations for maximising the potential of moderation practices to establish and maintain achievement standards.

14.
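In the United Kingdom, the relationship between law and politics is reflected mainly in the protection of constitutional rights, because constitutional rights shape not only the particular notions of right and wrong held by Members of Parliament, government officials, and judicial officers, but also the procedures of legislation, law enforcement, and adjudication. As regards the general theory of the relationship between politics and law, the discussion bears directly on British politics, law, and human activity; it reflects the relationship between constitutional human rights and legislation in the UK and the ways in which those rights are protected; and, through the interaction between UK human rights law and the politically charged rulings of British judges, it interprets the possible impact on British society and on the shaping of the procedures by which British law and policy are made.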

15.
Counselor education, through its formal organization (ACES), is in the process of examining its standards for program accreditation used for the past decade. This discussion examines current educational forces, such as social change, political conditions, and learning process developments as a basis for a new look at the standards for counselor preparation. Based on these influences, four categories of situational and philosophical issues are postulated: educational and psychological assumptions, outcome behaviors of the counselor education graduate, outcomes of counselor education programming, and outcome behaviors of professional guidance personnel. It is recommended that outcomes of counselor education should be considered in performance terms before instructional strategies are developed.

16.
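The rational allocation of judicial personnel has not only long been neglected by academia; in-depth reflection on it within the court system as a whole is also scarce. At present, the allocation of judges in China suffers from problems such as the overly broad use of the title of judge, an unbalanced regional distribution of judges, and uneven work efficiency among judges. Remedies include streamlining the corps of judges, quantifying judges' workloads, and managing judicial resources dynamically so that judges can move both vertically and horizontally; only then can allocation be optimized and judges be used to greatest effect.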

17.
The purpose of this study was to determine the status of preparation programs leading to licensure in broad‐field social studies, grades 7–12, at state regional colleges and universities and the relationship of these programs of study to National Council of Social Studies (NCSS) Standards, state licensure requirements, degree programs of study at state supported major research institutions, and major reports on educational reform. The research found that of the 30 institutions studied only 21 offered degree programs leading to licensure in broad‐field social studies. All 21 institutions exceeded state minimum requirements and although the mean semester hour requirements in general education, professional education, and specialty studies components minimally exceeded the NCSS recommended standards and were similar to those at major research institutions, there was an absence of assurance by institutions that students possessed a depth of understanding in the disciplines within the major. Likewise, degree programs have been minimally affected by the recommendations of the major educational reform reports.

18.
Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations. Although all cut score decisions are by nature somewhat arbitrary, they should not be capricious. Judges selected for standard‐setting panels should have the proper qualifications to make the judgments asked of them; however, even qualified judges vary in expertise and in some cases, such as highly specialized areas or when members of the public are involved, it may be difficult to ensure that each member of a standard‐setting panel has the requisite expertise to make qualified judgments. Given the subjective nature of these types of judgments, and that a large part of the validity argument for an exam lies in the robustness of its passing standard, an examination of the influence of judge proficiency on the judgments is warranted. This study explores the use of the many‐facet Rasch model as a method for adjusting modified Angoff standard‐setting ratings based on judges’ proficiency levels. The results suggest differences in the severity and quality of standard‐setting judgments across levels of judge proficiency, such that judges who answered easy items incorrectly tended to perceive them as easier, but those who answered correctly tended to provide ratings within normal stochastic limits.

19.
Social networks enable people with intellectual disabilities (ID) to participate actively in society and to promote their self-determination. However, concerns have been raised regarding the potential limitations of people with ID to deal with untrustworthy information sources on the Internet. In an experiment, we assessed how adult students with ID evaluated recommendations in Internet forums authored by either self-reported experts or by users under pseudonyms who supported their claim either with documentary sources or their personal experience. We compared the performances of students with ID to that of students of similar ages but higher educational levels (chronological age-matched control group) and to younger students with similar verbal mental age (verbal mental age-matched control group). Participants were asked to evaluate to what extent a fictitious user should follow particular recommendations given in a forum and to justify their evaluations by writing a message to the fictitious user. Students with ID, as opposed to the two control groups, recommended the forum advice to a higher extent regardless of authorship and evidence used, and they included in their messages to the fictitious user a higher number of opinions and information sources not present in the forum without linking them to the actual discussion. The pattern of results suggested that students with ID have a limited ability to evaluate recommendations in forums and that they do not necessarily present a delay in the development of these abilities, but rather an atypical development. Finally, we discussed the potential implications for teaching digital literacy to students with ID.

20.
Cut‐scores were set by expert judges on assessments of reading and listening comprehension of English as a foreign language (EFL), using the bookmark standard‐setting method to differentiate proficiency levels defined by the Common European Framework of Reference (CEFR). Assessments contained stratified item samples drawn from extensive item pools, calibrated using Rasch models on the basis of examinee responses of a German nationwide assessment of secondary school language performance. The results suggest significant effects of item sampling strategies for the bookmark method on cut‐score recommendations, as well as significant cut‐score judgment revision over cut‐score placement rounds. Results are discussed within a framework of establishing validity evidence supporting cut‐score recommendations using the widely employed bookmark method.
