Similar literature
1.
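This paper uses methods from educational measurement to define the reliability and validity of educational evaluation, analyzes the biases that can arise in evaluation and their effects on reliability and validity, and proposes ways to prevent and correct evaluation bias and so improve the reliability and validity of educational evaluation.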

2.
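Reliability and validity are two key indicators for evaluating the quality of a test paper. By analyzing the reliability and validity of the in-house College English tests at Xihua University, this paper describes the factors that affect reliability and validity and then, drawing on the data analysis and practical experience, offers several suggestions for improving test reliability and validity.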

3.
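On the Reliability and Validity of Adult Education Evaluation. Zhang Limin (Shenzhen Training Center, Ministry of Machinery Industry). Because adult education differs from general education and has many characteristics of its own, its evaluation should also differ from the evaluation of general education. This paper attempts to discuss adult education evaluation from the standpoint of the reliability and validity of educational evaluation. I. The reliability and validity of educational evaluation...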

4.
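This paper briefly introduces the two most critical quality indicators for evaluating language tests: reliability and validity. After discussing the relationship between the two, it explores how to balance them so as to maximize the overall usefulness of a test, and offers some suggestions for the reform of language testing.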

5.
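The College English Test Band 4 (CET-4) is a proficiency test. In proficiency testing, reliability and validity are the two main criteria for evaluating CET-4. Whether a test succeeds depends largely on how well it meets these two criteria, so every effort should be made to improve its reliability and validity. Analyzing the reliability and validity of the listening section of the reformed CET-4 also offers some guidance for college English listening instruction.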

6.
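A bibliometric analysis of the international journal Assessment & Evaluation in Higher Education (AEHE) finds that international research on higher education assessment shares four features: a student-centered orientation, a focus on feedback, exploration of diverse assessment methods and techniques, and an emphasis on formative assessment. Australian research pays more attention to assessment design in higher education, British research to students' academic achievement in higher education, American research to reliability and validity, and Chinese research to students' perceptions and views. International research can be grouped into three topic areas: teaching and learning assessment, research and scholarship assessment, and assessment reliability and validity. Of these, teaching and learning assessment and assessment reliability and validity will remain the "fundamentals" of scholarly research, while research on the assessment of research and academic integrity will become a new academic "growth point". Future research in China should deepen its exploration of empirical methods, make its topics more fine-grained and focused, pay closer attention to the reliability and validity of research, and attend to the theory and practice of higher education assessment in the digital age.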

7.
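Building on a brief analysis of the nature, content, assessment objectives, and assessment scheme of the IBDP Language and Literature course, this paper compares IB Language and Literature with the Chinese-language paper of the national college entrance examination (Gaokao) in terms of assessment reliability and validity. The study finds that the assessment approach of the IBDP Language and Literature course has relatively high validity and reliability. Finally, it offers some concrete suggestions on how the Gaokao Chinese paper might set its assessment methods and criteria.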

8.
Teacher Education Research (《教师教育研究》), 2017, (5): 81-88
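This study constructs a kindergarten evaluation indicator system along three dimensions: structural quality, process quality, and outcome quality. It uses item analysis, exploratory factor analysis, and confirmatory factor analysis to carry out a standardized test of the indicator system for kindergarten care-and-education quality, analyzes the reliability and validity of a questionnaire on the importance of the quality indicators, and uses principal component analysis to compute the indicator weights. The results show that kindergarten care-and-education quality can be divided into three dimensions: structural, process, and outcome quality. The questionnaire's internal-consistency reliability, split-half reliability, test-retest reliability, and construct validity were all good. The weight coefficients of the structural, process, and outcome indicators were 0.14, 0.61, and 0.25 respectively, so the process indicators carry the largest weight.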

9.
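Improving the Reliability and Validity of Educational Evaluation   Cited by: 4 (self-citations: 0; citations by others: 4)
Ren Xiulin. In accordance with the State Education Commission's "Opinions on the Evaluation of Adult Institutions of Higher Education of All Types", Hebei Radio and TV University completed its evaluation of the city (prefecture) and county-level radio and TV universities between 1994 and 1996. Drawing on the evaluation work carried out in the province, the author believes that improving the reliability and validity of educational evaluation and carrying out evaluation well will certainly, for the development of radio and TV university education, ...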

10.
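The article first reviews the concepts of reliability and validity and the methods for testing them. On that basis, it subjects the collected computer scores and expert human scores to correlation analysis, reliability testing, repeated-measures analysis of variance, independent-samples t-tests, and qualitative analysis, verifying the reliability and validity of the multi-dimensional scoring system from several angles. The results show that the system has good internal consistency and good reliability, although reliability is lower when the proportion of initial scores is high. Comparison with expert scores shows that the automated scoring results have weak explanatory power for writing ability in two genres, expository writing and practical writing.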

11.
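That a score does not exactly represent a test taker's true language ability is one of the most fundamental and thorny problems in language measurement: the problem of validity. Measures adopted in the past, such as adding raters or re-rating, improved validity to some extent, but none could truly yield an objective score as close as possible to the true score. Longford proposed four score adjustment models to address the reliability problem in subjective scoring. This paper applies the severity adjustment model to the scores given by aberrant raters in the scoring of HSK (Advanced) essays, and the scores improved considerably after adjustment. In future examinations, this mathematical adjustment method can therefore largely replace the previous practice of organizing raters to re-rate.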

12.
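In subjective testing, many factors lower the reliability and validity of the final results, so identifying and analyzing those factors is especially important. This paper introduces the many-facet model based on item response theory: its background, its basic framework, its typical applications in educational assessment in China and abroad, and its limitations. It thereby shows that the many-facet model, as a new assessment model, can identify fairly comprehensively the factors that affect test reliability and validity, in particular rater subjectivity effects, and can analyze them objectively. In recent years the model has been applied more and more widely in educational assessment in China and abroad.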

13.
This article concentrates on the validity and reliability of portfolio assessment as used in pre-service teacher education. It is not possible to make general pronouncements about the validity of portfolio assessment in pre-service teacher education as there are multiple portfolio applications. The validity depends on the purpose, namely the diverse competencies which the course organisers wish to assess with it. Therefore, three categories of competencies and consequently three types of portfolios were distinguished in order to determine the validity of portfolio assessment. For the assessment of teaching and partnership competencies, it is argued that the validity is low due to the roundabout nature of the assessment. In contrast, the validity of portfolio assessment for learning competencies can be high. The execution of a self-regulated learning process can be accurately assessed using portfolios. The reliability of portfolio assessment is problematic, since it is incapable of fulfilling the classic psychometric requirement of reliability. Nevertheless, provided that the necessary measures are taken, the reliability of portfolio assessment can still be brought to an acceptable level. Five measures are proposed.

14.
Peer assessment exercises yield varied reliability and validity. To maximise reliability and validity, the literature recommends adopting various design principles including the use of explicit assessment criteria. Counter to this literature, we report a peer assessment exercise in which criteria were deliberately avoided yet acceptable reliability and validity were achieved. Based on this finding, we make two arguments. First, the comparative judgement approach adopted can be applied successfully in different contexts, including higher education and secondary school. Second, the success was due to this approach; an alternative technique based on absolute judgement yielded poor reliability and validity. We conclude that sound outcomes are achievable without assessment criteria, but success depends on how the peer assessment activity is designed.

15.
Assessing Writing, 2004, 9(3): 190-207
Specialists in the field of large-scale, high-stakes writing assessment have, over the last forty years, alternately discussed the issue of maximizing either reliability or validity in test design. Factors complicating the debate, such as Messick's (1989) expanded definition of validity and the ethical implications of testing, are explored. An inverse relationship between the loss of reliability and the loss of validity of a test is proffered. The term Quality, in reference to writing assessment, is defined and introduced. Construct complexity is hypothesized as a factor that influences validity, reliability, and quality. It is suggested that the either/or debate concerning emphasis on reliability or validity in test design be put aside in favor of a discussion on how to maximize the quality of an assessment. Insofar as this goal can be achieved, it is necessary in the design of the test to minimize and balance the loss of both validity and reliability. The discussion draws on literature from within the field of writing assessment and from works in the fields of mathematics and information theory.

16.
The terms accuracy and precision are consistently differentiated in the literature of engineering and the "hard" sciences. Precision shares a common core of meaning with reliability as used by behavioral scientists. Accuracy and validity have a similar semantic overlap. A review of the literature in educational and psychological measurement reveals an interchangeable usage of accuracy and precision in defining reliability. To help beginning students distinguish between validity and reliability, this paper advocates the use of precision, rather than accuracy, in describing reliability.

17.
The richness and complexity of video portfolios endanger both the reliability and validity of the assessment of teacher competencies. In a post-graduate teacher education program, the assessment of video portfolios was evaluated for its reliability, construct validity, and consequential validity. Although the video portfolios facilitated a reliable and valid assessment of teacher competencies, procedures to improve assessment quality were also revealed and are therefore discussed: more explicit grounding of assessment results in the data, peer debriefing, prolonged engagement with the assessment data, and cross-checking to find confirmatory or counter examples.

18.
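Student evaluation of teaching, as part of the system for evaluating teachers' classroom teaching, receives wide attention at universities in China and abroad. How far teachers and students accept it, however, bears directly on its effectiveness. Based on a survey of student evaluation of teaching at the author's university, this paper identifies problems in the validity and reliability of the evaluation, the evaluation indicator system, and the handling of evaluation results, analyzes their causes, and recommends strengthening publicity, improving evaluators' competence, designing evaluation indicators scientifically, and treating evaluation results with caution.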

19.
In today’s market-driven educational culture, universities are coming under increasing pressure to justify funding through the disclosure of measurable outcomes in education and research. One educational objective that receives particular attention is critical thinking, regarded as an essential skill in both academic and work environments. The assessment of critical thinking has become a significant enterprise, with a number of standardised tests available for both individuals and organisations. While these tests are based on well-known taxonomies of critical thinking, this paper argues that institutions should be wary of using them as a means to measure educational outcomes. First, they fail to take into account fundamentally contested issues within conceptions of critical thinking. They also have significant weaknesses in terms of validity and reliability. Finally, and most importantly, they provide only a limited assessment of critical thinking, failing to evaluate the skills exercised in real-life academic tasks. A more effective approach to critical thinking testing would be one implemented at a faculty level, with assessments carried out on coursework integral to the curricula of specific academic disciplines.

20.
This paper forms part of an exploration of assessment on one part-time higher education (HE) course: an in-service, professional qualification for teachers and trainers in the learning and skills sector which is delivered on a franchise basis across a network of further education colleges in the north of England. This paper proposes that the validity and reliability of portfolio-based assessment, a key component of many HE programmes in addition to the course being researched here, is contestable. Analysis of the processes of compiling portfolios for assessment, through the conceptual framework of the New Literacy Studies, suggests that the ways in which portfolios are assessed and the ways in which the crucial requisites of validity and reliability are assigned to them, mask complexities and contradictions in their creation by the student. This paper argues for a new, critical analysis of portfolio production and raises a number of questions about the validity, reliability and authenticity of the assessment process that the portfolios reify.
