Similar literature
1.
Indirect tests of writing competency are often used at the college level for a variety of educational, programmatic, and research purposes. Although such tests may have been validated on hearing populations, it cannot be assumed that they validly assess the writing competency of deaf and hard-of-hearing students. This study used a direct criterion measure of writing competency to determine the criterion validity of two indirect measures of writing competency. Results suggest that the validity of indirect writing tests for deaf and hard-of-hearing baccalaureate-level students is weak. We recommend that direct writing tests be used with this population to ensure fair and accurate assessment of writing competency.

2.
As society becomes increasingly digital, teachers must be trained to integrate technology effectively into their classrooms. Teachers’ technological pedagogical knowledge (TPK), as defined in the TPACK framework, is considered an important prerequisite for effectively integrating technology. The TPACK framework has received a great deal of attention, yet few knowledge tests have been developed that directly assess TPK. However, those tests are crucial for evaluating the effectiveness of teacher education courses on technology integration. We have developed a 17-item test that covers teacher knowledge about various digital technologies as employed in teaching. Experts rated the items to represent the construct adequately. Data obtained from 245 pre-service teachers support the test’s internal structure. Concerning convergent and discriminant validity, the pre-service teachers’ test scores were not related to their self-reported TPK, but to their self-reported technological knowledge. The test was sensitive to changes in pre-service teachers’ TPK through teacher education courses.

3.
This article concentrates on the validity and reliability of portfolio assessment as used in pre-service teacher education. It is not possible to make general pronouncements about the validity of portfolio assessment in pre-service teacher education as there are multiple portfolio applications. The validity depends on the purpose, namely the diverse competencies which the course organisers wish to assess with it. Therefore, three categories of competencies and consequently three types of portfolios were distinguished in order to determine the validity of portfolio assessment. For the assessment of teaching and partnership competencies, it is argued that the validity is low due to the roundabout nature of the assessment. On the contrary, the validity of portfolio assessment for learning competencies can be high. The execution of a self-regulated learning process can be accurately assessed using portfolios. The reliability of portfolio assessment is problematic, since it is incapable of fulfilling the classic psychometric requirement of reliability. Nevertheless, provided that the necessary measures are taken, the reliability of portfolio assessment can still be brought to an acceptable level. Five measures are proposed.

4.
The No Child Left Behind Act resulted in an increased reliance on large-scale standardized tests to assess the progress of individual students as well as schools. In addition, emphasis was placed on including all students in the testing programs, including those with disabilities. As a result, the role of testing accommodations has become more central in discussions about test fairness and accessibility as well as evidence of validity. This study seeks to examine whether there exists differential item functioning for math and language items between special education examinees receiving accommodations and those not receiving accommodations.

5.
States are increasingly requiring that public school teachers pass one or more tests as a condition for permanent employment. As a result of a recent federal court decision, these tests must now satisfy the same legal standards as other employment tests. Moreover, some of the measures used to assess teacher competence no longer rely on multiple-choice items. They now utilize various types of open-ended performance assessments. This article discusses how these developments may affect the adverse impact, reliability, validity, and pass-fail standards of teacher certification tests. The article concludes by recommending that such tests combine multiple-choice questions with open-ended tasks that focus on the common or critical situations that are likely to arise across the full range of practice settings for which the teacher is being certified or licensed.

7.
Classroom observations are increasingly common in education policies as a means to assess the quality of teachers and/or education programs for purposes of making high-stakes decisions. This article considers one policy, the Head Start Designation Renewal System (DRS), which involves classroom observations to assess the quality of Head Start programs in order to decide whether their funding is renewed. This article applies an argument-based approach for evaluating the validity of observational assessments that (a) explicates assumptions that underlie the presumed logic, leading from the collection of scores from observations of Head Start classrooms, to the inference that scores assess the quality of Head Start programs, to the decision to renew funding to Head Start programs, and (b) summarizes evidence that speaks to the plausibility of each assumption. There was limited evidence to support the plausibility of many assumptions, including those pertaining to score generalizability, predictive validity, and the cutoff scores set as minimum standards of quality. Implications for improving the validity of classroom observations and the accuracy and fairness of decisions in the Head Start DRS are discussed.

8.
Five reading diagnostic tests were administered to twenty-seven fourth-grade children in an attempt to assess their inter-relationships and comparative validity. Teacher ratings, standardized test scores, and grades were used as criteria. All tests had acceptable validity coefficients, although they were somewhat lower than previous results. The Bond, McCullough, and Doren tests were quite similar and their validities were somewhat higher than those of the Roswell and McKee. Vowel-related subtests contributed most heavily to the relationship between tests and criteria, and reading-arithmetic relationships were frequently higher than reading-reading relationships.

9.
Higher education institutions are increasingly concerned about accreditation. Although sustainable market orientation (SMO) bears on academic accreditation, to date, no study has developed a valid scale of SMO or assessed its influence on accreditation. The purpose of this paper is to construct and validate an SMO scale that was developed in Egyptian faculties. SMO is identified as a one-dimensional construct consisting of four overlapping components. Using a survey, data were collected from 204 respondents in 6 Egyptian-accredited governmental faculties. Both item analysis and split-half methods were used to purify the measurement scale and assess its stability. Exploratory factor analysis was used to assess dimensionality, and confirmatory factor analysis was used to examine the construct and convergent/discriminant validity. Nomological validity was assessed with a structural equation model. Results suggest both a validated scale and empirical evidence of the influence of SMO on academic accreditation.

10.
The validity of most psychological and educational tests is established using correlational procedures examining the linear relationship between performance on the two instruments. Concurrent validity developed in this manner is commonly viewed as verification of the acceptability of a test. Few studies exist examining the degree to which test performance covaries with real-life performance appraisals. This study examined the concurrent validity of the WRAT-R and the K-TEA with teacher estimates of actual classroom levels of performance in reading and mathematics. Participants were 134 third and fourth graders enrolled in a regular education setting. In addition, this study compared the test performance of average students on two widely used standardized educational achievement tests in order to determine whether the tests yielded significantly different performance estimates relative to grade level functioning.

11.
Two parallel versions of a Test of Science Investigation Skills were developed to assess students' application of science investigation skills in biology and physics contexts. Repeated pilot testing and critical appraisal were used to ensure the validity of the tests and their equivalence. Both versions of the test were administered to 112 Year 10 science students. The results indicated a satisfactory level of test reliability; the test set in a physics context proved significantly more difficult than the test set in a biology context; and mean scores for male and female students were not significantly different.

12.
This article is part of a special LDRP research-to-practice series introducing key concepts to enable special education practitioners and other nonresearchers to be more informed research consumers. In the article, we explore how social validity is assessed in special education research and how to interpret social validity assessments. Rather than focusing on measuring intervention effects, social validity involves assessing the social importance of the goals, procedures, and outcomes of interventions and programs. We define social validity, provide questions to consider when examining assessments of social validity in research papers, review approaches commonly used to assess social validity with examples from the research literature, and make recommendations for reconciling findings of positive intervention effects on targeted outcomes but absent or negative findings related to social validity in a study. Our take-home message is that considering social validity assessments helps research consumers interpret study findings and informs how to apply findings in practice.

13.
Extensive but separate bodies of research in education concern the constructs of school climate and school connectedness/belonging. In the interests of advancing a more integrated approach, a new measurement tool is developed: the School Climate and School Identification Measure–Student (SCASIM-St). This scale builds on the Moos (1973) framework, which assesses relationships, personal growth, and system management in schools. The social identity approach to group processes (Tajfel & Turner, 1979; Turner, Hogg, Oakes, Reicher, & Wetherell, 1987) is used to extend work on school connectedness and belonging through the inclusion of a measure of social identification. A range of methods across three studies are designed to assess the reliability and validity of SCASIM-St (N = 7209, Australian grades 7–10 students). These include confirmatory factor analysis, test-retest analysis, and convergent validity (Study 1 and 2). Additionally, measurement invariance tests for student sub-groups regarding gender, grade level, and non-English language were employed in Study 3. It also included criterion validity analysis using multilevel models for the key outcome measures of students’ academic achievement, well-being, and aggressive behaviors. All of these analyses indicate that SCASIM-St is an effective measure. Theoretical and practical implications as well as future directions are outlined.

14.
The National Assessment Program – Literacy and Numeracy (NAPLAN) in Australia is a series of literacy and numeracy tests that are used for purposes of school comparison. This paper argues that a key question for this use lies in whether or not this is a reasonable, or valid, use of the test data. Using Kane’s argumentative approach to validity, this paper argues that the comparisons of the quality of student achievement made available on the My School Website have low validity due to the lack of regard to rates of participation in schools. In bringing together the literature that addresses the ‘new governance’ of education through testing and an approach to validity that addresses the technical aspects of test score interpretation, with the ethics of how test scores are used and applied, this study identifies validity as an important consideration in comparative analyses of student achievement data. The identification of the need to consider participation in such comparisons through the application of the argumentative approach to validity highlights the contribution of this article not only to the testing field but also to critical policy literature.

15.
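The reliability and validity of English oral tests are affected by many factors, including the test format, the scoring criteria, and the quality of the examiners. Improving the validity and reliability of English oral tests requires keeping test format and content consistent and designing scoring criteria that are scientific, objective, and practical to apply. English oral tests with high reliability and validity have a positive washback effect on teaching.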

16.
A syllabus analysis instrument was developed to assist program evaluators, administrators and faculty in the identification of skills that students use as they complete their college coursework. While this instrument can be tailored for use with a variety of learning domains, we used it to assess students' use of and exposure to computer technology skills. The reliability and validity of the instrument were examined through an analysis of 88 syllabi from courses within the teacher education program and the core curriculum at a private Midwest US university. Results indicate that the instrument has good inter-rater reliability, and ratings by and interviews with faculty and students provide evidence of construct validity. The use and limitations of the instrument in educational program evaluation are discussed.

17.
Speededness refers to the situation where the time limits on a standardized test do not allow substantial numbers of examinees to fully consider all test items. When tests are not intended to measure speed of responding, speededness introduces a severe threat to the validity of interpretations based on test scores. In this article, we describe test speededness, its potential threats to validity, and traditional and modern methods that can be used to assess the presence of speededness. We argue that more attention must be paid to this issue and that more research must be done to set appropriate time limits on power tests so that speed of responding does not interfere with the construct measured.

18.
One way to assess the quality of education in post-secondary institutions is through the use of performance indicators. Studies that have compared currently popular process indicators (e.g., library size, percentage of faculty with PhD) found that after controlling for incoming student ability, these process indicators tend to be weakly associated with student outcomes (Pascarella and Terenzini, 2005). In addition, while much research has found that students increase their critical thinking skills as a result of attending college, little is known about what goes on during the college experience that contributes to this. The purpose of this research was to examine the validity of higher-order questions on tests and assignments as a process indicator by comparing it with gains in critical thinking skills among college students as an outcome indicator. The present research consisted of three studies that used different designs, samples, and instruments. Overall, it was found that frequency of higher-order questions can be a valid process indicator as it is related to gains in students’ critical thinking skills.

19.
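Cross-cultural psychological research is an important part of ethnic education research in China. A large share of it uses psychological measurement as its research tool, and the psychological scales used come mainly from European and American countries. How to use these scales in China's multi-ethnic cultural context, and how to ensure the reliability and validity of psychological measurement in cross-cultural psychological research, are major questions that bear on whether the research is scientifically sound. This article discusses the relationship between cross-cultural factors (time, region, and language) and psychological measurement, along with the role each plays in measurement, and proposes that the principle of cultural fairness must be upheld in cross-cultural psychological measurement research in ethnic education.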

20.
To assess the concurrent validity of standardized achievement tests using teachers' ratings (and rankings) of pupils' academic achievement as criteria, 42 teachers evaluated each of their students (n = 1,032) in each of five major curricular areas prior to the administration of a battery of standardized achievement tests. The teachers were directed to rate each student's proficiency disregarding attendance, attitude, deportment, and so on. Within-class correlation coefficients were computed to eliminate rater leniency bias. The standardized achievement tests were found to have substantial concurrent validity in reading, math, language arts, science, and social studies. The normalized teacher ranks yielded significantly higher validity coefficients than did the ratings, although the magnitude of the difference was small. The concurrent validity coefficients for language arts, reading, and math were significantly higher than those in science and social studies.
