Similar literature
 20 similar articles found
1.
The Course-Faculty Instrument (CFI) demonstrates similar measurement properties with student populations at four diverse institutions. These students agree about the nature and extent to which course and instructor attributes relate to their learning. The results suggest that: (1) a perceived learning criterion may have general relevance to students, and (2) validity extension research is an economically feasible alternative to full-scale instrument development and validation efforts. Since validity extension is practical and facilitates cross-institutional comparisons, it appears to be a more viable strategy for researching and instituting student evaluation systems than is suggested by its current usage.

2.
A multi-informant or multimethod approach has been suggested for use in educational evaluation and children's development assessment. However, in the study field of approaches to learning, most previous studies used one method to measure approaches to learning. In addition, compared with kindergarten and elementary children, younger children have received little attention. This study was dedicated to determining whether a multimethod approach (direct measure, teacher report, and parent report) was needed to assess preschool children's approaches to learning. A total of 713 preschool children were enrolled in this study. Correlations and multiple regressions were conducted to analyze the correlation among the three methods as well as their criterion validity based upon comparisons with an assessment of children's early childhood development. The results revealed significant but weak correlations among the three assessment methods. The direct measure of approaches to learning was more relevant to children's early childhood development than the teacher report and the parent report. The criterion validity of using the direct measure to assess preschool children's approaches to learning was also better than that of the teacher report and the parent report. Therefore, the direct measure was recommended for use in assessing preschool children's approaches to learning, and teacher report can be used as a supplement.

3.
4.
This paper describes the development and validation of an item bank designed for students to assess their own achievements across an undergraduate-degree programme in seven generic competences (i.e., problem-solving skills, critical-thinking skills, creative-thinking skills, ethical decision-making skills, effective communication skills, social interaction skills and global perspective). The Rasch modelling approach was adopted for instrument development and validation. A total of 425 items were developed. The content validity of these items was examined via six focus group interviews with target students, and the construct validity was verified against data collected from a large student sample (N = 1151). A matrix design was adopted to assemble the items in 26 test forms, which were distributed at random in each administration session. The results demonstrated that the item bank had high reliability and good construct validity. Cross-sectional comparisons of Years 1–4 students revealed patterns of changes over the years. Correlation analyses shed light on the relationships between the constructs. Implications are drawn to inform future efforts to develop the instrument, and suggestions are made regarding ways to use the instrument to enhance the teaching and learning of generic skills.

5.
When reliability and validity were introduced as validation criteria for empirical research in the human sciences, quantitative research methods prevailed, and theory of science relied on neopositivism (Vienna Circle) or postpositivism (scientific realism). Within this worldview, notions of reliability and validity as criteria of scientific goodness were introduced. Reliability and validity were associated with the correspondence theory of truth, which is mostly ill-suited to the needs of qualitative research. For that reason, qualitative research must look for other kinds of validation criteria. The article elaborates the problems arising when the correspondence theory of truth is used as an ultimate criterion in evaluating qualitative research and proposes Heidegger's hermeneutical or alethetical idea of truth as a more suitable approach.

6.
Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology, either quantitative or qualitative, on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations.

7.
Background: Validity theory has evolved significantly over the past 30 years in response to the increased use of assessments across scientific, social and educational settings. The overarching trajectory of this evolution reflects a shift from a purely quantitative, positivistic approach to a conception of validity reliant on the interpretation of multiple evidence sources integrated into validity arguments. Moreover, within contemporary validity, interpretation has been emphasised as a central process; however, despite this emphasis, there have been few explicit articulations of specific interpretive methodologies applicable to the practice of validation.

Purpose: To link contemporary theoretical foundations in validity to practical methods and structures to help guide the collection and analysis of interpretive validity evidence. By building upon existing validity theory, this paper aims to provide greater clarity on the practice of validation and contribute toward the larger developing framework for the validation of educational assessments.

Source of evidence: An interdisciplinary, integrative review of over 60 research articles and sources related to the theory and practice of educational validation and interpretive inquiry approaches. Sources include literature from the fields of educational assessment and more broadly social scientific research.

Main argument: As assessments in education increasingly aim to measure complex constructs that are value-laden and socially dependent, validity theory must keep pace and evolve in ways that address the inherent complexities associated with contemporary educational assessment. Through this paper, I assert that a greater understanding of interpretive methodologies represents one of the most promising areas for development of validation theory and practice. Specifically, I argue that dialectic, hermeneutic and transgressive forms of inquiry can be integrated within current argument-based structures for the collection, analysis and representation of validity evidence in several useful ways.

Conclusions: Interpretive inquiry processes, namely dialectic, hermeneutic and transgressive forms of interpretation, serve to expand validation practice to include diverse evidences for the generation of multiple-perspective validity arguments. The paper concludes with specific implications for future research and practice within the field of interpretive validity theory.

8.
The purpose of this study was to develop a questionnaire that could measure preservice mathematics teachers' mathematics educational values. Development and validation of the questionnaire involved a sequential inquiry in which design principles were established from the existing literature and a pool of items was constructed then submitted to experts for consideration of the construct validity. Alterations to the items based on their suggestions were made to produce a trial version of the questionnaire. A pilot study involving preservice mathematics teachers explored the validity and usefulness of the questionnaire. The pilot results were used to revise the questionnaire that was administered to a sample of preservice mathematics teachers attending Cumhuriyet University, Sivas, Turkey. Further explorations of the construct and structural validity, item contributions, and reliability were achieved by using a factor analysis and two different item analysis methods. Results revealed that the questionnaire included four factors, satisfactory item contributions, and acceptable internal consistency. One result obtained in this study suggested that some mathematics education values based on Western culture (e.g., accessibility–special) have not been accepted by Turkish preservice mathematics teachers.

9.
The purpose of this study was to test and validate the Engaged Teacher Scale (ETS) in a Turkish context (ETS-TR). In order to test the construct validity of the ETS, data were collected from 388 teachers in two northeast cities of Turkey. First-order confirmatory factor analysis results supported the 16-item and four-factor model of ETS while second-order confirmatory factor analysis suggested that a single factor was also appropriate for representing teacher engagement. Additionally, four multiple linear regression analyses were conducted to provide further validation evidence. Results showed that the ETS-TR subscales were positively correlated with teacher self-efficacy. Given our evidence of validity and reliability, we recommend that researchers interested in measuring the engagement of Turkish teachers consider using the ETS-TR. The adaptation of ETS into Turkish also provides a measure for use when conducting research examining cultural comparisons between English-speaking and Turkish teachers.

10.
Classwide supports were used to increase children's early literacy skills for Head Start morning and afternoon classrooms within a preschool Response to Intervention (RtI) model. Support included interventions to improve child outcomes for letter naming fluency (LNF) and teachers' instructional and managerial variables. Targeted activities included circle (a group instructional activity) and center times (also instructional time with children's choices about activities). Support for teachers included adding a “letter of the week” activity with active responding and other literacy activities and providing direct feedback to teachers about classroom managerial interactions. Results show positive classwide improvements for LNF, an increase in the prevalence of instruction, and improvements in positive and instructional managerial methods. Social validity appraisal indicated that the support methods were highly acceptable. © 2010 Wiley Periodicals, Inc.

11.
This study was designed to examine the underlying structure of the Children's Playfulness Scale (CPS). The CPS was administered to 602 children who were randomly divided into two groups (calibration and validation group). The calibration group (n = 279) included 137 boys and 142 girls, and the validation group (n = 323) included 162 boys and 161 girls, ranging in age from 4 to 6 years. A one-factor model was postulated and supported. According to the model, 5 variables measuring children's playfulness loaded on one factor (playfulness). In addition, the proposed model was found to be invariant across the two groups. Good cross-generalizability of the CPS appears to support its validity. Educators working in a preschool/kindergarten setting may use it with confidence when evaluating children's playfulness.

12.
The task of validating a teacher assessment and improvement system is similar whether the system operates in the United States or in another country. Chile has a national teacher evaluation system (NTES) that is standards based, uses multiple instruments, and is intended to serve both formative and summative purposes. For the past 6 years the authors have performed validation research on NTES using a variety of methods and data sources. This article describes our validation research agenda, the results of major validation studies, and an integration of the existing evidence, and it offers the authors' preliminary judgment about NTES's validity. The article also offers a critical reflection regarding the decisions taken while driving the long and winding validation road, and the lessons we learned during this politically and methodologically complex journey.

13.
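Before the publication of the Standards for Educational and Psychological Testing (5th edition) in 1985, the core concept of validity research was the "criterion": validity research was regarded as a process of using a criterion to verify a test's validity and to give a valid interpretation of test scores. After 1985, the core concept became "evidence": validity research was regarded as a process of accumulating evidence to support a test's validity and to give a reasonable interpretation of test scores. This understanding of validity is most prominently reflected in the Standards for Educational and Psychological Testing (6th edition), published in 1999. Educational Measurement, compiled jointly under the American Council on Education and the National Council on Measurement in Education, is known in the field as "the Bible of educational measurement". Since the publication of Educational Measurement (4th edition) in 2006, the core concept of validity research has evolved into the "warrant": validity research is regarded as a process of constructing a "system of warrants" and a "network of warrants" to build an argument for validity and to give a plausible interpretation of test scores. Drawing on the author's practical experience in testing, this article introduces these new developments in the concept of validity.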

14.
The paper describes the development and validation of a group test of integrated process skills. The test assesses student performance on a set of twelve objectives related to the generic objective: planning and conducting an investigation. Evidence of content validity, construct validity, and reliability is presented in the paper. A range of generalizability coefficients from 0.77 to 0.98 is reported for specific uses of the 24-item test. Since the items measure performance on objectives that can be readily translated into classroom activity, the test has direct applicability to classroom based research, and evaluation of instruction. In addition to sound psychometric properties, the Test of Integrated Science Processes is distinctive because it includes a set of interrelated, cumulative objectives which reflect autonomous problem solving.

15.
The Standards for Educational and Psychological Testing identify several strands of validity evidence that may be needed as support for particular interpretations and uses of assessments. Yet assessment validation often does not seem guided by these Standards, with validations lacking a particular strand even when it appears relevant to an assessment. Consequently, the degree to which validity evidence supports the proposed interpretation and use of the assessment may be compromised. Guided by the Standards, this article presents an independent validation of OECD's PISA assessment of mathematical self-efficacy (MSE) as an instructive example of this issue. OECD identifies MSE as one of a number of “factors” explaining student performance in mathematics, thereby serving the “policy orientation” of PISA. However, this independent validation identifies significant shortcomings in the strands of validity evidence available to support this interpretation and use of the assessment. The article therefore demonstrates how the Standards can guide the planning of a validation to ensure it generates the validity evidence relevant to an interpretive argument, particularly for an international large-scale assessment such as PISA. The implication is that assessment validation could yet benefit from the Standards as what Zumbo calls “a global force for testing”.

16.
The aim of the present study was to adapt and validate the ISPCAN child abuse screening tool-retrospective version (ICAST-R) in Sri Lanka with a view to investigating the experiences of physical, sexual and emotional abuse during childhood. The adaptation was performed using qualitative research methods with young adults, parents, teachers, and a multidisciplinary group of experts. The translation to Sinhala (the local Sri Lankan dialect) was carried out by a nominal group technique. A multidisciplinary team of experts assessed the Sinhala ICAST-R (SICAST-R) for its content validity. Moreover, acceptability, reliability and construct validity were determined by conducting a validation study among 200 schooling young adults. The principal component analysis (PCA) technique was used to assess the construct validity. Response rates for each item were taken as evidence of acceptability. The internal consistency was assessed by Cronbach’s alpha, and test-retest reliability after two weeks was assessed using Cohen's kappa coefficient. The adaptation of ICAST-R included the introduction of an objective manner by which to measure severity of abuse and the inclusion of a set of questions regarding help-seeking behavior following physical and emotional abusive experiences. The SICAST-R showed adequate content validity and high acceptability, with response rates ranging from 90.3% to 99.5%. The minimum Cohen’s kappa coefficient was 0.76, indicating good test-retest reliability. The internal consistency (Cronbach’s alpha) for the total tool was 0.708, with the three constructs being 0.398, 0.844 and 0.637 for physical, sexual and emotional abuse, respectively. The PCA demonstrated good reproducibility for sexual and emotional abuse with the hypothesized structure. Overall, the SICAST-R showed adequate validity for the assessment of experiences of physical, sexual and emotional abuse during childhood among Sri Lankan young adults.
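
As an illustration of the internal-consistency statistic reported in this abstract, the following is a minimal Python sketch of Cronbach's alpha, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy scores are hypothetical; they are not taken from the SICAST-R study, whose data and software are not given here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration only; assumes complete data
    (no missing responses) and at least two items and two respondents.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (made up): three respondents answering two items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

Values closer to 1 indicate that the items vary together, which is why a subscale alpha such as 0.398 is usually read as weak internal consistency.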

17.
Structural equation modeling (SEM) techniques provide us with excellent tools for conducting preliminary evaluation of differential validity and reliability of measurement instruments among a comprehensive selection of population groups. This article demonstrates empirically an SEM technique for group comparison of reliability and validity. Data are from a study of 495 mothers' attitudes toward pregnancy. Proportions of African American and White, married and unmarried, and Medicaid and non-Medicaid mothers provided sample sizes large enough for group comparisons. Four hypotheses are tested: that factor structures are invariant between subgroups, that factor loadings are invariant between subgroups, that measurement error is invariant between subgroups, and that means of the latent variable are invariant between subgroups. Discussion of item distributions, sample size issues, and appropriate estimation techniques is included.

18.
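An improved rough set comprehensive evaluation method   Cited by: 1 (self-citations: 0, citations by others: 1)
The rough set comprehensive evaluation method based on the discernibility matrix compares the evaluated objects repeatedly, which reduces the efficiency of computing indicator reductions and weights. An improved rough set comprehensive evaluation method is proposed that uses a generalized information table, a variant of the discernibility matrix, to reduce repeated comparisons between objects and to perform indicator reduction and weight assignment more quickly. The feasibility and effectiveness of the method are verified by applying it to the evaluation of government efficiency.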
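
To make the baseline concrete, here is a minimal Python sketch of the classical discernibility matrix that this abstract takes as its starting point. It is not the improved generalized-information-table method, which the abstract does not specify. The function names, attribute names, and toy decision table are all hypothetical. The sketch compares every pair of objects, which is the repeated comparison the abstract identifies as the efficiency problem.

```python
from itertools import combinations


def discernibility_matrix(objects, decisions):
    """Map each object pair with different decisions to the attributes that differ.

    objects: list of dicts (attribute name -> value); decisions: parallel list.
    Hypothetical helper for illustration; requires O(n^2) pairwise comparisons.
    """
    entries = {}
    for (i, a), (j, b) in combinations(enumerate(objects), 2):
        if decisions[i] != decisions[j]:
            entries[(i, j)] = {attr for attr in a if a[attr] != b[attr]}
    return entries


def core(entries):
    """Attributes that appear as singleton entries; they belong to every reduct."""
    return {next(iter(e)) for e in entries.values() if len(e) == 1}


# Toy decision table (made up): two indicators and an overall rating.
toy_objects = [
    {"cost": "low", "speed": "fast"},
    {"cost": "low", "speed": "slow"},
    {"cost": "high", "speed": "slow"},
]
toy_decisions = ["good", "poor", "poor"]

matrix = discernibility_matrix(toy_objects, toy_decisions)
core_attrs = core(matrix)
```

In this toy table only `speed` separates the first two objects, so it cannot be dropped during indicator reduction.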

19.
Multiple traits of language proficiency as well as test method effects were concurrently analyzed to investigate interrelations of construct validity, convergent validity, and discriminant validity using multitrait-multimethod (MTMM) matrices. A total of 585 test takers' scores were derived from the field test of the Pearson Test of English Academic. An MTMM confirmatory factor analysis model was parameterized using 4 traits and 3 assessment methods. The 4 traits included listening, reading, speaking, and integrated skills, while the 3 methods included prescribed multiple-choice responses, constructed responses, and summarized responses. The trait factor loadings were systematically greater than those of methods, providing evidence that the indicators were strongly related to their latent constructs, after adjusting for the method effects. The results showed robust convergent validity, moderate discriminant validity, and insignificant method effects. Implications are discussed.

20.
Educational Assessment, 2013, 18(2): 149-165
Professional measurement standards have evolved during the past 5 decades, creating a more unitary yet nebulous conception of validation. Concurrently, due to the increase of high-stakes testing in public schools, the courts have been forced to rule on the appropriateness of decisions emanating from tests. However, the courts often have failed to apply current validation theory in rendering decisions, preferring the convenience and clarity of earlier perspectives of validity. This rift between validity theory and judicial interpretation threatens to grow into a chasm as more complex views of validation prevail in the profession. Modern measurement practitioners stand astride this chasm in their efforts to implement test validation procedures that are cost effective, legally defensible, and consistent with state-of-the-art theory.
