Similar literature
Found 20 similar documents (search time: 625 ms)
1.
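Through qualitative and quantitative research, this study analyses the validity of fixed-ratio deletion in the traditional cloze test and of the C-test in English examinations in China. The comparison shows that the C-test is superior to the traditional cloze in item writing, scoring, and reliability: it is simple, economical, objective, and highly reliable. The traditional cloze, however, has higher validity than the C-test, and it is recommended as an alternative item type in tests of comprehensive English proficiency. The C-test can be used as a form of vocabulary exercise in classroom practice or vocabulary testing.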

2.
《Africa Education Review》2013,10(3):365-385
Abstract

Are South Africans financially literate, and how can this be measured? Until 2009 there was no South African financial literacy measure and, therefore, the aim was to develop a South African measurement instrument that is scientific, socially acceptable, valid and reliable. To achieve this aim a contextual and conceptual analysis of financial literacy that indicated the importance of financial literacy, the scope and impact of financial literacy education, and uncovered an acceptable financial literacy definition and its constituent concepts, was applied. A rigorous five-step process was then followed in developing a questionnaire that measures financial literacy knowledge, behaviour and attitude. This draft questionnaire was applied at the South African Military Academy (SAMA) to firstly determine and improve its validity and reliability, and secondly to measure the financial literacy levels of school leavers. Experts and users found this measurement instrument to be valid, and internal consistency levels of above .7 registered its reliability. On average the first-year SAMA students achieved scores of 55.55%, 69.85%, and 77.11% for financial literacy knowledge, behaviour and attitude. As a result it is postulated that there is now a scientific and socially relevant, valid and reliable South African financial literacy measurement instrument available.

3.
What are the practical implications of small decreases in reliability coefficients? How does increased item local dependence decrease reliability? How does the new format of more “authentic” reading tests affect reliability?

4.
The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this regard is a measure produced by dividing the standard error of measurement by the test's ‘reliability length’, the latter defined as the maximum possible score minus the most probable score obtainable by blind guessing alone. This, however, can be unsatisfactory with negative marking (formula scoring), as shown by data on 13 negatively marked true/false tests. In these the examinees displayed considerable misinformation, which correlated negatively with correct knowledge. Negative marking can improve test reliability by penalizing such misinformation as well as by discouraging guessing. Reliability measures can be based on idealized theoretical models instead of on test data. These do not reflect the qualities of the test items, but can be focused on specific test objectives (e.g. in relation to cut‐off scores) and can be expressed as easily communicated statements even before tests are written.
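
The ratio described in this abstract can be illustrated with a minimal sketch. It assumes the classical formula SEM = SD × sqrt(1 − reliability), number-right scoring (no negative marking), and multiple-choice items with equally likely options, so that the most probable blind-guessing score is the mode of a binomial distribution. The function names and the example figures are hypothetical and are not taken from the paper.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical SEM: SD * sqrt(1 - reliability coefficient)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return score_sd * math.sqrt(1.0 - reliability)


def most_probable_guess_score(n_items, n_options):
    """Mode of Binomial(n_items, 1/n_options) under blind guessing.

    Assumes one mark per correct item and no negative marking.
    When (n_items + 1) / n_options is an integer there are two modes;
    this returns the larger one.
    """
    return (n_items + 1) // n_options


def normalized_sem(score_sd, reliability, max_score, guess_score):
    """SEM divided by the 'reliability length' (max score - guessing score)."""
    reliability_length = max_score - guess_score
    if reliability_length <= 0:
        raise ValueError("reliability length must be positive")
    return standard_error_of_measurement(score_sd, reliability) / reliability_length


if __name__ == "__main__":
    # Hypothetical test: 40 four-option items, SD of 6 marks, reliability 0.84.
    guess = most_probable_guess_score(40, 4)
    print(f"most probable guessing score: {guess}")
    print(f"SEM: {standard_error_of_measurement(6.0, 0.84):.3f}")
    print(f"SEM / reliability length: {normalized_sem(6.0, 0.84, 40, guess):.4f}")
```

Because the denominator depends only on the test format, a ratio of this kind can be compared across tests with different numbers of items or options, which is the property the abstract is after. The sketch does not cover the negative-marking case that the abstract identifies as problematic.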

5.
Can validity and reliability be taught from the perspective of the decisions classroom teachers make instead of from a more purely psychometric perspective? What aspects of validity and reliability are particularly relevant for classroom teachers?

6.
Recent studies have increasingly favoured contextualisation of religious education (RE) to pupils’ home faith background in spite of current assessment methods that might hinder this. For a multi-religious, multi-ethnic sample of 369 London school pupils aged from 13 to 15 years, this study found that the participatory, transformative and dialogical activities of church visits, computer use and classroom debate improved attitude to RE. It revealed more readiness in girls to apply RE to their own religiosity and particularly negative attitudes to RE in pupils with no religious background. Besides indicating the validity, reliability and unidimensionality of a new short quantitative measure of pupil attitude to RE which acknowledges pupil experience and home context, the findings suggest ways to move beyond ‘banking’ paradigms to which RE remains prone.

7.
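The copula function is introduced into some typical non-repairable series and parallel systems in reliability theory in order to measure system reliability when the components are dependent. An improved measure of system reliability under dependence is given, and the relative merits of system reliability under independence and under dependence are analysed theoretically.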
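
As a rough illustration of the idea in this abstract, the sketch below compares a two-component series system and a two-component parallel system under independence and under dependence. It assumes the copula is applied to the component reliabilities (survival probabilities at a fixed time), and it uses the Clayton family purely as an example of a positively dependent copula. The abstract does not say which copula family the paper uses, and all function names and numbers here are hypothetical.

```python
def independence_copula(u, v):
    """Product copula: the joint survival probability of independent components."""
    return u * v


def clayton_copula(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0.

    Used here only as an illustrative model of positive dependence.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    if u == 0.0 or v == 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def series_reliability(r1, r2, joint_survival):
    """A series system works only if both components survive."""
    return joint_survival(r1, r2)


def parallel_reliability(r1, r2, joint_survival):
    """A parallel system works if at least one component survives."""
    return r1 + r2 - joint_survival(r1, r2)


if __name__ == "__main__":
    r1, r2 = 0.9, 0.8  # hypothetical component reliabilities at a fixed time

    def dependent(u, v):
        return clayton_copula(u, v, theta=2.0)

    for name, joint in [("independent", independence_copula), ("dependent", dependent)]:
        print(
            f"{name}: series={series_reliability(r1, r2, joint):.4f}, "
            f"parallel={parallel_reliability(r1, r2, joint):.4f}"
        )
```

With a positively dependent copula such as this one, the series system comes out more reliable than the independence assumption predicts and the parallel system less reliable, which is the kind of comparison between the independent and dependent cases that the abstract describes.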

8.
How can we combine a multiple‐choice assessment with a performance assessment to yield a single score? What alternatives are there for weighting components of a test? What effect do reliability and validity have in component weighting?

9.
In repeated measure studies with unidimensional scales, measurement invariance, and specificity stability over time, the specificity variance in each instrument component can be identified. This article describes for that setting an improved point and interval estimation procedure for the maximal reliability coefficient associated with a given set of homogeneous measures. The method is developed within the framework of latent variable modeling and can also be readily used in longitudinal studies for improved point and interval estimation of individual measure reliability and scale reliability at each assessment occasion. The procedure is based on empirically testable conditions and is illustrated with an example.

10.
Undergraduate grade point average (GPA) is a commonly employed measure in educational research, serving as a criterion or as a predictor depending on the research question. Over the decades, researchers have used a variety of reliability coefficients to estimate the reliability of undergraduate GPA, which suggests that there has been no consensus on the most appropriate model. This paper reviews the assumptions of different reliability models and examines the effect of violating these assumptions on reliability estimates of GPA. Using longitudinal semester GPA data for 62,122 students from 26 four-year institutions, the reliability estimates for semester, annual, and fourth-year cumulative GPA ranged between .60–.65, .75–.79, and .89–.92, respectively. Depending on the measure, up to eight different reliability coefficients were estimated. In general, different estimates resulted in minor differences even when the assumptions of the underlying models are not met; however, larger differences were observed for the fourth-year cumulative GPA analyses.

11.
What components should be included in a portfolio of student work? How can the contents of portfolios be scored? How is reliability affected as the number of entries in the portfolio increases? What will it cost to obtain reliable scores?

12.
This article provides a review of some important milestones in the history of reliability, some current issues concerning reliability, and some likely prospects for reliability, from the perspective of one central question: “What constitutes a replication of a measurement procedure?” Special attention is given to the fixed/random aspects of facets that characterize replications.

13.
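Testing the reliability and validity of evaluation indicators is an important step in ensuring the effectiveness of an online student evaluation-of-teaching system. To test the reliability of the measurement scale and the degree to which it measures the intended traits, a sample of 31,295 valid cases covering four departments and twelve majors was obtained. Correlation analysis and principal component analysis were used to test the reliability and validity of the measurement scale for the student evaluation indicator system. The study found that the reliability of the scale is very high but its validity is unsatisfactory. To improve the reliability and validity of the indicator system, it recommends establishing a dedicated body to check the validity of the indicator system regularly and update it as appropriate, and adding second-level indicators and open-ended evaluation options.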

14.
In recent years nonverbal immediacy has received considerable attention from researchers concerned with instructional communication, interpersonal communication, and organizational communication. Unfortunately, the instruments used to measure nonverbal immediacy in these contexts sometimes have been problematic in terms of their reliability estimates. This research attempted to overcome this problem, or failing that, to identify the cause(s) of the reduced reliability. The research resulted in a scale with high reliability when used as either a self‐report or an other‐report measure. It was also found to be equally reliable across the contexts of instructional, interpersonal, and organizational communication. Content validity of the scale is good and an initial test of predictive validity produced a high validity correlation. Unexpected sex differences were observed in the results and these are discussed in this report.

15.
ABSTRACT

Ageism is a problem in aging societies. Clinical psychologists and undergraduate psychology students have shown negative attitudes toward older adults. However, no specific measure against ageist myths in the psychotherapeutic context is available. This study aims to develop and present the psychometric properties of the Ageist Myths about Psychotherapy Questionnaire (AMPQ).

These issues were examined by surveying 222 psychology graduates at higher education institutions about their attitudes and behaviors concerning psychotherapy with older adults, negative stereotypes toward aging, and attitudes toward dementia.

Using principal components analysis, 10 items were retained and one factor was obtained with an acceptable reliability index. Significant associations were found between the AMPQ and negative stereotypes toward aging, and attitudes toward dementia.

Results revealed that universities and colleges with psychology programs have an ageist bias. Implications for college formation in aging, and older adults with mental health problems, are discussed and presented.

16.
A decision-theoretic approach to the question of reliability in categorically scored examinations is explored. The concepts of true scores and errors are discussed as they deviate from conventional psychometric definitions, and measurement error in categorical scores is cast in terms of misclassifications. A reliability measure based on proportional reduction in loss (PRL) is then presented and exemplified with data from a large-scale assessment. The link between the PRL approach and the classical conception of reliability is discussed. Some design considerations for reliability studies are also discussed.

17.
How can the contributions of raters and tasks to error variance be estimated? Which source of error variance is usually greater? Are interrater coefficients adequate estimates of reliability? What other facets contribute to unreliability in performance assessments?

18.
黄玉麒  黄芳 《海外英语》2014,(21):91-92
As a subjective test, the writing test has acceptable validity. What about its reliability? The writing test occupies a special position in the senior high school entrance examination (SHSEE for short), so it is important to ensure its reliability. Based on an analysis of the English writing items in recent years' SHSEE papers, the authors offer suggestions on how to guarantee the reliability of writing tests.

19.
The Social Skills Rating System (SSRS; F.M. Gresham & S.N. Elliott, 1990) is a norm‐referenced measure of students' social and problem behaviors. Since its release, much of the published reliability and validity evidence for the SSRS has focused primarily on the Teacher Report Form. The purpose of this study was to explore reliability and validity evidence of scores on the SSRS‐Student Elementary Form (SSRS‐SEF) for children in Grades 3 to 5. Findings provided support for the use of Total scale as a measure of student social behavior for initial screening purposes; however, evidence for the subscales was not as strong as predicted. Directions for future research regarding reliability and validity of scores from the SSRS‐SEF are discussed. © 2005 Wiley Periodicals, Inc. Psychol Schs 42: 345–354, 2005.

20.
This research note reports an attempt to develop a measure of higher education teachers' repertoire of teaching methods. It summarises: the rationale for wanting such a measure; the stages of development of the inventory; data from the use of the inventory with 141 teachers; relationships between TMI scale scores and scale scores on two other instruments, the ATI and SEEQ; and problems with the inventory and proposals for the development of an alternative way of measuring repertoire.
