
1.

Quality Control for Scoring Tests Administered in Continuous Mode: An NCME Instructional Module

Michal Baumer 《Educational Measurement》2017,36(1):58-68

Quality control (QC) in testing is paramount. QC procedures for tests can be divided into two types. The first type, one that has been well researched, is QC for tests administered to large population groups on few administration dates using a small set of test forms (e.g., large‐scale assessment). The second type is QC for tests, usually computerized, that are administered to small population groups on many administration dates using a wide array of test forms (CMT—continuous mode tests). Since the world of testing is headed in this direction, developing QC for CMT is crucial. In the current ITEMS module we discuss errors that might occur at the different stages of the CMT process, as well as the recommended QC procedure to reduce the incidence of each error. Illustration from a recent study is provided, and a computerized system that applies these procedures is presented. Instructions on how to develop one's own QC procedure are also included.

2.

Standardisation and diversity in international assessments: barking up the wrong tree?

César Guadalupe 《Critical Studies in Education》2017,58(3):326-340

ABSTRACT

This article organises potential areas of criticism or challenges embedded in the design and administration of standardised assessments of learning levels in order to promote dialogue and research on educational assessments. The article begins by addressing debates around epistemological claims: issues that pertain to testing in general and issues that are particular to standardised testing. Then, it addresses some political attributes of international tests so as to situate the debates beyond feasibility, attributes and scope-related issues. The article claims that the field of education testing has identified a number of issues and challenges stemming from diversity, and has developed methods and procedures to address many of them. From this viewpoint, testing is just like any other domain of scientific enquiry. However, international assessments of learning outcomes are not necessarily, or primarily, scientific endeavours; they are political devices and therefore should be scrutinised considering scientific attributes as well as some political features that, even if intertwined with technicalities, go well beyond them. Thus, critiques of international assessments would be better framed if their political attributes are taken as organising principles of the criticism, alongside those elements that pertain to their technical attributes, since these are not incidental but deeply interlinked.

3.

Online Testing: Opportunities and Challenges

Yi Qing, Yang Zhiming 《考试研究》2005,(3)

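Online testing is a form of computer-based testing delivered over a network. This article offers a preliminary discussion of the characteristics, modes, and scope of application of such testing, its use in the United States, and the opportunities and challenges it brings to China.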

4.

Filling in the Blanks

《Computers in the Schools》2013,30(1-2):157-168

Abstract

This paper reviews the history of technology and testing. The role and functions of computers in education have become more varied, from drill and practice to simple tutorials to WebQuests. However, one important aspect of teaching for which the computer is ideally suited, achievement testing, is often overlooked. While it is not difficult to envision computers administering and scoring tests, there is also learning that occurs when tests are administered by a computer. Advantages and disadvantages of computer-based testing are also examined. Finally, Type II applications are explored.

5.

Revising the Online Classroom: Usability Testing for Training Online Technical Communication Instructors

Joseph Bartolotta Tiffany Bourelle Julianne Newmark 《Technical Communication Quarterly》2017,26(3):287-299

ABSTRACT

This article reports on an effort by the authors to use usability testing as a component of online teacher training for their multimajor technical communication course. The article further explains the ways in which program administrators at other institutions can create their own usability testing protocols for formative online teacher training in course design and in principles of user-centered design.

6.

What is computer-based testing washback, how can it be evaluated and how can this support practitioner research?

《Journal of Further & Higher Education》2012,36(9):1255-1270

ABSTRACT

With the introduction of a new initiative in a teaching and learning environment there is an ethical responsibility to consider whether the impact of the introduction has met its intended goals, and whether it has harmed those who are influenced by it. Technology and infrastructure developments have encouraged a continued growth in the development and introduction of computer-based tests (CBTs) in educational environments. In the educational assessment literature, enquiry into the impact of testing (of all types) is known as ‘washback’. This is a reference to the way in which a test might have a range of influences on learners and teachers prior to the test-taking event. This article reviews the literature on CBT washback and outlines a framework for studying its effects as it is introduced into educational contexts. We then outline a research framework that we have developed (based on the literature) that can be used to evaluate CBT washback. We go on to argue that, to fulfil its potential in supporting the development of change, the research framework needs to act as a mediating device that brings together teaching-practitioner and researcher perspectives. The framework that we propose conceptualises the nature of washback in CBT contexts, as well as the research process and the methods required to understand it. This framework provides an element of common ground between practitioners (i.e. teachers who are involved in a CBT development process) and external researchers, and supports collaboration at three distinct levels.

7.

Preschool/Kindergarten teachers’ conceptions of standardised testing

Niek Frans W. J. Post C. E. Oenema-Mostert A. E. M. G. Minnaert 《Assessment in Education: Principles, Policy & Practice》2020,27(1):87-108

ABSTRACT

Standardised tests play an important role in early childhood (EC) education in many countries. Although teachers’ conceptions largely determine whether and how these instruments are used, research on this topic is scarce. As a result, factors that influence conceptions of standardised testing have remained largely unexplored. To examine teachers’ conceptions of standardised testing and aspects that may influence these conceptions, Brown’s CoA-III-A questionnaire was distributed to 97 EC educators. Based on their responses, a selection of six preschool/kindergarten teachers participated in a series of semi-structured interviews. Analyses of the questionnaire and the interviews indicated that the teachers did not see these tests solely as instruments for accountability or improvement. While some perceived the test as pleasant confirmation, others perceived the results as negative opposition to their own observations. The teachers’ conceptions were influenced by classroom population, management team, and the ascribed purpose of the test.

8.

High stakes testing and teacher access to professional opportunities: lessons from Indonesia

Ashadi Ashadi 《Journal of Education Policy》2013,28(6):727-741

Abstract

High-stakes testing regimes, in which schools are judged on their capacity to attain high student results in national tests, are becoming common in both developed and developing nations, including the United States, Britain and Australia. However, while there has been substantial investigation around the impact of high-stakes testing on curriculum and pedagogy, there has been very little research looking at the impact on teachers’ professional opportunities. The current project used a case study approach to examine the impact a high-stakes national testing programme had on teachers’ access to professional learning and their teaching allocations in four Indonesian public schools. It found that better qualified teachers were allocated to classes that would be sitting for the national examinations, and that these teachers were given much more access to professional learning opportunities than those teaching non-examined year levels. This in turn impacted negatively on the staff morale of less qualified teaching staff and potentially on their career trajectories. Findings suggest that school leaders should be wary of targeting better qualified and/or more experienced staff to year levels sitting for high-stakes tests, as this may lead to staff stratification within schools, limiting opportunities for staff to learn from one another and reducing the morale of less qualified and less experienced staff. They also add support to a substantial body of research that suggests policy-makers should be wary of the flow-on effects of using performance in high-stakes tests as the key means of judging school effectiveness.

9.

Evaluating the Comparability of Paper‐ and Computer‐Based Science Tests Across Sex and SES Subgroups

Jennifer Randall Stephen Sireci Xueming Li Leah Kaira 《Educational Measurement》2012,31(4):2-12

As access to and reliance on technology continue to increase, so does the use of computerized testing for admissions, licensure/certification, and accountability exams. Nonetheless, full computer‐based test (CBT) implementation can be difficult due to limited resources. As a result, some testing programs offer both CBT and paper‐based test (PBT) administration formats. In such situations, evidence that scores obtained from different formats are comparable must be gathered. In this study, we illustrate how contemporary statistical methods can be used to provide evidence regarding the comparability of CBT and PBT scores at the total test score and item levels. Specifically, we looked at the invariance of test structure and item functioning across test administration modes and across subgroups of students defined by SES and sex. Multiple replications of both confirmatory factor analysis and Rasch differential item functioning analyses were used to assess invariance at the factorial and item levels. Results revealed a unidimensional construct with moderate statistical support for strong factorial‐level invariance across SES subgroups, and moderate support of invariance across sex. Issues involved in applying these analyses to future evaluations of the comparability of scores from different versions of a test are discussed.

10.

Empirical Considerations on Intelligence Testing and Models of Intelligence: Updates for Educational Measurement Professionals

Kurt F. Geisinger 《Applied Measurement in Education》2019,32(3):193-197

ABSTRACT

This brief article introduces the topic of intelligence as highly appropriate for educational measurement professionals. It describes some of the uses of intelligence tests both historically and currently. It argues why knowledge of intelligence theory and intelligence testing is important for educational measurement professionals. The articles that follow in this special issue will provide readers with considerable information about the history of intelligence theory and testing, and especially of the Cattell-Horn-Carroll (CHC) model of testing and its implementation. The following articles will also provide a well-reasoned approach to the way science should work in evaluating tests and the models on which they are based.

11.

Conducting statistical tests with data from clustered school samples

《International Journal of Research & Method in Education》2013,36(2):113-124

This article discusses issues associated with statistical testing conducted with data from clustered school samples. Empirical researchers often conduct tests of statistical inference on sample data to ascertain the extent to which differences exist within groups in the population. Typically, much school‐related data are collected from students. These data are hierarchical because students are nested within classes within schools. This article studies the influence of this nesting on tests of statistical significance conducted with the student as the unit of analysis. Theory that adjusts F‐test scores for nested data in multi‐group comparisons is presented and applied to a teacher interaction dataset. The article demonstrates the potential impact of data hierarchy on the results of statistical testing if clustering is ignored. Data analysis techniques that recognize the clustering of students in classes are essential, and it is recommended that either multilevel analysis or adjustments to statistical parameters be undertaken in studies involving nested data.

12.

A Preliminary Methodological Verbal Computer Content Analysis Study of Preschool Black Children

《The Journal of educational research》2012,105(6):236-240

Abstract

The contentive verbal language of groups of three-, four-, and five-year-old black children was contrasted by utilizing the computerized General Inquirer System and the associated Harvard III Psychosociological Dictionary. Significant differences were found between groups in some contentive categories. Computerized verbal content analysis appears to be a fruitful method of testing and studying theoretical issues and in conducting empirical educational research.

13.

Mother tongues: the Opt Out movement’s vocal response to patriarchal, neoliberal education reform

Stephanie Schroeder Elizabeth Currin Todd McCardle 《Gender and education》2018,30(8):1001-1018

ABSTRACT

This article explores the widespread and growing public backlash against high-stakes standardised testing in the United States, following the parent-led Opt Out movement’s quest to dismantle neoliberal educational policy by coaching children to boycott standardised tests. We analyse how our participants, mothers and female teachers in Opt Out Florida, use Facebook group pages as on-going critical sites of consciousness development where connected learning, knowing, and action occur. We illustrate how our participants, perceiving their children’s teachers as muzzled by neoliberal, patriarchal education reform, banded together to collectively attack a corporatised and violent system of American public education. Our focus on the role of mothers, their defence of teachers, and their attack on patriarchal neoliberalism fits within the larger history of the feminisation of the teaching profession and reveals how mothers in the domestic sphere have organised to wrest teaching from neoliberal reformers.

14.

Reliability, validity, and all that jazz

Dylan Wiliam 《Education 3-13》2013,41(3):17-21

Summary

In this article my purpose has not been to indicate what kinds of things can and can't be assessed appropriately with tests. Rather, I have tried to illuminate how the key ideas of reliability and validity are used by test developers and what this means in practice — not least in terms of the decisions that are made about individual students on the basis of their test results. As I have stressed throughout this article, these limitations are not the fault of test developers. However inconvenient these limitations are for proponents of school testing, they are inherent in the nature of tests of academic achievement, and are as real as rocks. All users of the results of educational tests must understand what a limited technology this is.

15.

Learning outdoors or with a computer: the contribution of the learning setting to learning and to environmental perceptions

Ester Aflalo Revital Montin Ayala Raviv 《Research in Science & Technological Education》2020,38(2):208-226

ABSTRACT

Background: Outdoor learning and computer-based learning are two different alternatives to in-class conventional teacher-centered learning.

Purpose: This study compares the outdoor learning setting with computer-based learning in class. It examines the influence of the two different learning settings on academic achievements, the learning experience, and pro-environmental perceptions.

Sample: A total of 90 elementary school students (third and fourth-grade classes) participated in the study.

Design and methods: The academic knowledge of the study participants was tested through identical exams for both learning settings. In addition, in each group the students’ perceptions were examined by means of a questionnaire about environmental values and the learning experience.

Results: The study demonstrates that academic achievements in the two settings were similar, but the students expressed more enthusiasm about the outdoor learning experience than about in-class learning. In addition, the outdoor learning setting contributed more to promoting positive environmental perceptions even though students did not learn directly about environmental issues and sustainability.

Conclusions: These findings suggest that learning in the natural environment is valuable: Alongside the fostering of computerized learning, it is also important to promote outdoor learning settings and integrate both settings by implementing mobile technologies in the outdoor teaching.

16.

Mental testing and educational streaming in Ontario and Denmark in the early twentieth century: a comparative and transnational perspective

Patrice Milewski Christian Ydesen 《Paedagogica Historica: International Journal of the History of Education》2019,55(3):371-390

ABSTRACT

This article compares and contrasts the use of mental testing and the formation of educational streaming in Denmark and Ontario during the interwar years. In this sense, the article adds nuances to the meaning of internationalism as well as contributing to our knowledge about how ideas of testing practices circulated among countries and continents. One way ideas and practices circulated was via informal networks promoted by the education traveller. Key proponents of mental testing in both Denmark and Ontario travelled to continental Europe, England, and the United States studying and observing the practices and institutional arrangements associated with educational streaming. Our main findings are that the processes used to implement mental testing in the two countries differed significantly. Mental testing was implemented much later in Denmark than in Ontario. This was due to different contextual, cultural, and historical factors that promoted changes to the existing system, or, alternatively, represented a barrier or even obstructed changes to it. Nevertheless, mental testing was implemented in both education systems as a relatively coherent technology rooted in transnational movements and exchange, but was attended by highly different practices and local meaning-making.

17.

Modeling Change in Effort Across a Low-Stakes Testing Session: A Latent Growth Curve Modeling Approach

Carol L. Barry Sara J. Finney 《Applied Measurement in Education》2013,26(1):46-64

ABSTRACT

We examined change in test-taking effort over the course of a three-hour, five test, low-stakes testing session. Latent growth modeling results indicated that change in test-taking effort was well-represented by a piecewise growth form, wherein effort increased from test 1 to test 4 and then decreased from test 4 to test 5. There was significant variability in effort for each of the five tests, which could be predicted from examinees’ conscientiousness, agreeableness, mastery approach goal orientation, and whether the examinee “skipped” or attended the initial testing session. The degree to which examinees perceived a particular test as important was related to effort for the difficult, cognitive test but not for less difficult, noncognitive tests. There was significant variability in the rates of change in effort, which could be predicted from examinees’ agreeableness. Interestingly, change in test-taking effort was not related to change in perceived test importance. Implications of these results for assessment practice and directions for future research are discussed.

18.

Hispanic-Serving Community Colleges

Lee Waller Herlinda M. Glasscock Ronnie L. Glasscock Patsy J. Fulton-Calkins 《Community College Journal of Research & Practice》2013,37(5-6):463-478

The article examines student tuition, ad valorem property taxes, and state appropriations utilizing a revenue-per-contact-hour model to identify disparities in the Texas community college funding mechanism. Methodology is presented to identify differences between and among Caucasian-serving, African-American-serving, Hispanic-serving, and other public community colleges. The Statistical Package for the Social Sciences (SPSS) was utilized to conduct multiple-factor analysis of variance (ANOVA) on the data set. The statistical testing utilized a significance level of 0.05. Post hoc tests were performed where necessary.

19.

Computerized Adaptive and Fixed-Item Testing of Music Listening Skill: A Comparison of Efficiency, Precision, and Concurrent Validity

Walter P. Vispoel Tianyou Wang Timothy Bleiler 《Journal of Educational Measurement》1997,34(1):43-63

We evaluated the efficiency, precision, and concurrent validity of results obtained from adaptive and fixed-item music listening tests in three studies: (a) a computer simulation study in which each of 2,200 simulees completed a computerized adaptive tonal memory test, a computerized fixed-item tonal memory test constructed from items in the adaptive test pool and two standardized group-administered tonal memory tests; (b) a live testing study in which each of 204 examinees took the computerized adaptive test and the standardized tests; and (c) a live testing study in which randomly equivalent groups took either the computerized adaptive test (n = 86) or the computerized fixed-item test (n = 86). The adaptive music test required 50% to 93% fewer items to match the reliability and concurrent validity of the fixed-item tests, and it yielded higher levels of reliability and concurrent validity than the fixed-item tests when test length was held constant. These findings suggest that computerized adaptive tests, which typically have been limited to visually produced items, may also be well suited for measuring skills that require aurally produced items.

20.

Team-Based Testing Improves Individual Learning

Jane S. Vogler Daniel H. Robinson 《Journal of Experimental Education》2016,84(4):787-803

In two experiments, 90 undergraduates took six tests as part of an educational psychology course. Using a crossover design, students took three tests individually without feedback and then took the same test again, following the process of team-based testing (TBT), in teams in which the members reached consensus for each question and answered until they were correct. Students took the other three tests individually with feedback. All students were individually tested over a portion of this content two weeks later and again after two months. Independent samples t tests revealed that TBT students scored higher when retested two months later than those who took the test individually. Finally, three-fourths of the students reported that they enjoyed TBT more than individual testing. Although TBT requires more class time to administer, it appears to be beneficial for long-term student learning.