Similar articles
1.
Student performance in and attitudes towards oral and written assessments were compared using quantitative and qualitative methods. Two separate cohorts of students were examined. The first larger cohort of students (n = 99) was randomly divided into ‘oral’ and ‘written’ groups, and the marks that they achieved in the same biology questions were compared. Students in the second smaller cohort (n = 29) were all examined using both written and oral questions concerning both ‘scientific’ and ‘personal development’ topics. Both cohorts showed highly significant differences in the mean marks achieved, with better performance in the oral assessment. There was no evidence of particular groups of students being disadvantaged in the oral tests. These students and also an additional cohort were asked about their attitudes to the two different assessment approaches. Although they tended to be more nervous in the face of oral assessments, many students thought oral assessments were more useful than written assessments. An important theme involved the perceived authenticity or ‘professionalism’ of an oral examination. This study suggests that oral assessments may be more inclusive than written ones and that they can act as powerful tools in helping students establish a ‘professional identity’.

2.
The purpose of the investigation was to determine whether and how the quality of students’ explanations of chemical phenomena was affected by changing the method of giving the question and answer, respectively, between the spoken and written formats. It also focused on these effects for different questions and different students. Two experiments used the same population of first‐year university students who attempted the same set of questions, variously presented and answered in each of the four different ways of combining these two methods of questioning and answering. From the results, it was concluded that, for these students and questions, there was no observable difference between the performance of the students using any of these combinations of formats.


3.
Practical examinations in anatomy are usually conducted on specimens in the anatomy laboratory (referred to here as the “traditional” method). Recently, we have started to administer similar examinations online using the quiz facility in Moodle. In this study, we compare student scores between two assessment environments, namely online and traditional. We hypothesized that regardless of the examination medium (traditional or online) overall student performance would not be significantly different. For the online medium, radiological images, prosected specimens, and short video clips demonstrating muscle action were first acquired from resources used for teaching during anatomy practical classes. These were optimized for online viewing and then uploaded onto Moodle learning management software. With regard to the traditional format, actual specimens were usually laid out in a circular stream. Identification tags were then attached to specific spots on the specimens and questions asked regarding those identified spots. A cohort of students taking practical examinations in six courses was studied. The courses were divided into three pairs with each pair credit‐weight matched. Each pair consisted of a course where the practical examination was conducted online and the other in the traditional format. There was no significant difference in the mean scores within each course pair. In addition, a significant positive correlation between score in traditional and online formats was found. We conclude that mean grades in anatomy practical examination conducted either online or in the traditional format were comparable. These findings should reassure teachers intending to use either format for their practical examinations. Anat Sci Educ. © 2011 American Association of Anatomists.

4.
The purpose of this study was to determine the effects of teachers using the Question Exploration Routine (QER) in regularly scheduled secondary‐level English Language Arts classes to help students answer questions about the development and use of main ideas in Shakespeare's Romeo and Juliet. Questions were posed in both multiple‐choice and written formats. On average, students representing diverse groups who received instruction with the QER correctly answered a significantly higher percentage of total questions than students receiving traditional lecture‐discussion instruction (effect sizes were very large). Results are reported for total scores, subscores of multiple‐choice and written responses, and for responses of subgroups of students. Students’ mean satisfaction and confidence responses are reported.

5.
The effects of synchronous and asynchronous lectures and interaction formats were examined with graduate business students in on‐campus and off‐campus MBA programs. The dependent variables were scores on exam questions, and learning styles and cognitive styles were used as covariates. The results indicated significant differences for discussion and lecture format and for on‐campus and off‐campus students. The results were discussed relative to learning in electronic environments.

6.
Open-ended counterparts to a set of items from the quantitative section of the Graduate Record Examination (GRE-Q) were developed. Examinees responded to these items by gridding a numerical answer on a machine-readable answer sheet or by typing on a computer. The test section with the special answer sheets was administered at the end of a regular GRE administration. Test forms were spiraled so that random groups received either the grid-in questions or the same questions in a multiple-choice format. In a separate data collection effort, 364 paid volunteers who had recently taken the GRE used a computer keyboard to enter answers to the same set of questions. Despite substantial format differences noted for individual items, total scores for the multiple-choice and open-ended tests demonstrated remarkably similar correlational patterns. There were no significant interactions of test format with either gender or ethnicity.

7.
The mathematics achievement of a cohort of 955 students in 42 classes in six schools in London was followed over a 4‐year period, until they took their General Certificate of Secondary Education examinations (GCSEs) in the summer of 2000. All six schools were regarded by the Office for Standards in Education (Ofsted) as providing a good standard of education, and all were involved in teacher training partnerships with universities. Matched data on Key Stage 3 test scores and GCSE grades were available for 709 students, and these data were analysed in terms of the progress from Key Stage 3 test scores to GCSE grades. Although there were wide differences between schools in terms of overall GCSE grades, the average progress made by students was similar in all six schools. However, within each school, the progress made during Key Stage 4 varied greatly from set to set. Comparing students with the same Key Stage 3 scores, students placed in top sets averaged nearly half a GCSE grade higher than those in the other upper sets, who in turn averaged a third of a grade higher than those in lower sets, who in turn averaged around a third of a grade higher than those students placed in bottom sets. In the four schools that used formal whole‐class teaching, the difference in GCSE grades between top and bottom sets, taking Key Stage 3 scores into account, ranged from just over one grade at GCSE to nearly three grades. At the schools using small‐group and individualized teaching, the differences in value‐added between sets were not significant. In two of the schools, a significant proportion of working‐class students were placed into lower sets than would be indicated by their Key Stage 3 test scores.

8.
Responses to a 40-item test were simulated for 150 examinees under free-response and multiple-choice formats. The simulation was replicated three times for each of 30 variations reflecting format and the extent to which examinees were (a) misinformed, (b) successful in guessing free-response answers, and (c) able to recognize with assurance correct multiple-choice options that they could not produce under free-response testing. Internal consistency reliability (KR20) estimates were consistently higher for the free-response score sets, even when the free-response item difficulty indices were augmented to yield mean scores comparable to those from multiple-choice testing. In addition, all test score sets were correlated with four randomly generated sets of unit-normal measures, whose intercorrelations ranged from moderate to strong. These measures served as criteria because one of them had been used as the basic ability measure in the simulation of the test score sets. Again, the free-response score sets yielded superior results even when tests of equal difficulty were compared. The guessing and recognition factors had little or no effect on reliability estimates or correlations with the criteria. The extent of misinformation affected only multiple-choice score KR20's (more misinformation yielded higher KR20's). Although free-response tests were found to be generally superior, the extent of their advantage over multiple-choice was judged sufficiently small that other considerations might justifiably dictate format choice.
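The KR20 estimates mentioned in this abstract refer to the Kuder-Richardson Formula 20 for dichotomously scored items. As a minimal sketch only, and not the authors' simulation code, the coefficient can be computed as below. The function name `kr20`, the 0/1 response matrix layout (rows are examinees, columns are items), and the choice of population variance for the total scores are all assumptions made for illustration; some texts use the sample variance instead, which gives slightly different values.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    `responses` is a list of rows, one per examinee; each row holds one
    0/1 score per item. Hypothetical helper, written for illustration.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])

    # Proportion of examinees answering each item correctly (p_i).
    p = [sum(row[i] for row in responses) / n_examinees for i in range(n_items)]

    # Sum of item variances p_i * q_i, with q_i = 1 - p_i.
    item_var_sum = sum(pi * (1 - pi) for pi in p)

    # Variance of total scores; population variance is an assumption here.
    total_var = pvariance([sum(row) for row in responses])

    return (n_items / (n_items - 1)) * (1 - item_var_sum / total_var)
```

The sketch raises `ZeroDivisionError` when every examinee has the same total score, since the coefficient is undefined in that case.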

9.
Increasing numbers of universities are offering courses in online and hybrid formats. One challenge in online assessment is the maintenance of academic integrity. We present a thorough statistical analysis to uncover differences in student performance when online exams are administered in a proctored environment (i.e., in class) versus an unproctored environment (i.e., offsite). Controlling for student grade point average (GPA), no significant differences in mean overall course performance or exam performance between the two groups were found, nor were there any differences in the mean vectors of individual exam scores. The study reveals that the group taking online exams in the unproctored environment has significantly more variation in their performance results. In examining potential causes of the greater variation, analyses were performed to assess whether an increased level of possible cheating behavior could be observed from performance results for students in the unproctored section. No evidence of cheating behavior was found.

10.
Collaborative testing has been shown to improve performance but not always content retention. In this study, we investigated whether collaborative testing could improve both performance and content retention in a large, introductory biology course. Students were semirandomly divided into two groups based on their performances on exam 1. Each group contained equal numbers of students scoring in each grade category (“A”–“F”) on exam 1. All students completed each of the four exams of the semester as individuals. For exam 2, one group took the exam a second time in small groups immediately following the individually administered test. The other group followed this same format for exam 3. Individual and group exam scores were compared to determine differences in performance. All but exam 1 contained a subset of cumulative questions from the previous exam. Performances on the cumulative questions for exams 3 and 4 were compared for the two groups to determine whether there were significant differences in content retention. Even though group test scores were significantly higher than individual test scores, students who participated in collaborative testing performed no differently on cumulative questions than students who took the previous exam as individuals.

11.
In recent years, colleges have been moving from traditional, classroom‐based student evaluations of instruction to online evaluations. Because of the importance of these evaluations in decisions regarding retention, promotion and tenure, instructors are justifiably concerned about how this trend might affect their ratings. We recruited faculty members who were teaching two or more sections of the same course in a single semester and assigned at least one section to receive online evaluations and the other section(s) to receive classroom evaluations. We hypothesised that the online evaluations would yield a lower response rate than the classroom administration. We also predicted that there would be no significant differences in the overall ratings, the number of written comments, and the valence (positive/neutral/negative) of students’ comments. A total of 32 instructors participated in the study over two semesters, providing evaluation data from 2057 students. As expected, online evaluations had a significantly lower response rate than classroom evaluations. Additionally, there were no differences in the mean ratings, the percentage of students who provided written comments or the proportion of comments in the three valence categories. Thus, even with the lower response rate for online evaluations, the two administration formats seemed to produce comparable data.

12.
Implementation of learning outcomes in universities has not been seamless: their alignment with assessments requires faculty to specify learning outcomes and show transparent evaluation of this alignment. Difficulties arise transitioning from grading within a range of pre-existing assessment formats to grading common learning outcomes across those different formats. Experiencing this, we performed a quantitative evaluation of students completing an academic literacy course. We aimed to see if student achievement was determined by assessment format or by learning outcomes, and then to identify whether achievement of a learning outcome was equivalent across assessment formats. A two-step cluster analysis identified three clusters of students: high, medium, and low achievers. The strongest predictors of cluster membership were learning outcomes assessed in the written essay; however, scores for each learning outcome when assessed across assessment formats correlated poorly. Faculty should ensure consistent standards in learning outcomes achievement when assessed across different formats, or clearly separate learning outcomes.

13.
This article presents the findings from two research studies. In Study I, 21 counseling students were given either a written standard model, written serial model, or a videotape model of how to ask tacting questions. While there were no differences between the written and video models, significant multivariate differences were found between the two forms of the written models. In Study II, 24 counseling students received either the written or video model and then were assessed either by orally responding or by writing reflections of feeling responses to a series of client vignettes. Once again, no differences were found for mode of model presentation. Students who responded in writing, however, outperformed those who responded orally. The discussion focuses on the implications these two studies have for counselor education.

14.
Bloom's taxonomy was adopted to create a subject‐specific scoring tool for histology multiple‐choice questions (MCQs). This Bloom's Taxonomy Histology Tool (BTHT) was used to analyze teacher‐ and student‐generated quiz and examination questions from a graduate level histology course. Multiple‐choice questions using histological images were generally assigned a higher BTHT level than simple text questions. The type of microscopy technique (light or electron microscopy) used for these image‐based questions did not result in any significant differences in their Bloom's taxonomy scores. The BTHT levels for teacher‐generated MCQs correlated positively with higher discrimination indices and inversely with the percent of students answering these questions correctly (difficulty index), suggesting that higher‐level Bloom's taxonomy questions differentiate well between higher‐ and lower‐performing students. When examining BTHT scores for MCQs that were written by students in a Multiple‐Choice Item Development Assignment (MCIDA) there was no significant correlation between these scores and the students' ability to answer teacher‐generated MCQs. This suggests that the ability to answer histology MCQs relies on a different skill set than the aptitude to construct higher‐level Bloom's taxonomy questions. However, students significantly improved their average BTHT scores from the midterm to the final MCIDA task, which indicates that practice, experience and feedback increased their MCQ writing proficiency. Anat Sci Educ 10: 456–464. © 2017 American Association of Anatomists.

15.
The study examined the relationships between learning patterns and attitudes towards two assessment formats: open‐ended (OE) and multiple‐choice (MC), among students in higher education. Sixteen Semantic Differential scales measuring emotional reactions, intellectual reactions and appraisal of each assessment format, along with measures of learning processes, academic self‐concept and test anxiety, were administered to 58 students. Results indicated two patterns of relationships between the learning‐related variables and the assessment attitudes: high scores on the self‐concept measure and on the three measures of learning processes were related to positive attitudes towards the OE format but negative ones towards the MC format; low scores on the test anxiety measures were related to positive attitudes towards the OE format. In addition, significant gender differences emerged with respect to the MC format, with males having more favourable attitudes than females. Results were discussed in light of an adaptive assessment approach.

16.
Anatomical examinations have been designed to assess topographical and/or applied knowledge of anatomy with or without the inclusion of visual resources such as cadaveric specimens or images, radiological images, and/or clinical photographs. Multimedia learning theories have advanced the understanding of how words and images are processed during learning. However, the evidence of the impact of including anatomical and radiological images within written assessments is sparse. This study investigates the impact of including images within clinically oriented single-best-answer questions on students' scores in a tailored online tool. Second-year medical students (n = 174) from six schools in the United Kingdom participated voluntarily in the examination, and 55 students provided free-text comments which were thematically analyzed. All questions were categorized as to whether their stimulus format was purely textual or included an associated image. The type (anatomical and radiological image) and deep structure of images (question referring to a bone or soft tissue on the image) were taken into consideration. Students scored significantly better on questions with images compared to questions without images (P < 0.001), and on questions referring to bones than to soft tissue (P < 0.001), but no difference was found in their performance on anatomical and radiological image questions. The coding highlighted areas of “test applicability” and “challenges faced by the students.” In conclusion, images are critical in medical practice for investigating a patient's anatomy, and this study sets out a way to understand the effects of images on students' performance and their views in commonly employed written assessments.

17.
Educational Assessment, 2013, 18(2): 135-157
One of the reasons often cited for the low average level of proficiency demonstrated by U.S. students on national and international assessments is that there are no consequences or stakes attached to performance on the tests and, therefore, students are not motivated to invest their best effort. In this study, money was chosen as an incentive, but we hoped that short written instructions would be almost as powerful as money and easier and more desirable to implement in the National Assessment of Educational Progress (NAEP). Our results indicate that, at least for Grade 8 participants, student effort can be increased by financial rewards offered at the time of test taking, and that such effort can result in an increase in NAEP math test scores. Thus, from a policy perspective, scores from low-stakes tests may not represent what the student knows. Rather, such scores represent what students will demonstrate with minimal effort.

18.
The study was conducted to explore performance on a variety of mental computation tasks using two presentation formats (visual and oral). Students at four grade levels between grades 2 and 9 in three countries (Australia, Japan, United States) were given a group administered mental computation test consisting of two parts (oral presentation format, visual presentation format). The sample of nearly 2000 students represents 6 classes at each of four grade levels in each country. Results indicate a wide variation in performance within the sample of each country at each grade level. Differences in performance between countries are also apparent and may reflect variations in instructional focus on mental computation. In particular, Japanese students perform at a higher level at the early grades than do students in either of the other countries sampled. However, by grade 8 this difference narrows in the American sample, and vanishes for the Australian sample. Differences in performance related to presentation format were dramatic for particular items and non-existent for other items. The most consistent effect was found in the Japanese sample where the visual presentation format resulted in higher performance levels on most items. It is hypothesised that superior results on visually presented items are attributable to a greater reliance on use of the standard written algorithm, while superior results on orally presented items indicate a greater tendency to use invented mental algorithms.

19.
Writing assignments, including note taking and written recall, should enhance retention of knowledge, whereas analytical writing tasks with metacognitive aspects should enhance higher-order thinking. In this study, we assessed how certain writing-intensive “interventions,” such as written exam corrections and peer-reviewed writing assignments using Calibrated Peer Review and including a metacognitive component, improve student learning. We designed and tested the possible benefits of these approaches using control and experimental variables across and between our three-section introductory biology course. Based on assessment, students who corrected exam questions showed significant improvement on postexam assessment compared with their nonparticipating peers. Differences were also observed between students participating in written and discussion-based exercises. Students with low ACT scores benefited equally from written and discussion-based exam corrections, whereas students with midrange to high ACT scores benefited more from written than discussion-based exam corrections. Students scored higher on topics learned via peer-reviewed writing assignments relative to learning in an active classroom discussion or traditional lecture. However, students with low ACT scores (17–23) did not show the same benefit from peer-reviewed written essays as the other students. These changes offer significant student learning benefits with minimal additional effort by the instructors.

20.
This study investigated the usefulness of the bifactor model in the investigation of score equivalence from computerized and paper-and-pencil formats of the same reading tests. Concerns about the equivalence of the paper-and-pencil and computerized formats were warranted because of the use of reading passages, computer unfamiliarity of primary school students, teacher versus computer administration of the tests, and slightly lower scores on the computerized format than on the paper-and-pencil format across all 4 grades. A confirmatory item factor analysis implemented through the bifactor model in TESTFACT indicated that the best-fitting model had a general factor as well as skill-group factors. This model was more consistent with the data than a model with 2 method factors, paper-and-pencil and computer administration. In addition, the general and skill factor loadings for most of the items were reasonable. Although several instances of negative loadings were found for items on the skill factors, these did not appear to have any practical importance. As a result, the bifactor model proved useful for studying paper-and-pencil and computerized score equivalence because of the reasonable results and delineation of loadings for the method and skill factors at the item level as well as for the general factor.
