20 similar articles found (search time: 15 ms)
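Black-box testing and white-box testing are both important software testing methods. Black-box testers focus more on business behavior, while white-box testers focus more on implementation; black-box testing emphasizes the system as a whole, while white-box testing emphasizes its parts. The two are complementary.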

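School examinations are not only a technical means of educational evaluation but also an integral part of educational activity, a special kind of activity that "cultivates people". When the value and content of school examinations go astray, the cause lies not in whether people know the value and content of examinations, but in whether they see themselves as beneficiaries of examinations and whether they regard examinations as a way of living and developing. Educators should cultivate the professional competence and habits needed to fully grasp and scientifically apply examinations. Learners should take examinations willingly, meet them with confidence, and handle them skillfully, using examinations to regulate their own conduct, combining the growth of knowledge and ability with the development of character, and gradually guiding learning from interest toward enjoyment and aspiration, so that they become well-rounded, healthy, and happy people. This is what school examinations should and can test.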

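Through a study of the correlation between computer-assisted oral tests and face-to-face interviews, this paper examines whether computer-assisted oral testing can substitute for interviews. A questionnaire survey was also used to learn students' attitudes toward computer-assisted oral testing and the factors that influence oral test performance. The results show that scores on the computer-assisted oral test and the interview are significantly correlated and that the two formats are interchangeable; most students accept the computer-assisted format; and the main factor affecting oral test performance is familiarity with the questions and materials.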

This study examines the effects of using item response theory (IRT) ability estimates based on customized tests that were formed by selecting specific content areas from a nationally standardized achievement test. Subsets of items were selected from four different subtests of the Iowa Tests of Basic Skills (Hieronymus, Hoover, & Lindquist, 1985) on the basis of (a) selected content areas (content-customized tests) and (b) a representative sampling of content areas (representative-customized tests). For three of the four tests examined, ability estimates and estimated national percentile ranks based on the content-customized tests in school samples tended to be systematically higher than those based on the full tests. The results of the study suggested that for certain populations, IRT ability estimates and corresponding normative scores on content-customized versions of standardized achievement tests cannot be expected to be equivalent to scores based on the full-length tests.

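Classroom Assessment of Spoken English and Spoken English Teaching   (Total citations: 1; self-citations: 0; citations by others: 1)
Oral testing is receiving increasing attention from educators, and oral assessment is increasingly moving into the classroom. In response to shortcomings in current spoken English teaching, the author discusses the significance, forms, principles, content, and methods of classroom oral assessment. The discussion focuses on the principles that distinguish classroom oral assessment from ordinary oral tests and on the six key aspects of student performance that classroom oral assessment evaluates.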

Two experiments involving 72 elementary school-age children (mean age = 10 years, 3 months, range = 9,4-11,8) and 72 adults compared the ability of participants to choose positive and negative diagnostic tests over positive and negative nondiagnostic tests. Both experiments employed novel test materials, which resolved any issues regarding the effects of context on test strategy employment. In Experiment 1, both children and adults were significantly more likely to prefer positive diagnostic tests over positive nondiagnostic tests; however, only adults demonstrated a significant preference for negative diagnostic tests over positive nondiagnostic ones. In Experiment 2, both children and adults were more likely to choose negative diagnostic tests over negative nondiagnostic tests, demonstrating that despite a strong positive test bias, children could reason diagnostically in selecting among negative tests in cases in which only negative test choices were available.

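Reliability Analysis of Academic Achievement Tests   (Total citations: 1; self-citations: 0; citations by others: 1)
Reliability is indispensable to any valid test. Only a highly reliable test allows teachers to evaluate students objectively and dependably, and only then do test scores accurately reflect the level of the test-takers. Educational measurement and educational statistics have laid the theoretical foundation for making testing scientific and modern and have made test analysis quantitative, while the SPSS statistical package for the social sciences makes it possible for teachers to carry out quantitative, computer-based reliability analysis of academic achievement tests.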
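As an illustration of what a quantitative reliability analysis can look like, the sketch below computes Cronbach's alpha, a common internal-consistency coefficient. The abstract does not say which reliability coefficient the authors used, and they worked in SPSS rather than Python, so the function name `cronbach_alpha` and the data layout (one row of item scores per examinee) are assumptions made only for this example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (hypothetical helper).

    scores: list of rows, one row per examinee, each row a list of
    item scores of equal length.
    """
    n_examinees = len(scores)
    n_items = len(scores[0])
    if n_examinees < 2 or n_items < 2:
        raise ValueError("need at least 2 examinees and 2 items")

    # Sample variance of each item (column) across examinees.
    item_variances = [variance(column) for column in zip(*scores)]

    # Sample variance of each examinee's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)
```

For example, `cronbach_alpha([[2, 1], [1, 2], [3, 3]])` uses two items scored for three examinees. Values closer to 1 indicate that the items rank examinees more consistently.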

Nonparametric and robust statistics (those using trimmed means and Winsorized variances) were compared for their ability to detect treatment effects in the 2-sample case. In particular, 2 specialized tests, tests designed to be sensitive to treatment effects when the distributions of the data are skewed to the right, were compared with 2 nonspecialized nonparametric (Wilcoxon-Mann-Whitney; Mann & Whitney, 1947; Wilcoxon, 1949) and trimmed (Yuen, 1974) tests for 6 nonnormal distributions that varied according to their measures of skewness and kurtosis. As expected, the specialized tests provided more power to detect treatment effects, particularly for the nonparametric comparison. However, when distributions were symmetric, the nonspecialized tests were more powerful; therefore, for all the distributions investigated, power differences did not favor the specialized tests. Consequently, the specialized tests are not recommended; researchers would have to know the shapes of the distributions that they work with in order to benefit from specialized tests. In addition, the nonparametric approach resulted in more power than the trimmed-means approach did.
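The two robust quantities named in this abstract, the trimmed mean and the Winsorized variance, can be sketched as follows. This is a minimal illustration and not the test procedure used in the study: the function names and the default 20% trimming proportion are assumptions, and the code stops short of a full two-sample test such as Yuen's.

```python
import math
from statistics import mean, variance


def trimmed_mean(data, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of values."""
    xs = sorted(data)
    g = math.floor(proportion * len(xs))  # values cut from each tail
    if 2 * g >= len(xs):
        raise ValueError("trimming proportion removes all data")
    return mean(xs[g:len(xs) - g])


def winsorized_variance(data, proportion=0.2):
    """Sample variance after pulling each tail in to the nearest kept value."""
    xs = sorted(data)
    n = len(xs)
    g = math.floor(proportion * n)
    if n - 2 * g < 1 or n < 2:
        raise ValueError("not enough data for this trimming proportion")
    low, high = xs[g], xs[n - g - 1]
    # Replace values below `low` with `low` and values above `high` with `high`.
    winsorized = [min(max(x, low), high) for x in xs]
    return variance(winsorized)
```

With the right-skewed sample `[1, 2, 3, 4, 100]`, the trimmed mean ignores the outlier 100 and the Winsorized variance replaces it with 4, which is why these statistics are less sensitive to skewed tails than the ordinary mean and variance.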

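For nationwide tests, regular evaluation is essential. The key to evaluating a language test and studying its validity is the study of reliability, or consistency. This study uses parallel forms of the TEM4 and carries out reliability statistics and difference analysis on each. It not only examines the consistency between the parallel tests but also, where differences exist, locates the tests or items that differ. Locating them in this way will benefit future test construction, pretesting, and test assembly.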

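As China's basic education reform deepens, standardized tests have drawn growing criticism and doubt, making a comprehensive understanding of standardized testing especially urgent. A detailed historical examination of the rise of standardized testing in the United States, starting from its origins, offers an important perspective for understanding standardized tests fully and in depth. Standardized testing in the United States arose in the late 19th and early 20th centuries as a direct product of the emphasis on quantitative methods in American educational research and practice of that period. It is not only a common method for evaluating students in educational practice but also an important tool of educational research.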

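Modern educational measurement theory is applied to standardize mathematics tests in order to improve their validity and reliability, and thereby to make mathematics educational measurement standardized and scientific. The general principles also apply to tests in other subjects. Several issues related to examinations are also presented.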

In recent years there has been a debate over the alleged superiority of achievement tests over aptitude tests on the grounds that the first would be fairer for college admissions and less influenced by family background. The switch from aptitude tests to achievement tests in Chile presented a unique opportunity to examine this claim. Regression analysis was used to assess the impact of the change in test performance using data from seven cohorts of test-takers. The evidence does not support the superiority of achievement tests, particularly when these assess extensive contents.

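At present, most English examinations at Chinese universities use multiple-choice questions. From nationwide unified examinations down to the final examination for a single course, multiple-choice questions can account for as much as 85% of a test. This form of examination is also labeled "standardized testing" and "objective", and is credited with features such as "easy marking", which has led to an understanding of such items that is neither objective nor complete. Practice shows that the shortcomings and negative effects of such items are real and, in some respects, quite serious. An appropriate evaluation of multiple-choice questions is therefore of great significance for steering college English examinations in the right direction and for cultivating English-language talent.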

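Reforming Laboratory Assessment to Improve the Quality of Laboratory Teaching   (Total citations: 5; self-citations: 2; citations by others: 3)
Laboratory assessment consists of three parts: routine assessment, a practical operation examination, and written questions. Ensuring the accuracy of laboratory assessment is a difficult point in laboratory teaching. Only when the assessment content is scientific and reasonable and the assessment criteria are easy to apply can the assessment truthfully reflect the level of teaching and promote improvement in teaching quality.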

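College English Teaching Should Strengthen Testing Reform   (Total citations: 1; self-citations: 0; citations by others: 1)
English testing, as a concrete means of English teaching, is widely used in everyday teaching and other activities. Based on the purposes and uses of testing, this paper introduces four types of tests (proficiency tests, achievement tests, placement tests, and diagnostic tests) and offers views and reform measures on the design of test content, the choice of testing methods, and the relationship between testing and quality-oriented education as they arise in the practical implementation of testing.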

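With the development of computer technology and the Internet, paperless online testing has long been adopted by some large international examinations. The national Band 4 English online test, which China began implementing in 2008, has undoubtedly brought about a major change in assessment methods in the English teaching community. However, many universities still have not recognized the advantages of paperless online testing or the need to implement it. A comparison of two different examinations taken by nearly 4,000 students shows that paperless testing has clear advantages in saving energy and in improving work efficiency and work quality.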

Although word recognition tests continue to be widely employed by teachers, the application of out-of-date norms and varying methods of administration and scoring seriously reduce the usefulness of these tests. Largely as a result of approaches from teachers, it was decided to undertake a large-scale restandardization of the Burt-Vernon and Schonell tests to produce reliable norms and to standardize administration and scoring procedures. The opportunity was also taken to record the data in such a way as to enable the production of revised orders of words on both tests to reflect contemporary usage, and to relate reading attainment, as assessed by the tests, to three important factors: sex, school organization and socioeconomic status.

A total of ten first-grade teachers in an objectives-based reading program utilized on a biweekly basis three types of criterion tests: (1) individually-administered, constructed-response tests; (2) group-administered, selected-response tests with three choices per item; and (3) group-administered, selected-response tests with four choices per item. Scores on these tests and scores on an end-of-year, constructed-response posttest were collected on a sample of 40 Ss for each type of test. Both the individually-administered, constructed-response tests and the 4-choice, selected-response tests provided scores that accurately predicted end-of-year performance. The 3-choice tests did not. The use of 3-choice, selected-response tests in similar instructional programs is not recommended.

How has Item Response Theory helped solve problems in the development and use of computer-adaptive tests? Do we need to balance item content with computer-adaptive tests? Could we use IRT to evaluate unusual responses to computer-delivered tests?

The possibility of using diagnostic tests was explored and several tests were used with civil engineering students to see if they would be useful in selecting students for the civil engineering degree course. The performance of students on the tests used was compared with the examination results and it was found that the tests showed a little promise. It will be necessary to design a test specifically for civil engineering students. On the other hand, perhaps our methods of teaching and examining subjects fundamental to civil engineering should be examined in depth. The results from the tests used indicated a culture gap between home and overseas students and the need for a database of student information.
