Similar literature
1.
《Applied Measurement in Education》2013,26(4):321-335
This research evaluated two methods for structuring a performance domain for a certification test in emergency nursing. The first method utilized task frequency ratings obtained from a national sample of 659 emergency nurses who completed a job analysis survey consisting of 125 tasks. The second method was based on judgments of task similarity among the same 125 tasks; the similarity judgments were provided by a panel of 21 subject-matter experts. The two types of data, task frequency ratings and task similarity judgments, were subjected to cluster analysis and multidimensional scaling in order to derive a framework for organizing the 125 tasks into some limited number of performance categories. The similarity judgments produced fewer clusters and fewer dimensions and were more adequately modeled by both multivariate procedures. The results based on the similarity judgments were also easier to interpret within a conceptually meaningful framework. The results question the traditional approach of applying multivariate procedures to task frequency ratings for purposes of clustering tasks and structuring performance domains.

2.
The task inventory approach is commonly used in job analysis for establishing content validity evidence supporting the use and interpretation of licensure and certification examinations. Although the results of a task inventory survey provide job task-related information that can be used as a reliable and valid source for test development, it is often the knowledge, skills, and abilities (KSAs) required for performing the tasks, rather than the job tasks themselves, which are tested by licensure and certification exams. This article presents a framework that addresses the important role of KSAs in developing and validating licensure and certification examinations. This includes the use of KSAs in linking job task survey results to the test content outline, transferring job task weights to test specifications, and eventually applying the results to the development of the test items. The impact of using KSAs in the development of test specifications is illustrated from job analyses for two diverse professions. One method for transferring job task weights from the job analysis to test specifications through KSAs is also presented, along with examples. The two examples demonstrated in this article are taken from nursing certification and real estate licensure programs. However, the methodology for using KSAs to link job tasks and test content is also applicable in the development of teacher credentialing examinations.

3.
A method for combining multiple scale responses from job or task surveys based on a hierarchical ranking scheme is presented. A rationale for placing the resulting ordinal information onto an interval scale of measurement using the Rasch Rating Scale Model is also provided. After a simple linear transformation, the item or task parameter estimates can be used to obtain item weights to be used in constructing test blueprints. Prior weights can then be used to modify the item weights after data collection, based either on content balancing requirements or Bayesian prior content weights from SMEs (subject matter experts). Finally, a method is suggested to link two or more surveys, again using the Rasch Rating Scale Model and the computer program, Bigsteps, when it is desirable to shorten the length of the typical job or task survey.

4.
Providing appropriate test accommodations to most English language learners (ELLs) is important to facilitate meaningful inferences about learning. This study compared teacher large-scale test accommodation recommendations to those from a literature- and practitioner-grounded accommodation selection taxonomy. The taxonomy links student-specific needs, strengths, and schooling experiences to large-scale test accommodation recommendations that differentially minimize barriers of access for students with different profiles. A blind panel of experts rated four sets of recommendations for each of 114 ELLs. Results found the taxonomy was a significantly better fit for distinguishing accommodations by student need than teacher recommendations. Further, the fit of teacher recommendations showed no difference when the teacher used a structured data collection procedure to gather profile information about each of their ELLs and when they did not, and teachers’ recommendations were not found to differ significantly from a random set of accommodations. Findings are consistent with previous literature that suggests the task of matching specific accommodations to individual needs, rather than the task of identifying individual needs, is where teachers struggle in recommending appropriate test accommodations.

5.
In order to gain insight into preservice teachers' beliefs about planning for mathematics instruction, a study was carried out involving K‐8 teacher candidates enrolled in an elementary mathematics methods course. Doyle's (1992) notion of academic task and the research on pedagogical content knowledge served as the theoretical framework for this study. The teacher candidates submitted lesson plans at three intervals during a semester‐long methods course; the lesson plans were then coded based on candidates' planned uses of academic tasks. Analyses of the data revealed trends in these teacher candidates' design of academic tasks over the course of the semester. Recommendations and implications are presented highlighting the benefits of incorporating the knowledge base on academic task into a mathematics methods course as a means to contribute to teacher candidates' developing pedagogical content knowledge via their designing of academic tasks in lesson planning.

6.
Corporate and educational settings increasingly address more decision-making, problem-solving and other complex cognitive skills to handle complex cognitive, or heuristic, tasks, but the ever-increasing need for heuristic knowledge has outpaced the refinement of task analysis methods for heuristic expertise. Utilizing the Heuristic Task Analysis (HTA) process, a method developed for eliciting, analyzing, and representing expertise in complex cognitive tasks, a formative research study was conducted on the task of group counseling to further improve the HTA process. Implications of the findings include the need for incorporating various interview strategies and techniques, developing strategies for working with multiple experts, and considering the level of task expertise of the analyst. A revised version of the HTA process is presented based on these implications.

7.
Practice analysis (i.e., job analysis) serves as the cornerstone for the development of credentialing examinations and is generally used as the primary source of evidence when validating scores on such exams. Numerous methodological questions arise when planning and conducting a practice analysis, but there is little consensus in the measurement community regarding the answers to these questions. This article offers recommendations concerning the following issues: selecting a method of practice analysis; developing rating scales to describe practice; determining the content of test plans; using multivariate procedures for structuring test plans; and determining topic weights for test plans. The article closes by suggesting several references for further reading.

8.
The alignment of test items to content standards is critical to the validity of decisions made from standards‐based tests. Generally, alignment is determined based on judgments made by a panel of content experts with either ratings averaged or via a consensus reached through discussion. When the pool of items to be reviewed is large, or the content‐matter experts are broadly distributed geographically, panel methods present significant challenges. This article illustrates the use of an online methodology for gauging item alignment that does not require that raters convene in person, reduces the overall cost of the study, increases time flexibility, and offers an efficient means for reviewing large item banks. Latent trait methods are applied to the data to control for between‐rater severity, evaluate intrarater consistency, and provide item‐level diagnostic statistics. Use of this methodology is illustrated with a large pool (1,345) of interim‐formative mathematics test items. Implications for the field and limitations of this approach are discussed.

9.
《Educational Assessment》2013,18(3):225-253
Because of plans for state-by-state reporting of 1992 reading data from the National Assessment of Educational Progress (NAEP), we investigated the adequacy of the process used to develop the assessment, the degree to which it represents a consensus among professionals in the reading field, and its content and curricular validity. To carry out this investigation, we analyzed documents produced by NAEP, convened a 2-day panel of experts, held two public colloquia, conducted 50 interviews, and analyzed responses to a questionnaire completed by 627 leading educators. We found that the planning process did not include enough time to address some major concerns of the field. Despite this, there was widespread agreement that the 1992 NAEP in Reading represents important advances in reading assessment, including more open-ended responses, more authentic texts, and student choice about passages. But these very advances raise problems for test design and the interpretation and scoring of student responses.

10.
During the development of large‐scale curricular achievement tests, recruited panels of independent subject‐matter experts use systematic judgmental methods (often collectively labeled “alignment” methods) to rate the correspondence between a given test's items and the objective statements in a particular curricular standards document. High disagreement among the expert panelists may indicate problems with training, feedback, or other steps of the alignment procedure. Existing procedural recommendations for alignment reviews have been derived largely from single‐panel research studies; support for their use during operational large‐scale test development may be limited. Synthesizing data from more than 1,000 alignment reviews of state achievement tests, this study identifies features of test–standards alignment review procedures that impact agreement about test item content. The researchers then use their meta‐regression results to propose some practical suggestions for alignment review implementation.

11.
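To address incomplete information about indicator weights and uncertainty in indicator scores in teaching quality evaluation, this article uses interval numbers to express the evaluation information given by experts and builds a teaching quality evaluation model under incomplete information. The weights of the evaluation indicators are determined objectively by solving a multi-objective programming model, which yields a new method for evaluating teaching quality in higher education. An application example demonstrates the effectiveness and practicality of the method.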

12.
The current study evaluated the use of virtual reality (VR) and augmented reality (AR) platforms, developed within the scope of the SKILLS Integrated Project, for industrial maintenance and assembly (IMA) tasks training. VR and AR systems are now widely regarded as promising training platforms for complex and highly demanding IMA tasks. However, there is a need to empirically evaluate their efficiency and effectiveness compared to traditional training methods. Forty expert technicians were randomly assigned to four training groups in an electronic actuator assembly task: VR (training with the VR platform twice), Control-VR (watching a filmed demonstration twice), AR (training with the AR platform once), and Control-AR (training with the real actuator and the aid of a filmed demonstration once). A post-training test evaluated performance in the real task. Results demonstrate that, in general, the VR and AR training groups required longer training time compared to the Control-VR and Control-AR groups, respectively. There were fewer unsolved errors in the AR group compared to the Control-AR group, and no significant differences in final performance between the VR and Control-VR groups, probably due to a ceiling effect created by the use of two training trials in the selected task for participants who were expert technicians. The results suggest that use of the AR platform for training IMA tasks should be encouraged and use of the VR platform for that purpose should be further evaluated.

13.
In the present study, information processing of test anxiety is explained within the framework of the ACT* model. The author used the speed-accuracy tradeoff method to investigate the effect of test anxiety on each subsystem of working memory. The sample was made up of 119 college students enrolled in an educational psychology course. Test anxiety affected performance on the verbal-analogies task but not on the rhyming-judgment and visual-spatial tasks. The participants' subvocalization of the rhyming words may have drawn attention to the task itself and preempted the effect of test anxiety on task performance. Also, the activation processes for the visual-spatial tasks may have occurred in a different dimension or separate from the verbal processes of test anxiety.

14.
Objectives‐based instructional design approaches break down tasks into specific learning objectives and prescribe that instructors should choose the optimal instructional method for teaching each respective objective until all objectives have been taught. This approach is appropriate for many tasks where there is little relation between the objectives, but less effective for teaching complex professional tasks that require the integration of knowledge, skills, and attitudes and the coordination of different skills. For the latter, a task‐centred approach that starts designing instruction from whole, real‐life tasks, is more appropriate. This article describes one task‐centred instructional design model, namely the Four‐Component Instructional Design (4C/ID) model and illustrates its application by reflecting on three educational programs in higher education designed with 4C/ID. The first case presents a design for a course that focuses on the development of mobile apps at the Amsterdam University of Applied Sciences in the Netherlands. The second case illustrates the integration of information problem‐solving skills at Iselinge University of Professional Teacher Education, a teacher training institute in the Netherlands. The third case is an example from general practice education at the KU Leuven, Belgium. Future developments and issues concerning the implementation of task‐centred educational programmes are discussed.

15.
There has been a steady interest in investigating the validity of language tests in the last decades. Despite numerous studies on construct validity in language testing, there are not many studies examining the construct validity of a reading test. This paper reports on a study that explored the construct validity of the English reading test in the Nepalese school leaving examination. Eight students were asked to take the test and think-aloud, followed by retrospective interviews. Additionally, seven experts were asked to make judgments regarding the skills tested by the test. The findings provide grounded insights into students’ response behaviors prompted by the reading tasks, and indicate some threats to the construct validity of the test. Additionally, the study reports a low level of agreement among the experts, and a big gap between the skills used by the students and the skills that the experts thought were being examined by the test.

16.
The majority of children and adults with reading disabilities exhibit pronounced difficulties on naming-speed measures such as tests of rapid automatized naming (RAN). RAN tasks require speeded naming of serially presented stimuli and share key characteristics with reading, but different versions of the RAN task vary in their sensitivity: The RAN letters task successfully predicts reading ability, whereas the RAN objects task does not reliably predict reading after kindergarten. In this study we used functional magnetic resonance imaging to evaluate the neural substrates that may underlie performance on these tasks. In two scans during the same test session, adult, average readers covertly rapidly named objects or letters or passively viewed a fixation matrix of plus signs. For both rapid naming tasks compared with fixation, activation was found in neural areas associated with eye movement control and attention as well as in a network of structures previously implicated in reading tasks. This reading network included inferior frontal cortex, temporo-parietal areas, and the ventral visual stream. Whereas the inferior frontal areas of the network were similarly activated for both letters and objects, activation in the posterior areas varied by task. The letters task caused greater activation in the angular gyrus, superior parietal lobule, and medial extrastriate areas, whereas object naming only preferentially activated an area of the fusiform gyrus. These results confirm that RAN tasks recruit a network of neural structures also involved in more complex reading tasks and suggest that the RAN letters task specifically pinpoints key components of this network.

17.
Combinations of five methods of equating test forms and two methods of selecting samples of students for equating were compared for accuracy. The two sampling methods were representative sampling from the population and matching samples on the anchor test score. The equating methods were the Tucker, Levine equally reliable, chained equipercentile, frequency estimation, and item response theory (IRT) 3PL methods. The tests were the Verbal and Mathematical sections of the Scholastic Aptitude Test. The criteria for accuracy were measures of agreement with an equivalent-groups equating based on more than 115,000 students taking each form. Much of the inaccuracy in the equatings could be attributed to overall bias. The results for all equating methods in the matched samples were similar to those for the Tucker and frequency estimation methods in the representative samples; these equatings made too small an adjustment for the difference in the difficulty of the test forms. In the representative samples, the chained equipercentile method showed a much smaller bias. The IRT (3PL) and Levine methods tended to agree with each other and were inconsistent in the direction of their bias.

18.
Performance assessments are typically scored by having experts rate individual performances. The cost associated with using expert raters may represent a serious limitation in many large-scale testing programs. The use of raters may also introduce an additional source of error into the assessment. These limitations have motivated development of automated scoring systems for performance assessments. Preliminary research has shown these systems to have application across a variety of tasks ranging from simple mathematics to architectural problem solving. This study extends research on automated scoring by comparing alternative automated systems for scoring a computer simulation test of physicians' patient management skills; one system uses regression-derived weights for components of the performance, the other uses complex rules to map performances into score levels. The procedures are evaluated by comparing the resulting scores to expert ratings of the same performances.

19.
Since 1971 there have been a number of studies in which a cut score has been set using a method proposed by Angoff (1971). In this method, each member of a panel of judges estimates for each test question the proportion correct for a specific target group of examinees. Prior and contemporary research suggests that this is a difficult task for judges. Angoff also proposed that judges simply indicate whether or not an examinee from the target group will be able to answer each question correctly (the yes/no method). We report on the results of two studies that compare a yes/no estimation with a proportion correct estimation. The two studies demonstrate that both methods produce essentially equal cut scores and that judges find the yes/no method more comfortable to use than the estimated proportion correct method.
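
The two judgment formats described in this abstract can be illustrated with a minimal sketch. It assumes one common aggregation rule, which the abstract itself does not state: average the judges' estimates for each question, then sum those averages over the questions to get a raw cut score. The function name `angoff_cut_score` and the ratings below are hypothetical, made up purely for illustration.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return a raw cut score from a judges-by-items matrix of ratings.

    ratings[j][i] is judge j's estimate for item i: a proportion in [0, 1]
    for the proportion-correct format, or 0/1 for the yes/no format.
    Assumed rule (not taken from the abstract): mean over judges per item,
    then sum over items.
    """
    n_items = len(ratings[0])
    if any(len(judge) != n_items for judge in ratings):
        raise ValueError("every judge must rate every item")
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return sum(item_means)


# Hypothetical ratings from two judges on a four-item test.
proportion_ratings = [
    [0.5, 0.75, 0.25, 1.0],
    [0.5, 0.25, 0.75, 0.5],
]
yes_no_ratings = [
    [1, 1, 0, 1],
    [0, 1, 1, 0],
]

print(angoff_cut_score(proportion_ratings))  # 2.25
print(angoff_cut_score(yes_no_ratings))      # 2.5
```

With the same aggregation applied to both formats, the only difference between the two methods in this sketch is the granularity of each judge's input, which is the comparison the two studies make.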

20.
Seventy-one college general biology students were taught a unit in Mendelian genetics by the traditional lecture method. Emphasis was placed on meiotic formation of gametes, the Law of Segregation, and the Law of Independent Assortment. The Punnett-square model was used for all practice problems. Eight weeks later, a content-validated retention test was given to evaluate the students' retention of problem-solving skills. The test required students to use proportional reasoning (identifying ratios from the Punnett squares), combinatorial reasoning (identifying combinations of gametes from parental genotypes), and probabilistic reasoning (estimating gamete or offspring probabilities). Each of the 71 students was also given three Piagetian interview tasks to evaluate intellectual development in the areas of reasoning under question. The balance-beam task, the electronic switch-box task, and colored squares and diamonds were used to test for proportional reasoning, combinatorial reasoning, and probabilistic reasoning, respectively. Pearson correlations and factor analysis failed to show direct relationships among Piagetian tasks for the three kinds of reasoning and their corresponding occurrence in genetics problems. Some correlations were higher between different reasoning types than between similar types. Analysis of variance showed significant differences for all three reasoning types among concrete-operational, transitional, and formal-operational students with the retention test. Post-hoc analysis of ANOVAs indicated that formal-operational students had significantly more success in the three reasoning areas than transitional students, and transitional students had significantly more success than concrete-operational students.
