Recent publications have drawn attention to the idea of utilizing prior information about the correlation structure to improve statistical power in cluster randomized experiments. Because power in cluster randomized designs is a function of many different parameters, it has been difficult for applied researchers to discern a simple rule explaining when prior correlation information will substantially improve power. This article provides bounds on the maximum possible improvement in power as a function of a single parameter, the number of clusters at the highest level of a multilevel experiment. The maximum improvement in power is less than 0.05 unless the number of clusters at the highest level is less than 20. Thus, the utility of using prior correlation information is limited to experiments with very small cluster-level sample sizes. Situations where small cluster-level sample sizes could still result in experiments with good statistical power are discussed, as is the relative utility of prior information about intracluster correlations as compared with covariate information that can explain cluster level variability in the outcome.

Gibbons and Chakraborti's (1991) interpretation of recent simulation results and their recommendations to researchers are misleading in some respects. The present note emphasizes that the Mann-Whitney test is not a suitable replacement for the Student t test when variances and sample sizes are unequal, irrespective of whether the assumption of normality is satisfied or violated. When normality and homogeneity of variance are both violated, an effective procedure, not widely known to researchers in education and psychology, is the Fligner-Policello test or, alternatively, the Welch t' test in conjunction with transformation of the original scores to ranks.
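The rank-transformed Welch procedure mentioned in this abstract can be sketched roughly as follows. This is an illustration only, not the note's own code: the function names (`midranks`, `welch_t`, `welch_t_on_ranks`) are hypothetical, ties are assumed to receive average ranks, and the sketch stops at the statistic and the Welch-Satterthwaite degrees of freedom rather than computing a p-value.

```python
from statistics import mean, variance


def midranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def welch_t(x, y):
    """Separate-variance (Welch) t statistic and Satterthwaite df."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / (v1 + v2) ** 0.5
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def welch_t_on_ranks(x, y):
    """Rank the pooled scores, then apply the Welch t test to the ranks."""
    ranks = midranks(list(x) + list(y))
    return welch_t(ranks[:len(x)], ranks[len(x):])
```

For example, `welch_t_on_ranks(group_a, group_b)` returns a `(t, df)` pair that would then be referred to a t distribution with `df` degrees of freedom.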


In recent years, there has been growing interest in the role of affect within education. Within this paper, the authors make a distinction between affective pedagogy, which they refer to as ways of teaching that are designed to evoke particular emotional states, and affective knowledge, which they refer to as aspects of knowledge or knowing which seem to bring forth particular emotions organically. Using explicit grammar knowledge as a test case, they explore student teachers’ affective responses to learning, drawing upon interview data and observations made during a series of grammar courses. They argue that grammar learning is a potential source of pleasure, wonder and intensity. The findings provide an important counter-narrative to the prevailing discourse of grammar as dull and threatening. They also draw broader conclusions about the significance of affect in education, drawing upon affect theory and recent work on epistemic emotions.

Background and purpose: Knowing how students learn physics is a central goal of physics education. The major purpose of this study is to examine the strength of the predictive power of students’ epistemic views and conceptions of learning in terms of their approaches to learning in physics. Sample, design and method: A total of 279 Taiwanese high school students ranging from 15 to 18 years old participated in this study. Three questionnaires for assessing high school students’ epistemic views on physics, conceptions of learning physics and approaches to learning physics were developed. Step-wise regression was performed to examine the predictive power of epistemic views on physics and conceptions of learning physics in terms of their approaches to learning physics. Results and conclusion: The results indicated that, in general, compared to epistemic views on physics, conceptions of learning physics are more powerful in predicting students’ approaches to learning physics in light of the regression models. That is, students’ beliefs about learning, compared with their beliefs about knowledge, may be more associated with their learning approaches. Moreover, this study revealed that the higher-level conceptions of learning physics such as ‘Seeing in a new way’ were more likely to be positively correlated with the deep approaches to learning physics, whereas the lower-level conceptions such as ‘Testing’ were more likely to positively explain the surface approaches, as well as to negatively predict the deep approaches to learning physics.


This article develops a new approach for calculating appropriate sample sizes for school-based randomized control trials (RCTs) with binary outcomes using logit models with and without baseline covariates. The theoretical analysis develops sample size formulas for clustered designs where random assignment is at the school or teacher level using generalized estimating equation methods. The article focuses on the impact parameter pertaining to rates and proportions rather than to the log odds of response, which has been the focus of the previous literature. The article also compiles intraclass correlations (ICCs) for the clustered design for a range of binary outcomes using data from seven education RCTs. These ICCs and the power formulas are then used to conduct a power analysis using a provided SAS macro; the key finding is that sample sizes of 40 to 60 schools that are typically included in clustered RCTs for student test score or behavioral scale outcomes will often be insufficient for binary outcomes. A key reason is that the potential for precision gains from regression adjustment is likely to be smaller for binary outcomes.


Program effectiveness reviews in education seek to provide educators with scientifically valid and useful summaries of evidence on achievement effects of various interventions. Different reviewers have different policies on measures of content taught in the experimental group but not the control group, called here treatment-inherent measures. These are contrasted with treatment-independent measures of content emphasized equally in experimental and control groups. The What Works Clearinghouse (WWC) averages effect sizes from such measures with those from treatment-independent measures, while the Best Evidence Encyclopedia excludes treatment-inherent measures. This article contrasts effect sizes from treatment-inherent and treatment-independent measures in WWC reading and math reviews to explore the degree to which these measures produce different estimates. In all comparisons, treatment-inherent measures produce much larger positive effect sizes than treatment-independent measures. Based on these findings, it is suggested that program effectiveness reviews exclude treatment-inherent measures, or at least report them separately.


Two complementary approaches to developing empirical benchmarks for achievement effect sizes in educational interventions are explored. The first approach characterizes the natural developmental progress in achievement made by students from one year to the next as effect sizes. Data for seven nationally standardized achievement tests show large annual gains in the early elementary grades followed by gradually declining gains in later grades. A given intervention effect will therefore look quite different when compared to the annual progress for different grade levels. The second approach explores achievement gaps for policy-relevant subgroups of students or schools. Data from national- and district-level achievement tests show that, when represented as effect sizes, student gaps are relatively small for gender and much larger for economic disadvantage and race/ethnicity. For schools, the differences between weak schools and average schools are surprisingly modest when expressed as student-level effect sizes. A given intervention effect viewed in terms of its potential for closing one of these performance gaps will therefore look very different depending on which gap is considered.


The dynamics of knowledge in society have transformed the conditions of professional work and learning. Professional expertise has become increasingly specialised, and practitioners are challenged to keep up with rapid developments in their fields. At the same time, the complexity of professional work requires the integration of different forms of knowledge and knowing. Against this background, the knowledge settings in which learners engage and the practices and resources these offer are of vital importance. This article addresses professional education as embedded in profession-specific ‘machineries of knowledge construction’, that is, the set of practices and arrangements through which knowledge and ways of knowing in a profession are generated. It is argued that such machineries span settings in education and work. Examples from research in three professional programmes are used to discuss how students are introduced to epistemic practices and resources in selected knowledge settings. Analytical attention is given to the dynamic interplay between people, practices, knowledge resources and educational arrangements as well as to how connections to work and the epistemic machinery are made. Taking these linkages into account is important for our understanding of what learning entails in different areas of expertise and how this may change over time.


This paper and the accompanying tool are intended to complement existing supports for conducting power analysis by offering a tool, based on the framework of Minimum Detectable Effect Size (MDES) formulae, that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and group-random assignment design studies and for common quasi-experimental design studies. The paper and accompanying tool cover computation of minimum detectable effect sizes under the following study designs: individual random assignment designs, hierarchical random assignment designs (2-4 levels), block random assignment designs (2-4 levels), regression discontinuity designs (6 types), and short interrupted time-series designs. In each case, the discussion and accompanying tool consider the key factors associated with statistical power and minimum detectable effect sizes, including the level at which treatment occurs and the statistical models (e.g., fixed effect and random effect) used in the analysis. The tool also includes a module that estimates, for one- and two-level random assignment design studies, the minimum sample sizes required for studies to attain user-defined minimum detectable effect sizes.
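As a rough illustration of the kind of calculation such a tool performs, the sketch below computes an MDES for a two-level design with random assignment of clusters, using the widely cited form MDES = M * sqrt(rho(1 - R2_2) / (P(1 - P)J) + (1 - rho)(1 - R2_1) / (P(1 - P)Jn)). It is not the paper's tool: the function name `mdes_two_level` and its parameter names are hypothetical, and the multiplier M is approximated here with normal quantiles, whereas t-based multipliers would give somewhat larger values when the number of clusters is small.

```python
from statistics import NormalDist


def mdes_two_level(J, n, rho, P=0.5, r2_level1=0.0, r2_level2=0.0,
                   alpha=0.05, power=0.80):
    """Approximate MDES for a two-level cluster random assignment design.

    J: number of clusters, n: individuals per cluster,
    rho: intraclass correlation, P: proportion of clusters treated,
    r2_level1 / r2_level2: variance explained by covariates at each level.
    Assumption: normal-quantile multiplier instead of a t-based one.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)  # two-tailed test
    between = rho * (1 - r2_level2) / (P * (1 - P) * J)
    within = (1 - rho) * (1 - r2_level1) / (P * (1 - P) * J * n)
    return multiplier * (between + within) ** 0.5
```

Calling, say, `mdes_two_level(J=40, n=25, rho=0.15)` and then repeating the call with a nonzero `r2_level2` shows how a cluster-level covariate shrinks the detectable effect under these assumptions.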

The primary purpose of the present paper is to call attention to Hawkins’ (4) procedure for testing a sequence of observations for a shift in location. Such a procedure could have applicability for assessing change within a single subject. Monte Carlo results are provided which suggest that Hawkins’ procedure is robust with respect to moderate violations of its underlying assumptions of homogeneity of variance and normality. A hypothetical example is also discussed for illustrative purposes.

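Application of Normality Test Methods in Teaching Research
Many researchers currently choose a normality test simply according to their own habits or preferences. In response, this article selects five commonly used normality tests (the Jarque-Bera, Shapiro-Wilk, D'Agostino, Kolmogorov-Smirnov, and Lilliefors tests) and briefly discusses each. The Monte Carlo method is used to analyze and compare the power or Type I error rate of the five tests under different distributions at different sample sizes. The choice of normality test is then discussed with reference to SAS, SPSS, and R, three statistical software packages commonly used in teaching, with the aim of providing a reference for researchers selecting a normality test.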
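To make one of the five tests concrete, the sketch below computes the Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis, together with its asymptotic p-value from the chi-square distribution with 2 degrees of freedom (whose survival function is exp(-x/2)). The function name `jarque_bera` is hypothetical and this is not code from the article; the asymptotic p-value is known to be unreliable at small sample sizes, which is the kind of behavior a Monte Carlo comparison like the one described would examine.

```python
import math


def jarque_bera(data):
    """Jarque-Bera statistic and asymptotic chi-square(2) p-value."""
    n = len(data)
    m = sum(data) / n
    # Central moments with divisor n, as in the usual JB definition.
    m2 = sum((v - m) ** 2 for v in data) / n
    m3 = sum((v - m) ** 3 for v in data) / n
    m4 = sum((v - m) ** 4 for v in data) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)  # survival function of chi-square with 2 df
    return jb, p_value
```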


Gender norms and learned practices of student teachers can influence their performance in practice, either fixing or challenging gendered social norms and expectations. This paper shares the findings of a multi-year mixed-methods research project that explored the understandings of gender norms and experiences of students and staff within a large teacher-training college in Tanzania. Data was collected to inform a wider gender mainstreaming initiative across the institution. Using a blend of quantitative and qualitative methods, the findings identified a strict and rigid gender binary which seemed to inform attitudes and practices of teaching and learning. Furthermore, it uncovered heterogeneous forms of gendered domination that were experienced by staff and pupils within the institution. The findings suggest that stand-alone ‘female only’ gender mainstreaming strategies may not be sufficient to achieve a gender equitable environment within the institution. Rather, it suggests that a whole-of-community approach is necessary to unravel deep-rooted biases and to tackle diverse forms of domination that affect different members of the college community in different ways. Such findings are particularly important in light of the epistemic power that is conferred on teacher-graduates and that is transferred through teaching practices to communities across Tanzania.

The study, using a Monte Carlo technique, was designed to investigate the effect of the differences in covariate means among the treatment groups on the significance level and the power of the F-test of the analysis of covariance. The results show that the covariate group means differences have little effect on the significance level if the covariate is highly correlated with the criterion variable. However, if the correlation is .4 or less, larger sample sizes are required. The effect on the power is more sensitive for smaller experiments. The larger the differences among covariate group means, the lower the actual power becomes compared to the approximate theoretical power.


The article addresses the implications of Prevent and Channel for epistemic justice. The first section outlines the background of Prevent. It draws upon Moira Gatens and Genevieve Lloyd’s concept of the collective imaginary, alongside Lorraine Code’s concept of epistemologies of mastery, in order to outline some of the images and imaginaries that inform and orient contemporary counter-terrorist preventative initiatives, in particular those affecting education. Of interest here is the way in which vulnerability (to radicalisation) is conceptualised in Prevent and Channel, in particular the way in which those deemed ‘at risk of radicalisation’ are constituted as vulnerable and requiring intervention. The imaginary underpinning such preventative initiatives is, I argue, a therapeutic/epidemiological one. If attention is paid to the language associated with these interventions, one finds reference to terms such as contagion, immunity, resilience, grooming, virus, susceptibility, therapy, autonomy, vulnerability and risk: a constellation of images/concepts resonant with therapeutic and epidemiological theories and practices. I outline some of the implications of this therapeutic/epidemiological imaginary for epistemic injustice. If people, in this case, students, teachers and parents, feel that their voice will not be given credence, this leads to testimonial injustice. If one group is constituted as a suspect community, this risks hermeneutical injustice for that group, a situation facing Muslims at present. Given the requirements for educators and educational institutions to enact this particular iteration of preventative counter-terrorist legislation, the way in which vulnerability (to radicalisation) is understood and operationalised has direct bearing upon education and the educational experience of all stakeholders, in particular in relation to the conditions for epistemic justice.

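Based on the logistic response model with binary response data, the response stimulus level to be estimated is transformed into a parameter of the model, and the saddlepoint approximation method is applied to obtain a higher-order approximation formula for the conditional distribution of the estimator of this stimulus level. On this basis, the fiducial model is introduced and used to give an interval estimate of the response stimulus level. Monte Carlo numerical simulation shows that, when the sample size is small, combining the fiducial model with the saddlepoint approximation method estimates the response stimulus level well.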


This study analyzes how the situation of the contemporary Spanish family influences the values education of children. The sample comprises 3,711 fathers and mothers of children aged between 2 and 12 years, from public and private schools in 6 Spanish cities. A questionnaire was used to assess, in the fathers and mothers, the variables considered relevant to the education of children in civic-moral values.

From the factor analysis, which combined exploratory and confirmatory strategies, 11 factors were obtained. To determine which factors explain the moral intervention of fathers and mothers, correlations and separate discriminant analyses were carried out. In both analyses, affection and unconditional acceptance of the children were the variables with the greatest predictive power for family moral intervention.

The interrelationship between senior high school students’ science achievement (SA) and their self‐confidence and interest in science (SCIS) was explored with a representative sample of approximately 1,044 11th‐grade students from 30 classes attending four high schools throughout Taiwan. Statistical analyses indicated that a statistically significant correlation existed between students’ SA and their SCIS with a moderate effect size; the correlation is even higher with almost large effect sizes for a subsample of higher‐SCIS and lower‐SCIS students. Results of t‐test analysis also revealed that there were significant mean differences in students’ SA and their knowledge (including physics, chemistry, biology, and earth sciences subscales) and reasoning skill subtests scores between higher‐SCIS and lower‐SCIS students, with generally large effect sizes. Stepwise regression analyses on higher‐SCIS and lower‐SCIS students also suggested that both students’ SCIS subscales significantly explain the variance of their SA, knowledge, and reasoning ability with large effect sizes.

When the assumption of multivariate normality is violated and the sample sizes are relatively small, existing test statistics such as the likelihood ratio statistic and Satorra–Bentler’s rescaled and adjusted statistics often fail to provide reliable assessment of overall model fit. This article proposes four new corrected statistics, aiming for better model evaluation with nonnormally distributed data at small sample sizes. A Monte Carlo study is conducted to compare the performances of the four corrected statistics against those of existing statistics regarding Type I error rate. Results show that the performances of the four new statistics are relatively stable compared with those of existing statistics. In particular, Type I error rates of a new statistic are close to the nominal level across all sample sizes under a condition of asymptotic robustness. Other new statistics also exhibit improved Type I error control, especially with nonnormally distributed data at small sample sizes.


Using a “naïve” specification, this paper estimates the relationship between 36 high school characteristics and 24 student outcomes controlling for students' pre-high school characteristics. The goal of this exploration is not to generate causal estimates, but rather to: (a) compare the size of the relationships to determine which inputs seem most promising and to identify which student outcomes appear most susceptible to being affected; (b) obtain likely upper-bound effect sizes that are useful information for power analyses used to establish minimum sample sizes for more robust designs capable of revealing causal impacts; and (c) illustrate how small effects over many outcomes (which are cumulatively important) can be easily missed. I find that most of the 36 inputs appear to have affected more outcomes than one would expect by chance, but that the apparent effects were generally small. Further, I find a higher frequency of large and significant apparent effects on educational achievement and attainment outcomes than labor market and other outcomes for young adults.

This paper presents the results of a simulation study to compare the performance of the Mann-Whitney U test, Student's t test, and the alternate (separate variance) t test for two mutually independent random samples from normal distributions, with both one-tailed and two-tailed alternatives. The estimated probability of a Type I error was controlled (in the sense of being reasonably close to the attainable level) by all three tests when the variances were equal, regardless of the sample sizes. However, it was controlled only by the alternate t test for unequal variances with unequal sample sizes. With equal sample sizes, the probability was controlled by all three tests regardless of the variances. When it was controlled, we also compared the power of these tests and found very little difference. This means that very little power will be lost if the Mann-Whitney U test is used instead of tests that require the assumption of normal distributions.
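A simulation of the general kind described here might be sketched as follows for the Mann-Whitney U test alone. This is a simplified illustration, not the study's design: the function names are hypothetical, the test uses the large-sample normal approximation without continuity or tie corrections, only the two-tailed case is covered, and the replication count and seed are arbitrary choices.

```python
import random
from statistics import NormalDist


def mann_whitney_z(x, y):
    """Large-sample z statistic for the Mann-Whitney U test."""
    n1, n2 = len(x), len(y)
    # U counts pairs with x above y; ties contribute one half.
    u = sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return (u - mu) / sigma


def type1_rate(n1, sd1, n2, sd2, reps=2000, alpha=0.05, seed=1):
    """Estimate the two-tailed rejection rate when both means are equal."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, sd1) for _ in range(n1)]
        y = [rng.gauss(0.0, sd2) for _ in range(n2)]
        if abs(mann_whitney_z(x, y)) > crit:
            rejections += 1
    return rejections / reps
```

Comparing `type1_rate(10, 1.0, 40, 1.0)` with `type1_rate(10, 4.0, 40, 1.0)` gives a quick check of how the rejection rate behaves once variances and sample sizes are both unequal.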

