Similar literature
 Found 20 similar documents; search took 15 ms
1.
Abstract

Field experiments that involve nested structures frequently assign treatment conditions to entire groups (such as schools). A key aspect of the design of such experiments includes knowledge of the clustering effects that are often expressed via intraclass correlation. This study provides methods for constructing a more powerful test for the treatment effect in three-level cluster randomized designs with two levels of nesting (at the second and third levels). When the intraclass correlation structure at the second and third level is assumed to be known, the proposed test provides higher estimates of power than those obtained from the typical test based on level-3 unit means, because it preserves the degrees of freedom associated with the number of level-2 and level-1 units. The advantage in power estimates is more pronounced when the number of level-3 units (e.g., schools) is small and the samples are homogeneous (e.g., low-achieving schools).

2.
Abstract

This paper and the accompanying tool are intended to complement existing supports for conducting power analysis by offering a tool based on the framework of Minimum Detectable Effect Sizes (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and group-random assignment design studies and for common quasi-experimental design studies. The paper and accompanying tool cover computation of minimum detectable effect sizes under the following study designs: individual random assignment designs, hierarchical random assignment designs (2-4 levels), block random assignment designs (2-4 levels), regression discontinuity designs (6 types), and short interrupted time-series designs. In each case, the discussion and accompanying tool consider the key factors associated with statistical power and minimum detectable effect sizes, including the level at which treatment occurs and the statistical models (e.g., fixed effect and random effect) used in the analysis. The tool also includes a module that estimates, for one- and two-level random assignment design studies, the minimum sample sizes required for studies to attain user-defined minimum detectable effect sizes.
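
As a rough illustration of the kind of formula an MDES tool evaluates, the sketch below computes the MDES for a two-level cluster randomized design without covariates. It is not taken from the paper or its tool: the function name `mdes_two_level_crt` and its parameters are hypothetical, and the multiplier uses a normal approximation (about 2.80 for a two-sided test at alpha = .05 and power = .80) instead of the t-based multiplier, which depends on the number of clusters.

```python
from math import sqrt
from statistics import NormalDist


def mdes_two_level_crt(n_clusters, cluster_size, icc,
                       prop_treated=0.5, alpha=0.05, power=0.80):
    """Approximate MDES (in standard deviation units) for a two-level
    cluster randomized design with no covariates.

    Assumption: a normal approximation replaces the t-based multiplier,
    so the result is slightly optimistic when n_clusters is small.
    """
    z = NormalDist()
    # Multiplier for a two-sided test: z_(1 - alpha/2) + z_(power).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # P(1 - P)J, where P is the proportion of clusters assigned to treatment.
    denom = prop_treated * (1 - prop_treated) * n_clusters
    # Between-cluster part plus within-cluster part of the variance
    # of the standardized impact estimate.
    variance = icc / denom + (1 - icc) / (denom * cluster_size)
    return multiplier * sqrt(variance)


if __name__ == "__main__":
    # Hypothetical design: 40 schools of 20 students each, ICC = 0.20.
    print(round(mdes_two_level_crt(40, 20, 0.20), 3))
```

Under these assumptions, adding clusters shrinks both variance terms, whereas adding individuals per cluster shrinks only the second, which is why the level at which treatment occurs matters so much for the MDES.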

3.
Abstract

Experiments that involve nested structures may assign treatment conditions either to subgroups (such as classrooms) or individuals within subgroups (such as students). The design of such experiments requires knowledge of the intraclass correlation structure to compute the sample sizes necessary to achieve adequate power to detect the treatment effect. This study provides methods for computing power in three-level block randomized balanced designs (with two levels of nesting) where, for example, students are nested within classrooms and classrooms are nested within schools. The power computations take into account nesting effects at the second (classroom) and at the third (school) level, sample size effects (e.g., number of level-1, level-2, and level-3 units), and covariate effects (e.g., pretreatment measures). The methods are generalizable to quasi-experimental studies that examine group differences on an outcome.

4.
This article proposes a novel exploratory approach for assessing how the effects of Level-2 predictors differ across Level-1 units. Multilevel regression mixture models are used to identify latent classes at Level 1 that differ in the effect of 1 or more Level-2 predictors. Monte Carlo simulations are used to demonstrate the approach with different sample sizes and to demonstrate the consequences of constraining 1 of the random effects to 0. An application of the method to evaluate heterogeneity in the effects of classroom practices on students is used to show the types of research questions that can be answered with this method and the issues faced when estimating multilevel regression mixtures.

5.
Abstract

Experiments that involve nested structures often assign entire groups (such as schools) to treatment conditions. Key aspects of the design of such experiments include knowledge of the intraclass correlation structure and the sample sizes necessary to achieve adequate power to detect the treatment effect. This study provides methods for computing power in three-level cluster randomized balanced designs (with two levels of nesting), where, for example, students are nested within classrooms, classrooms are nested within schools, and schools are assigned to treatments. The power computations take into account nesting effects at the second (classroom) and at the third (school) level, sample size effects (e.g., number of schools, classrooms, and individuals), and covariate effects (e.g., pretreatment measures). The methods are applicable to quasi-experimental studies that examine group differences in an outcome.
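
To make the role of the two intraclass correlations concrete, here is a minimal sketch of a power calculation for a balanced three-level cluster randomized design without covariates. It is an illustration under stated assumptions, not the study's own procedure: the function names are hypothetical, the design effect 1 + (pn - 1)ρ3 + (n - 1)ρ2 follows from decomposing the variance of a school mean, and the normal distribution stands in for the noncentral t distribution, so the result overstates power when there are few schools.

```python
from math import sqrt
from statistics import NormalDist


def design_effect_three_level(n_per_class, classes_per_school,
                              icc_class, icc_school):
    """Variance inflation of a school mean relative to simple random sampling."""
    pn = classes_per_school * n_per_class
    return 1 + (pn - 1) * icc_school + (n_per_class - 1) * icc_class


def power_three_level_crt(effect_size, schools_per_arm, classes_per_school,
                          n_per_class, icc_class, icc_school, alpha=0.05):
    """Approximate two-sided power for the treatment effect.

    Assumption: normal approximation to the test statistic (no
    degrees-of-freedom correction), equal numbers of schools per arm.
    """
    deff = design_effect_three_level(n_per_class, classes_per_school,
                                     icc_class, icc_school)
    n_per_arm = schools_per_arm * classes_per_school * n_per_class
    # Standardized effect divided by its standard error.
    ncp = effect_size * sqrt(n_per_arm / 2) / sqrt(deff)
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(ncp - crit) + z.cdf(-ncp - crit)


if __name__ == "__main__":
    # Hypothetical design: 20 schools per arm, 2 classrooms per school,
    # 20 students per classroom, class-level ICC .05, school-level ICC .15.
    print(round(power_three_level_crt(0.3, 20, 2, 20, 0.05, 0.15), 3))
```

Because the school-level ICC is multiplied by (pn - 1), it dominates the design effect, so adding schools typically buys more power than adding classrooms or students within schools.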

6.
The purpose of this study was to examine the impact of misspecifying a growth mixture model (GMM) by assuming that Level-1 residual variances are constant across classes, when they do, in fact, vary in each subpopulation. Misspecification produced bias in the within-class growth trajectories and variance components, and estimates were substantially less precise than those obtained from a correctly specified GMM. Bias and precision became worse as the ratio of the largest to smallest Level-1 residual variances increased, class proportions became more disparate, and the number of class-specific residual variances in the population increased. Although the Level-1 residuals are typically of little substantive interest, these results suggest that researchers should carefully estimate and report these parameters in published GMM applications.

7.
The present article considers a fundamental question in evaluation research: “By how much do program effects vary across sites?” The article first presents a theoretical model of cross-site impact variation and a related estimation model with a random treatment coefficient and fixed site-specific intercepts. This approach eliminates several biases that can arise from unbalanced sample designs for multisite randomized trials. The article then describes how the approach operates, explores its assumptions, and applies the approach to data from three large welfare-to-work trials. The article also illustrates how to report cross-site impact findings and presents diagnostics for assessing these findings. To keep the article manageable, it focuses on experimental estimates of effects of program assignment (effects of intent to treat), although the ideas presented can be extended to analyses of multisite quasi-experiments and experimental estimates of effects of program participation (complier average causal effects).

8.
Ignoring a level can have a substantial impact on the conclusions of a multilevel analysis. For intercept-only models and for balanced data, we derive these effects analytically. For more complex random intercept models or for unbalanced data, a simulation study is performed. The most important effects concern estimates and corresponding standard errors of the variance parameters at adjacent levels and of the coefficients of the predictors at the ignored and bordering levels. Therefore, we conclude that if the researcher is interested in a specific level, she/he should account for both the upper and lower level. Conclusions are illustrated using empirical data from educational research.

9.
The purpose of this study was to investigate the methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, ‘students within schools’ and ‘students within schools and subject areas,’ were conceptualized and implemented in this study. Four methods resulting from the combination of these two approaches with generalizability theory and multilevel models were compared for both balanced and unbalanced data. The generalizability theory and multilevel models for the ‘students within schools’ approach produced the same variance components and reliability estimates for the balanced data, while failing to do so for the unbalanced data. The different results from the two models can be explained by the fact that they apply different procedures in estimating the variance components used, in turn, to estimate reliability. Among the estimation methods investigated in this study, the generalizability theory model with the ‘students nested within schools crossed with subject areas’ design produced the lowest reliability estimates. Fully nested designs such as (students:schools) or (subject areas:students:schools) would not have any significant impact on reliability estimates of school-level scores. Both methods provide very similar reliability estimates of school-level scores.

10.
The authors used Monte Carlo methods to examine the Type I error rates for randomization tests applied to single-case data arising from ABAB designs involving random, systematic, or response-guided assignment of interventions. Six randomization tests were examined (permuting blocks of 1, 2, 3, or 5 observations, and randomly selecting intervention triplets so that each phase has at least 3 or 5 observations). When the design included randomization, the Type I error rate was controlled. When the design was systematic or guided by the absolute value of the slope, the tests permuting blocks tended to be liberal with positive autocorrelation, whereas those based on the random selection of intervention triplets tended to be conservative across levels of autocorrelation.
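
The "random selection of intervention triplets" variant can be sketched as follows. This is a minimal illustration of one common form of the test, not the authors' code: the function names are hypothetical, the test statistic (mean of the B phases minus mean of the A phases) and the one-sided p-value are assumptions, and the reference distribution is built by enumerating every triplet of phase-change points that leaves each of the four phases with at least `min_len` observations.

```python
from itertools import combinations
from statistics import mean


def admissible_triplets(n_obs, min_len):
    """All (t1, t2, t3) change points giving four phases of >= min_len points."""
    return [
        (t1, t2, t3)
        for t1, t2, t3 in combinations(range(1, n_obs), 3)
        if t1 >= min_len and t2 - t1 >= min_len
        and t3 - t2 >= min_len and n_obs - t3 >= min_len
    ]


def ab_difference(y, triplet):
    """Mean of the two B phases minus mean of the two A phases."""
    t1, t2, t3 = triplet
    a_phases = y[:t1] + y[t2:t3]   # first and second baseline phases
    b_phases = y[t1:t2] + y[t3:]   # first and second intervention phases
    return mean(b_phases) - mean(a_phases)


def randomization_p_value(y, observed_triplet, min_len=3):
    """One-sided p-value: share of admissible triplets whose statistic is
    at least as large as the one for the triplet actually used."""
    triplets = admissible_triplets(len(y), min_len)
    observed = ab_difference(y, observed_triplet)
    extreme = sum(ab_difference(y, t) >= observed for t in triplets)
    return extreme / len(triplets)


if __name__ == "__main__":
    # Hypothetical series of 14 observations with change points at 4, 7, 10.
    series = [0, 0, 0, 0, 5, 5, 5, 0, 0, 0, 5, 5, 5, 5]
    print(randomization_p_value(series, (4, 7, 10), min_len=3))
```

The test is valid only if the change points really were drawn at random from the admissible set before data collection; the smallest attainable p-value is one over the number of admissible triplets, so short series cannot reach conventional significance levels.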

11.
Contemporary educational accountability systems, including state‐level systems prescribed under No Child Left Behind as well as those envisioned under the “Race to the Top” comprehensive assessment competition, rely on school‐level summaries of student test scores. The precision of these score summaries is almost always evaluated using models that ignore the classroom‐level clustering of students within schools. This paper reports balanced and unbalanced generalizability analyses investigating the consequences of ignoring variation at the level of classrooms within schools when analyzing the reliability of such school‐level accountability measures. Results show that the reliability of school means cannot be determined accurately when classroom‐level effects are ignored. Failure to take between‐classroom variance into account biases generalizability (G) coefficient estimates downward and standard errors (SEs) upward if classroom‐level effects are regarded as fixed, and biases G‐coefficient estimates upward and SEs downward if they are regarded as random. These biases become more severe as the difference between the school‐level intraclass correlation (ICC) and the class‐level ICC increases. School‐accountability systems should be designed so that classroom (or teacher) level variation can be taken into consideration when quantifying the precision of school rankings, and statistical models for school mean score reliability should incorporate this information.

12.
Single subject (SS) designs are popular in educational and psychological research. There exist several statistical techniques designed to analyze such data and to address the question of whether an intervention has the desired impact. Recently, researchers have suggested that generalized additive models (GAMs) might be useful for modeling nonlinear effects that are common with SS designs. This study sought to extend the use of GAM from SS to a research design in which individuals may be placed in separate groups and receive different interventions. Results of the simulation study found that using a mixed model form of GAM (GAMM) resulted in higher power for detecting actual effects in the population than was true for either GAM or a Bayesian GAM estimator. Thus, GAMMs are recommended for use with SS designs when interventions are expected to induce nonlinear relationships between time and the outcome variable and individuals receive different treatments.

13.
Statistical power was estimated for 3 randomization tests used with multiple-baseline designs. In 1 test, participants were randomly assigned to baseline conditions; in the 2nd, intervention points were randomly assigned; and in the 3rd, the authors used both forms of random assignment. Power was studied for several series lengths (N = 10, 20, 30), several effect sizes (d = 0, 0.5, 1.0, 1.5, 2.0), and several levels of autocorrelation among the errors (ρ₁ = 0, .1, .2, .3, .4, and .5). Power was found to be similar among the 3 tests. Power was low for effect sizes of 0.5 and 1.0 but was often adequate (> .80) for effect sizes of 1.5 and 2.0.

14.
When good model-data fit is observed, the Many-Facet Rasch (MFR) model acts as a linking and equating model that can be used to estimate student achievement, item difficulties, and rater severity on the same linear continuum. Given sufficient connectivity among the facets, the MFR model provides estimates of student achievement that are equated to control for differences in rater severity. Although several different linking designs are used in practice to establish connectivity, the implications of design differences have not been fully explored. Research is also limited related to the impact of model-data fit on the quality of MFR model-based adjustments for rater severity. This study explores the effects of linking designs and model-data fit for raters on the interpretation of student achievement estimates within the context of performance assessments in music. Results indicate that performances cannot be effectively adjusted for rater effects when inadequate linking or model-data fit is present.

15.
The authors sought to identify through Monte Carlo simulations those conditions for which analysis of covariance (ANCOVA) does not maintain adequate Type I error rates and power. The conditions that were manipulated included assumptions of normality and variance homogeneity, sample size, number of treatment groups, and strength of the covariate-dependent variable relationship. Alternative tests studied were Quade's procedure, Puri and Sen's solution, Burnett and Barr's rank difference scores, Conover and Iman's rank transformation test, Hettmansperger's procedure, and the Puri-Sen-Harwell-Serlin test. For balanced designs, the ANCOVA F test was robust and was often the most powerful test through all sample-size designs and distributional configurations. With unbalanced designs, with variance heterogeneity, and when the largest treatment-group variance was matched with the largest group sample size, the nonparametric alternatives generally outperformed the ANCOVA test. When sample size and variance ratio were inversely coupled, all tests became very liberal; no test maintained adequate control over Type I error.

16.
Educational Assessment, 2013, 18(2): 105-123
Achievement data from a longitudinally matched student cohort from a large school district in the southwestern United States were analyzed to investigate sample exclusion and student attrition effects on estimates of student, school, and district mathematics performance. Use of 2- and 3-level longitudinal growth models to estimate the growth trajectories of middle school students revealed that mathematics performance differed across 2 sample conditions. Relative to the achievement outcomes associated with a sample that included all students from the longitudinal cohort, district and school achievement were generally higher and student group performance more similar in the smaller, more advantaged student sample used for district accountability reporting. Further investigation of the school performance estimates showed that cross-sample changes in student achievement outcomes were closely related to the proportion of students from special student populations who were excluded from the district accountability sample. The achievement differences and the differential patterns of association demonstrated in this study suggest that conclusions drawn about district and school performance and relationships between student characteristics and student achievement outcomes may depend to some degree on which students are included in an analytic sample. Investigators seeking to take advantage of longitudinal designs in school effectiveness research are cautioned to closely examine their data for nonrandom student attrition and document the impact of sample exclusion and student attrition effects in the research and accountability reports that are produced from longitudinal data sets.

17.
When data for multiple outcomes are collected in a multilevel design, researchers can select a univariate or multivariate analysis to examine group-mean differences. When correlated outcomes are incomplete, a multivariate multilevel model (MVMM) may provide greater power than univariate multilevel models (MLMs). For a two-group multilevel design with two correlated outcomes, a simulation study was conducted to compare the performance of MVMM to MLMs. The results showed that MVMM and MLM performed similarly when data were complete or missing completely at random. However, when outcome data were missing at random, MVMM continued to provide unbiased estimates, whereas MLM produced grossly biased estimates and severely inflated Type I error rates. As such, this study provides further support for using MVMM rather than univariate analyses, particularly when outcome data are incomplete.

18.
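A statistical model based on orthogonal balanced block designs (or generalized orthogonal arrays) is defined, least-squares estimates of the model parameters are given, and the effects of factor levels are studied. Finally, an application example of generalized orthogonal arrays is given.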

19.
An important concern when planning research studies is to obtain maximum precision of an estimate of a treatment effect given a budget constraint. When research designs have a multilevel or hierarchical structure, changes in sample size at different levels of the design will impact precision differently. Furthermore, there will typically be differential costs of enrolling additional units at different levels of the hierarchy. The optimal design problem in multilevel research studies involves determining the optimal sample size at each level of the design given specified design parameters and a specified marginal cost of recruitment at each level. The current work extends existing results by considering optimal design for (a) unbalanced random assignment designs and (b) regression discontinuity designs.
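
For orientation, the sketch below shows the classic balanced two-level result that such work builds on, not the unbalanced or regression discontinuity extensions the abstract describes. The function names and cost figures are hypothetical. It assumes the variance of the treatment effect is proportional to (ρ + (1 - ρ)/n)/J and that the budget is spent as J(C_cluster + n·C_unit); minimizing the variance under that budget gives n* = sqrt((C_cluster/C_unit)(1 - ρ)/ρ).

```python
from math import sqrt


def optimal_cluster_size(cost_per_cluster, cost_per_unit, icc):
    """Cost-optimal number of individuals per cluster (balanced two-level case).

    Assumption: 0 < icc < 1 and linear costs at each level.
    """
    return sqrt((cost_per_cluster / cost_per_unit) * (1 - icc) / icc)


def affordable_clusters(budget, cost_per_cluster, cost_per_unit, cluster_size):
    """Number of clusters the budget can pay for at a given cluster size."""
    return budget / (cost_per_cluster + cluster_size * cost_per_unit)


def relative_variance(budget, cost_per_cluster, cost_per_unit, cluster_size, icc):
    """Variance of the treatment effect up to a constant factor."""
    j = affordable_clusters(budget, cost_per_cluster, cost_per_unit, cluster_size)
    return (icc + (1 - icc) / cluster_size) / j


if __name__ == "__main__":
    # Hypothetical costs: 300 per school, 10 per student, ICC = 0.10.
    n_star = optimal_cluster_size(300, 10, 0.10)
    print(round(n_star, 2), round(affordable_clusters(20000, 300, 10, n_star), 1))
```

The formula makes the trade-off explicit: expensive clusters or a low ICC favor larger clusters, while a high ICC favors spending the budget on more clusters.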

20.
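A statistical model based on orthogonal balanced block designs (or generalized orthogonal arrays) is defined, and quadratic-form statistics and the sum-of-squares decomposition of the data are given. On this basis, analysis of variance is studied. Finally, an application example of generalized orthogonal arrays is given.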
