Similar articles
20 similar articles found.
1.
A two-cutoff regression discontinuity design (RDD) was used to assign 321 students in grades 1 through 6 at a Title I elementary school to two types of Tier 2 reading interventions administered by paraeducators: (a) direct instruction (DI) and (b) computer-assisted instruction (CAI). Students scoring at or below a lower cutoff pretest score were assigned to the DI reading intervention. Students scoring between the lower cutoff score and an upper cutoff score on the pretest were assigned to a CAI reading intervention. Student reading ability was reassessed in January and May. Results indicated that the DI intervention was significantly more effective than the CAI interventions at the lower cutoff (p < .01). No significant treatment effect was detected at the upper cutoff, but the estimation power of the design at this cutoff was limited to medium-to-large effect size. Findings suggest that the DI intervention was superior to the CAI interventions for at-risk readers. Implications for practice, including fidelity of paraeducator implementation, are discussed.
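
The assignment rule described in this abstract can be sketched in a few lines. This is only an illustration: the function name, the cutoff values, the label for students above the upper cutoff, and the choice to treat the upper cutoff as exclusive are all assumptions, since the abstract does not state them.

```python
def assign_condition(pretest: float, lower_cutoff: float, upper_cutoff: float) -> str:
    """Assign a student to a condition using only the pretest score.

    Hypothetical helper: the abstract gives the rule, not the cutoff values.
    """
    if pretest <= lower_cutoff:
        # At or below the lower cutoff: direct instruction.
        return "DI"
    if pretest < upper_cutoff:
        # Between the cutoffs: computer-assisted instruction.
        # Whether the upper cutoff itself is included is an assumption.
        return "CAI"
    # Above the upper cutoff: assumed to receive no Tier 2 intervention.
    return "none"


# Example with made-up cutoffs of 20 and 40.
conditions = [assign_condition(s, 20, 40) for s in (12, 20, 31, 55)]
```

Because assignment depends only on the pretest score, students just on either side of each cutoff can be compared, which is what gives the design one effect estimate per cutoff.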

2.
This study examines the impact of grade retention on behavioral outcomes under a comprehensive assessment-based student promotion policy in New York City. To isolate the causal effect of grade retention, we implement a fuzzy regression discontinuity (RD) design that exploits the fact that grade retention is largely determined by whether a student scores below a cutoff on a standardized test score. We use data on students subject to the policy over a nine-year span to examine impacts on attendance and disciplinary event outcomes. We do not find evidence of systematic effects of retention on behavioral outcomes in either direction. We do find sporadic nonsustained significant effects of retention on behavioral outcomes. When present, these isolated nonpersistent effects tend to be beneficial when found for retained elementary school students and mixed for retained middle school students.
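
To make the "fuzzy" idea concrete, here is a minimal sketch of a Wald-type estimate: the jump in the mean outcome at the cutoff divided by the jump in the treatment rate. It is not the estimator used in the study, which the abstract does not specify. It compares simple means within an assumed bandwidth, and the function name, arguments, and toy data are hypothetical.

```python
from statistics import mean


def fuzzy_rd_wald(scores, treated, outcomes, cutoff, bandwidth):
    """Crude fuzzy RD estimate from simple means on each side of the cutoff.

    Units scoring below the cutoff are the ones more likely to be treated,
    matching the retention setting in the abstract.
    """
    below = [i for i, s in enumerate(scores) if cutoff - bandwidth <= s < cutoff]
    above = [i for i, s in enumerate(scores) if cutoff <= s <= cutoff + bandwidth]
    if not below or not above:
        raise ValueError("no observations on one side of the cutoff")
    # Jump in the outcome and in the treatment rate across the cutoff.
    jump_outcome = mean(outcomes[i] for i in below) - mean(outcomes[i] for i in above)
    jump_treated = mean(treated[i] for i in below) - mean(treated[i] for i in above)
    if jump_treated == 0:
        raise ValueError("treatment rate does not change at the cutoff")
    return jump_outcome / jump_treated


# Toy data: 80% treated just below the cutoff, 20% just above,
# and a true treatment effect of 2 on the outcome.
scores = [49] * 10 + [51] * 10
treated = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
outcomes = [5 + 2 * d for d in treated]
estimate = fuzzy_rd_wald(scores, treated, outcomes, cutoff=50, bandwidth=5)
```

In practice, analysts typically fit local regressions on each side rather than raw means; the ratio structure is the part this sketch is meant to show.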

3.
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of treatment for a population and how different types of RD analyses—“sharp” versus “fuzzy”—can identify average treatment effects for different conceptual subpopulations. Part 3 of the article introduces graphical methods, parametric statistical methods, and nonparametric statistical methods for estimating treatment effects in practice from regression discontinuity data plus validation tests and robustness tests for assessing these estimates. Part 4 considers generalizing RD findings and presents several different views on and approaches to the issue. Part 5 notes some important issues to pursue in future research about or applications of RD analysis.

4.
Value-added (VA) measures are currently the predominant approach used to compare the effectiveness of schools. Recent educational effectiveness research, however, has developed alternative approaches including the regression discontinuity (RD) design, which also allows estimation of absolute school effects. Initial research suggests RD is a viable approach to measuring school effectiveness. The present study builds on this pioneering work by using RD and VA designs to estimate school effects at system and school level, comparing estimates from several measurement designs. The study uses a large English dataset (N = 148,135) spanning 342 schools, 10 local authorities, 6 consecutive school year groups (UK Years 3–9) across 3 years. RD is found to be a suitable approach for system-level absolute school effect estimates. Cross-sectional and longitudinal measures are found to lead to markedly different estimates when comparing individual schools. The results also reinforce the need to treat measures based on a single cohort with extreme caution.

5.
We aimed to compare the findings of three research designs to bracket effect estimates of a strongly worded warning letter delivered by certified mail to students on academic probation.

We embedded an experiment within a regression discontinuity design and calculated two achievement estimates, average GPA and percentage of students remaining on probation. Study participants attended a large Midwestern college. Cohen's d experimental effect size was .45. Regression discontinuity design results were validated by our experimental evidence, and outcome measures were generally statistically significant. We provided additional supportive evidence using comparative RD control group design logic. Regression point displacement design results were successfully replicated using a within-study comparison inside the experiment. In the context of probation, a diverse design, replicative approach provided considerable promise for more precise estimation of intervention effectiveness. We found no deleterious impact on reenrollment and concluded that the certified letter represents an inexpensive probation policy.

6.
As part of the American Recovery and Reinvestment Act of 2009, the School Improvement Grants (SIG) program received more than $3 billion with the goal of substantially improving student achievement. The SIG program’s funding was to help states turn around the nation’s persistently lowest-achieving schools using one of four school intervention models—transformation, turnaround, restart, or closure. We used a regression discontinuity design and a large sample of schools from more than 20 states to evaluate the impact of implementing a SIG-funded intervention model on use of SIG-promoted practices and student outcomes. Our regression discontinuity design exploited cutoff values on the continuous variables used to define the SIG eligibility tiers to compare outcomes in schools that just met the eligibility cutoff for receiving SIG funding to outcomes in schools that just missed it. We found that SIG had no impact on any of the outcomes we examined, including math and reading test scores, high school graduation rates, and college enrollment rates. Using a correlational analysis, we found the turnaround model was associated with larger student achievement gains in math than the transformation model for grades 6 through 12.

7.
In 2006, North Carolina put in place high school exit standards requiring students to pass a series of high-stakes exams across several years. I use a regression discontinuity (RD) approach to analyze whether passing or failing one of these exams (Algebra I) impacts a student's decision between choosing a more rigorous college-preparatory math curriculum and an easier ‘career’ track math curriculum. I find a 5 percentage point gap in the probability of selecting the rigorous curriculum between 9th grade students who just passed and those who just failed the exam. RD results across two years (one year in which the graduation standards were not in place) suggest that the discontinuity arose due to fewer students opting into the college track as a result of the exam results.

8.
Teachers used a nonexclusionary time‐out package to treat inappropriate social behaviors that interfered with instructional tasks in four classrooms of preschool children, each containing an average of 16 students. We yoked classrooms on pretreatment scores of target behaviors and randomly assigned them to a time‐out ribbon plus praise or control condition. Students in the treatment condition wore wristbands with attached smiling face stickers and received praise from teachers for taking turns talking and remaining seated during group activities. Contingent on disruptive behavior, teachers removed the wristbands signaling a three‐minute time‐out period in which teachers restricted praise and attention but ongoing participation in activities continued. Multiple measures of treatment outcome were analyzed via MANCOVA and ANCOVAs. Behavior improved on direct observations of target behavior, the Preschool Behavior Questionnaire (PBQ) and items on a semantic differential, but not the Matson Evaluation of Social Skills with Youngsters. Treatment acceptability also supported treatment effects. We discuss implications of these results.

9.
Expanded instructional time has become increasingly popular as a strategy to improve the academic outcomes of low-skilled students, particularly in the 9th grade. We evaluate the efficacy of a double-period algebra policy initiated in the Chicago Public Schools in 2003. This policy required all students with 8th-grade test scores below the national median to enroll in a support algebra course in addition to regular algebra in the 9th grade. Using regression discontinuity combined with interrupted time series designs, and instrumental variable models, we show the effects of the policy on students' grades, failure rates and test scores in 9th-grade algebra and 10th-grade geometry. Providing support courses improved algebra test scores for the target population but only modestly affected grades and failure rates. Students with very low initial abilities benefited less than students close to the national median. The policy also led schools to track algebra classes by students' entering math skills. As a result, it affected academic outcomes among students not targeted by the policy; test scores among high-ability students improved whereas their grades declined.

10.
This paper and the accompanying tool are intended to complement existing supports for conducting power analysis by offering a tool based on the framework of Minimum Detectable Effect Size (MDES) formulae that can be used in determining sample size requirements and in estimating minimum detectable effect sizes for a range of individual- and group-random assignment design studies and for common quasi-experimental design studies. The paper and accompanying tool cover computation of minimum detectable effect sizes under the following study designs: individual random assignment designs, hierarchical random assignment designs (2-4 levels), block random assignment designs (2-4 levels), regression discontinuity designs (6 types), and short interrupted time-series designs. In each case, the discussion and accompanying tool consider the key factors associated with statistical power and minimum detectable effect sizes, including the level at which treatment occurs and the statistical models (e.g., fixed effect and random effect) used in the analysis. The tool also includes a module that estimates, for one- and two-level random assignment design studies, the minimum sample sizes required in order for studies to attain user-defined minimum detectable effect sizes.
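
For the simplest design in that list, individual random assignment, the MDES idea can be sketched as a multiplier times the standard error of the standardized effect estimate. The sketch below is not taken from the paper or its tool. It uses the textbook form of the standard error and a large-sample normal approximation for the multiplier (exact calculations use t distributions), and the function and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mdes_individual_ra(n, p_treated=0.5, r_squared=0.0, alpha=0.05, power=0.80):
    """Approximate MDES for a two-arm individual random assignment study.

    n          total sample size
    p_treated  proportion of the sample assigned to treatment
    r_squared  share of outcome variance explained by covariates
    """
    z = NormalDist()
    # Normal approximation to the two-tailed multiplier (about 2.8 by default).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Standard error of the effect estimate in standard deviation units.
    se = sqrt((1 - r_squared) / (p_treated * (1 - p_treated) * n))
    return multiplier * se


# 400 students, balanced arms, no covariates: MDES of roughly 0.28 SD.
example_mdes = mdes_individual_ra(400)
```

Adding a covariate that explains part of the outcome variance, or increasing n, lowers the MDES; moving away from a 50/50 split raises it.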

11.
This study uses a regression discontinuity design to evaluate the impact of seventh-grade interventions delivered by 25 schools on the reading and school engagement outcomes of struggling readers. Students in participating schools were assigned to intervention (n = 1,495) or comparison (n = 4,397) conditions, based on their state reading performance and reading fluency scores in sixth grade. On the state reading test, intervention impact favoring the treatment group approached, but did not reach, statistical significance (p = .056, g = .15). On a measure of reading fluency, there was a significant interaction between the cut points schools chose for condition assignment and intervention impact (p = .015). In schools that chose a lower cut point, intervention students outperformed comparison students, while in schools that chose a higher cut point, comparison students outperformed intervention students. On a student self-report measure of school engagement, no significant differences between the intervention and comparison groups were observed (p = .83, g = .02). Implications for rigorous evaluation of interventions implemented by districts and schools under naturalistic conditions are discussed.

12.
Researchers have explored a variety of topics related to identifying and distinguishing among specific types of rater effects, as well as the implications of different types of incomplete data collection designs for rater‐mediated assessments. In this study, we used simulated data to examine the sensitivity of latent trait model indicators of three rater effects (leniency, central tendency, and severity) in combination with different types of incomplete rating designs (systematic links, anchor performances, and spiral). We used the rating scale model and the partial credit model to calculate rater location estimates, standard errors of rater estimates, model–data fit statistics, and the standard deviation of rating scale category thresholds as indicators of rater effects and we explored the sensitivity of these indicators to rater effects under different conditions. Our results suggest that it is possible to detect rater effects when each of the three types of rating designs is used. However, there are differences in the sensitivity of each indicator related to type of rater effect, type of rating design, and the overall proportion of effect raters. We discuss implications for research and practice related to rater‐mediated assessments.

13.
Peer assessment can be conducted online with rapid development of online learning technology. The current study was conducted empirically to investigate peer rating accuracy and student learning outcomes in online peer assessments, comparing compulsory and voluntary peer assessment. Section 1 (N = 93) was assigned to the voluntary group and Section 2 (N = 31) was assigned to the compulsory group. The results showed the voluntary group scored significantly higher than the compulsory group in the final task of the course, while there was no significant difference on the final task score increase. Students who participated in the voluntary group provided more accurate scores (i.e. peer rater accuracy) than those who participated in the compulsory group. The peer score leniency/severity rating, comparing peer assigned scores with the teacher assigned scores, were generally consistent with the peer rater accuracy results. The current study offers insights for researchers who are interested in studying the effect of online peer assessment activities. The results are also of interest for instructors who may want to conduct peer assessments in online courses and are choosing between compulsory and voluntary formats.

14.
In the “individually randomized group treatment” (IRGT) experimental design, individuals are first randomly assigned to a treatment arm or a control arm, but then within each arm, are grouped together (e.g., within classrooms/schools, through shared case managers, in group therapy sessions, through shared doctors, etc.) to receive services. In the common setting where the second-stage grouping is not experimentally controlled, we demonstrate that the standard error estimators for the estimated treatment effects from both random and fixed effects models are biased because parameters required to compute the standard errors are not identifiable from the data. We discuss implications for designing and analyzing IRGTs and provide suggestions for mitigating the bias.

15.
I study the impact of remedial training for low-performing teachers in Chile. Taking advantage of the fact that assignment to remediation is mainly based on teacher evaluation scores, I use a fuzzy regression discontinuity design and find that teachers barely assigned to remediation improve their pedagogical practices as measured by their next evaluation scores. While there is suggestive evidence that these teachers’ students obtain higher standardized test scores after the training is complete, this result is not robust, and the suggestive positive impact disappears after one year. I also find that during the year of their teacher’s reevaluation, the students of teachers assigned to remedial training obtain significantly lower test scores. Teachers assigned to remediation report lower prestige and job satisfaction, suggesting that the stigma of being labeled as a low performer leads teachers to put more effort into preparing their teaching evaluations, causing a temporary drop in student learning.

16.
The authors examined the effects of positive interdependence vs. no interdependence on students' academic achievement. Participants included 151 U.S. college students who took weekly electronic quizzes on which they could interact with group mates in a chat room. In the positive interdependence condition, 1 member was chosen at random, and his or her score was given to all members of the group. In the no-interdependence condition, each group member received his or her own score on each quiz. Achievement was measured by biweekly examinations that students took by themselves and that produced their own independent scores. Students in the positive-interdependence condition engaged in significantly more interaction and more promotive interaction while taking the electronic quizzes and achieved higher scores on the subsequent examinations taken individually.

17.
The present study investigated the hypothesis that the higher prevalence of reading disability (RD) often observed among boys is partly an artifact of gender bias in the prediction of reading from IQ. The relevant regression statistics derived from a sample of more than 900 children revealed a statistically significant intercept bias. Predicted reading scores for boys were systematically overestimated, thereby inflating IQ-reading discrepancies; the converse was found for girls. When defined separately for girls and boys, severe underachievement in reading was found to be equally prevalent in both genders and, furthermore, was associated with qualitatively and quantitatively similar patterns of deficits. Because the bias arose from general differences between boys and girls in reading score distributions (a lower mean and greater variance for boys) rather than from differences in IQ scores, gender bias poses a potential threat not only to traditional IQ-discrepancy definitions but also to post-discrepancy definitions that are based solely on reading score cutoffs. Future classification criteria for RD need to take heed of the possibility that when the distributions of reading scores for boys and girls are not identical, performance cutoffs designating low achievement that are based on data pooled from both genders are likely to result in the overidentification of boys with RD and the underidentification of girls with RD.

18.
Experiments that involve nested structures often assign entire groups (such as schools) to treatment conditions. Key aspects of the design of such experiments include knowledge of the intraclass correlation structure and the sample sizes necessary to achieve adequate power to detect the treatment effect. This study provides methods for computing power in three-level cluster randomized balanced designs (with two levels of nesting), where, for example, students are nested within classrooms and classrooms are nested within schools and schools are assigned to treatments. The power computations take into account nesting effects at the second (classroom) and at the third (school) level, sample size effects (e.g., number of schools, classrooms, and individuals), and covariate effects (e.g., pretreatment measures). The methods are applicable to quasi-experimental studies that examine group differences in an outcome.

19.
Research by Huck and McLean (1975) demonstrated that the covariance‐adjusted score is more powerful than the simple difference score, yet recent reviews indicate researchers are equally likely to use either score type in two‐wave randomized experimental designs. A Monte Carlo simulation was conducted to examine the conditions under which the simple difference and covariance‐adjusted scores were more or less powerful to detect treatment effects when relaxing certain assumptions made by Huck and McLean (1975). Four factors were manipulated in the design including sample size, normality of the pretest and posttest distributions, the correlation between pretest and posttest, and posttest variance. A 5 × 5 × 4 × 3 mostly crossed design was run with 1,000 replications per condition, resulting in 226,000 unique samples. The gain score was nearly as powerful as the covariance‐adjusted score when pretest and posttest variances were equal, and as powerful in fan‐spread growth conditions; thus, under certain circumstances the gain score could be used in two‐wave randomized experimental designs.
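
The phrase "mostly crossed" can be checked against the reported numbers with a little arithmetic. The sketch below only infers how many cells were run from the figures in the abstract; which factor combinations were left out is not stated there.

```python
from math import prod

levels = (5, 5, 4, 3)        # levels of the four manipulated factors
replications = 1_000         # replications per condition
reported_samples = 226_000   # total samples reported in the abstract

full_cells = prod(levels)                      # cells in a fully crossed design
cells_run = reported_samples // replications   # cells implied by the sample count
cells_omitted = full_cells - cells_run         # combinations not run
```

A fully crossed design would have 300 cells and 300,000 samples; the reported 226,000 samples imply 226 cells, so 74 combinations were not run.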

20.
This follow-up study aimed at a better understanding of the associations of reading disability (RD) with adult-age psychosocial well-being. We compared adult-age psychosocial well-being in 48 individuals (20–39 years) with documented childhood RD but without comorbid disabilities to 37 matched controls. The associations of psychosocial well-being with childhood and adult-age reading fluency were studied in the RD group, controlling for IQ, gender and unemployment. Psychosocial well-being was assessed with commonly used self-report questionnaires. No group differences were found in psychosocial well-being. In the RD group, lower adult-age reading fluency was associated with symptoms of depression, lower self-esteem, and social functioning. Severity of childhood RD was not associated with psychosocial well-being. Thus, reading fluency problems continuing into adulthood appear to be related to adult-age psychosocial well-being. A more holistic approach to studying how RD impacts adult-age well-being is needed, including both the individual’s developmental history and current functioning in various domains.
