Sensitivity analysis for clustered data: An illustration from a large-scale cluster randomized trial in education
|Article code||Publication year||English article||Persian translation||Word count|
|27272||2014||10-page PDF||available on order||9485 words|
Publisher: Elsevier - Science Direct
Journal: Evaluation and Program Planning, Available online 12 July 2014
In this paper, we demonstrate the importance of conducting well-thought-out sensitivity analyses for handling clustered data (data in which individuals are grouped into higher order units, such as students in schools) that arise from cluster randomized controlled trials (RCTs). This is particularly relevant given the rise in rigorous impact evaluations that use cluster randomized designs across various fields including education, public health and social welfare. Using data from a recently completed cluster RCT of a school-based teacher professional development program, we demonstrate our use of four commonly applied methods for analyzing clustered data. These methods include: (1) Hierarchical Linear Modeling (HLM); (2) Feasible Generalized Least Squares (FGLS); (3) Generalized Estimating Equations (GEE); and (4) Ordinary Least Squares (OLS) regression with cluster-robust (Huber-White) standard errors. We compare our findings across the four methods, showing how inconsistent results (in terms of both effect sizes and statistical significance) emerged, and we describe our analytic approach to resolving such inconsistencies.
Cluster randomized controlled trials (RCTs) have become an increasingly popular way to evaluate the impact of interventions which are applicable to intact groups of individuals. Common examples include schools that are randomly assigned to offer their students an educational intervention. Similarly, there are studies in which clinics are randomized to offer a particular treatment to an intact group of patients they serve. One notable feature of such trials is that individuals (e.g., students or patients) are clustered together in higher level units (e.g., schools or clinics), with the higher level unit serving as the unit of randomization. Evaluators who analyze data from clustered RCTs must select from a variety of methods that appropriately account for the correlation between study participants within the higher level units. Ignoring such correlation, especially when the correlation between individuals within clusters is relatively high (as captured by the intra-class correlation coefficient (ICC)), may lead to erroneous inferences due to downward biased standard errors (Garson, 2012; Hox, 2010; Liang & Zeger, 1993; Zyzanski et al., 2004). For evaluation analysts, deciding which method to use when analyzing clustered data is not an exact science. Often, the choice depends upon a combination of factors including analysts' professional judgment and their prior quantitative training. The choice of method is also driven by the methodological conventions and traditions of the disciplinary field (e.g., public health, education, etc.) in which the evaluation is conducted. However, one overarching principle is that analysts are entrusted to choose the most appropriate approach among various data analytic methods, prior to conducting analyses, based on their prior assessment of the design and data limitations. This prevents researchers from selecting, or being suspected of selecting, a particular analytic method to influence the results.
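The downward bias in naive standard errors described above can be made concrete with a small simulation (ours, not the authors'; all names and numbers are illustrative). We simulate students nested in schools with treatment assigned at the school level, then compare naive OLS standard errors with cluster-robust (Huber-White) standard errors:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_students = 40, 25

school = np.repeat(np.arange(n_schools), n_students)
# Treatment assigned at the school level, as in a cluster RCT
treat = np.repeat(rng.integers(0, 2, size=n_schools), n_students)
u = np.repeat(rng.normal(0, 1.0, size=n_schools), n_students)  # school effect
e = rng.normal(0, 2.0, size=n_schools * n_students)            # student-level noise
y = 0.3 * treat + u + e
df = pd.DataFrame({"y": y, "treat": treat, "school": school})

# True ICC of the simulated data: sigma_u^2 / (sigma_u^2 + sigma_e^2) = 1 / (1 + 4)
icc = 1.0 / (1.0 + 4.0)

naive = smf.ols("y ~ treat", df).fit()                      # ignores clustering
robust = smf.ols("y ~ treat", df).fit(
    cov_type="cluster", cov_kwds={"groups": df["school"]})  # cluster-robust

print(f"ICC = {icc:.2f}")
print(f"naive SE(treat)  = {naive.bse['treat']:.3f}")
print(f"robust SE(treat) = {robust.bse['treat']:.3f}")  # substantially larger
```

With an ICC of 0.2 and 25 students per school, the design effect is roughly 1 + (25 - 1)(0.2) = 5.8, so the naive standard error understates the true uncertainty by a factor of about 2.4.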
Yet, when accounting for clustering, analysts often rely only upon one preferred methodological approach without considering whether and how the results remain consistent across different methods. Carrying out analyses using different methods and checking for consistency in results across such methods is one class of a broader set of sensitivity analyses (Thabane et al., 2013) which analysts often undertake. We believe that well-thought-out sensitivity analyses to handle clustered data and the transparent reporting of such analyses are important, particularly as different methods can, and as we show in our case do, lead to discrepant findings. When conflicting findings emerge across different methodological approaches, we believe that evaluation analysts must then proceed to understand the conflicting results, plan alternate analyses to reconcile such findings, and carefully document those alternative approaches. Finally, analysts should be transparent in communicating their analytic decisions to their evaluation audience. In this paper, we review our results from a recently completed cluster randomized trial of a teacher professional development program. We compare our results across four methods we used to account for clustering in our data: (1) Hierarchical Linear Modeling (HLM); (2) Feasible Generalized Least Squares (FGLS); (3) Generalized Estimating Equations (GEE); and (4) Ordinary Least Squares (OLS) regression with cluster-robust (Huber-White) standard errors. Importantly, we show how inconsistent results emerged across these different methods and describe our approach to resolving the inconsistencies. We present and discuss our work primarily from an applied point of view, forgoing technical descriptions of the methods we have employed (with the exception of the statistical model we present for our main analytic approach using HLM).
We do assume, however, that readers have basic familiarity with statistical concepts and the analytic issues that arise due to clustered data. We structure the rest of our paper in five sections. In Section 2, we briefly review cluster randomized controlled trials and introduce the concept of the intra-class correlation coefficient (ICC). The ICC is a key quantitative measure capturing the extent to which individuals are correlated within an intact group. We also discuss sensitivity analyses for clustered data, methods for handling clustered data and prior empirical studies that have compared methods for clustered data. Next, in Section 3, we describe our research design, providing background about our study intervention, the site and sample as well as our data and measures. In Section 4, we describe our primary analytic method along with our selected alternative methods. Then, in Section 5, we present results from the four analytic approaches we used to analyze our data, discussing the inconsistencies that emerged across the methods and the ways in which we reconciled those inconsistent results. Finally, in Section 6, we close with several substantive "lessons learned" from our work, providing advice to evaluation analysts who face the task of analyzing clustered data.
Conclusion
Our work highlights the importance of verifying the results of cluster RCTs, particularly when data are clustered and subject to heterogeneity across sites. Although well-known data analytic methods are commonly used when analyzing data from cluster RCTs in education (e.g., HLM), more often than not there is no single "correct" estimation method, and analytic decisions depend primarily on the judgment of researchers. It is plausible that these decisions are made based solely upon the preferences of the researchers. In such cases, thorough sensitivity analyses are particularly critical in order to verify the results. In addition to the main lesson we learned about the value of verifying our results across alternative methods for handling clustered data, we learned several additional lessons:
1. Methodological Bridging. Conducting sensitivity analyses that incorporate different methodological approaches often requires what we call methodological bridging. Particular methods are discipline-specific, so it is important to look broadly at other disciplinary areas to understand how they handle similar methodological issues. Not only can methods to handle similar issues, such as clustering, differ, but the methodological terminology can vary as well, so it is important to build bridges with analysts who are trained in evaluation but grounded in different disciplines ranging from economics and statistics to, more broadly, the social sciences (e.g., public health, education and public policy).
2. Selection of Analytic Methods for Clustering. Based on our review of studies that compare ways to handle clustered data, as well as our own empirical findings, we conclude that there is no one "right" way to handle clustering. Beyond our basic recommendation that analysts should account for clustering when analyzing data from clustered RCTs, we also advise analysts to carefully consider the tradeoffs in analyzing clustered data. There are important practical considerations (e.g., the number of levels in the data) as well as distributional assumptions of the data.
3. Weighted Average Approaches. If there are multiple sites in a study, analysts may want to consider estimating program effects separately for each site as an exploratory step. In our case, this was helpful because it revealed cross-site heterogeneity. If there are differences in effects across sites, as we discovered in our study, one option analysts may want to consider is a weighted average approach, as described above.
4. Transparent Reporting of Methods and Results. Finally, and most importantly, we highly recommend that analysts develop a priori a clear plan for analyzing clustered data as part of their study protocols, including a description of the alternative approaches they will undertake. Analysts should also clearly document and report their analytic methods and findings across methods. Doing so ensures that analysts carry out their analytic work both thoughtfully and responsibly, preserving the overall integrity and rigor of the study. As we mentioned, in our study we were concerned that deviating from our a priori specified data analytic plan would affect the face validity of the study; as such, we clearly documented the procedures and results of our analytic decisions in order to be fully transparent about how we obtained our final impact findings.
Our paper's contribution to the extant evaluation literature is to raise awareness of the different methodological approaches to handling clustered data, the need to verify results across methods and, importantly, the need to document and supply information on the data analytic decision process when results of those sensitivity analyses are inconsistent. In sum, we believe that analysts should strive to become much more transparent and rigorous in their use, discussion and reporting of sensitivity analyses for clustered data arising from cluster RCTs.
Doing so can greatly enhance the credibility and robustness of findings from impact evaluations that rely on cluster RCTs.
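As a closing illustration of the weighted-average idea in lesson 3, one common pooling scheme is an inverse-variance (precision) weighted average of site-specific effect estimates. This is a generic sketch, not the authors' procedure, and the site estimates and standard errors below are purely hypothetical numbers:

```python
import numpy as np

# Per-site effect estimates and their standard errors (hypothetical values)
effects = np.array([0.12, 0.31, -0.05])
ses = np.array([0.10, 0.15, 0.08])

w = 1 / ses**2                            # precision (inverse-variance) weights
pooled = np.sum(w * effects) / np.sum(w)  # precision-weighted pooled estimate
pooled_se = np.sqrt(1 / np.sum(w))        # standard error of the pooled estimate

print(f"pooled effect = {pooled:.4f} (SE = {pooled_se:.4f})")
```

Sites estimated more precisely (smaller standard errors) receive proportionally more weight, so the pooled estimate is pulled toward the best-measured sites rather than treating every site equally.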