Clustering and dependence are common in trials. For example, in some cluster randomized trials (CRTs), pre-existing clusters are enrolled, randomized, and serve as the basis of intervention delivery. Such CRTs are "fully clustered": participants are dependent within clusters. In contrast, "partially clustered" trials contain a mix of participants who are dependent within clusters and participants who are completely independent. One example of this design is a trial where participants are artificially grouped together for the purposes of randomization only; then, for intervention participants, the groups are the basis for intervention delivery, while control participants are un-grouped. Another example is an individually randomized group treatment trial (IRGTT), where participants are individually randomized and, post-randomization, intervention participants are grouped for intervention delivery, while the control participants remain un-grouped. For these two partially clustered designs, we use causal models to non-parametrically describe the data generating process and formalize the observed data dependence structure. We show that, despite their different randomization approaches, both designs can be represented with the same dependence structure, enabling the use of the same statistical methods for estimation and inference of causal effects. We propose a novel implementation of targeted minimum loss-based estimation (TMLE) for these trials. TMLE is model-robust, leverages covariate adjustment and machine learning, and can estimate many causal effects. In simulations, TMLE achieved comparable or higher statistical power than alternatives for partially clustered designs. Finally, application to real data from the SEARCH-IPT trial resulted in 20-57% efficiency gains, demonstrating the consequences of our proposed approach.