In cluster randomized trials, patients are recruited after clusters are randomized, and the recruiters and patients may not be blinded to the assignment. This often leads to differential recruitment and systematic differences in baseline characteristics of the recruited patients between intervention and control arms, inducing post-randomization selection bias. We aim to rigorously define causal estimands in the presence of selection bias. We elucidate the conditions under which standard covariate adjustment methods can validly estimate these estimands. We further discuss the additional data and assumptions necessary for estimating causal effects when such conditions are not met. Adopting the principal stratification framework in causal inference, we clarify that there are two average treatment effect (ATE) estimands in cluster randomized trials: one for the overall population and one for the recruited population. We derive analytical formulas for the two estimands in terms of principal-stratum-specific causal effects. Further, using simulation studies, we assess the empirical performance of the multivariable regression adjustment method under different data generating processes leading to selection bias. When treatment effects are heterogeneous across principal strata, the ATE on the overall population generally differs from the ATE on the recruited population. A naive intention-to-treat analysis of the recruited sample leads to biased estimates of both ATEs. In the presence of post-randomization selection and without additional data on the non-recruited subjects, the ATE on the recruited population is estimable only when the treatment effects are homogeneous across principal strata, and the ATE on the overall population is generally not estimable. The extent to which covariate adjustment can remove selection bias depends on the degree of effect heterogeneity across principal strata.
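The distinction between the two ATEs and the naive intention-to-treat contrast can be illustrated with a small simulation. The sketch below is illustrative only and is not the data generating process used in the paper: the three strata, their shares, baselines and effects, the cluster-level noise, and the function name `simulate` are all assumptions made for this example. It also assumes that the "recruited population" means the patients actually recruited under the realized assignment, and that no patient would be recruited only under control.

```python
import random
from statistics import mean

# Hypothetical principal strata, labelled by (R(1), R(0)): whether a patient
# would be recruited if the cluster were assigned to intervention / control.
# All shares, baselines and effects are illustrative assumptions.
STRATA = {
    "always": {"share": 0.4, "recruit": (1, 1), "baseline": 0.0, "effect": 1.0},
    "intervention_only": {"share": 0.2, "recruit": (1, 0), "baseline": 2.0, "effect": 3.0},
    "never": {"share": 0.4, "recruit": (0, 0), "baseline": 0.0, "effect": 0.0},
}


def simulate(n_clusters=200, cluster_size=50, seed=0):
    rng = random.Random(seed)
    names = list(STRATA)
    shares = [STRATA[name]["share"] for name in names]

    # Randomize at the cluster level: half of the clusters receive z = 1.
    arms = [1] * (n_clusters // 2) + [0] * (n_clusters - n_clusters // 2)
    rng.shuffle(arms)

    rows = []  # one tuple (z, recruited, y0, y1) per patient
    for z in arms:
        cluster_effect = rng.gauss(0.0, 0.2)  # shared within a cluster
        for _ in range(cluster_size):
            s = STRATA[rng.choices(names, weights=shares)[0]]
            y0 = s["baseline"] + cluster_effect + rng.gauss(0.0, 1.0)
            y1 = y0 + s["effect"]
            r1, r0 = s["recruit"]
            recruited = r1 if z == 1 else r0  # recruitment depends on the arm
            rows.append((z, recruited, y0, y1))

    # True estimands, computable here only because both potential outcomes
    # are simulated; in a real trial they are never jointly observed.
    ate_overall = mean(y1 - y0 for _, _, y0, y1 in rows)
    ate_recruited = mean(y1 - y0 for _, r, y0, y1 in rows if r == 1)

    # Naive ITT: difference in observed means among recruited patients only.
    treated = [y1 for z, r, _, y1 in rows if r == 1 and z == 1]
    control = [y0 for z, r, y0, _ in rows if r == 1 and z == 0]
    naive_itt = mean(treated) - mean(control)

    return {
        "ate_overall": ate_overall,
        "ate_recruited": ate_recruited,
        "naive_itt": naive_itt,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.2f}")
```

Under these assumed parameters the three quantities have different population values: about 1.0 for the overall ATE, about 1.4 for the ATE on the recruited population, and about 2.33 for the naive contrast. The naive contrast is inflated because the intervention arm recruits an extra stratum with a higher baseline outcome and a larger effect. Setting every `effect` to the same value makes the two ATEs coincide, while the naive contrast stays biased as long as the baselines differ across strata.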