Cluster-randomized experiments are increasingly used to evaluate interventions in routine practice conditions, and researchers often adopt model-based methods with covariate adjustment in the statistical analyses. However, the validity of model-based covariate adjustment is unclear when the working models are misspecified, leading to ambiguity of estimands and risk of bias. In this article, we first adapt two conventional model-based methods, generalized estimating equations and linear mixed models, with weighted g-computation to achieve robust inference for cluster-average and individual-average treatment effects. Furthermore, we propose an efficient estimator for each estimand that allows for flexible covariate adjustment and additionally addresses cluster size variation dependent on treatment assignment and other cluster characteristics. Such cluster size variations often occur post-randomization and, if ignored, can lead to bias of model-based estimators. For our proposed estimator, we prove that when the nuisance functions are consistently estimated by machine learning algorithms, the estimator is consistent, asymptotically normal, and efficient. When the nuisance functions are estimated via parametric working models, the estimator is triply-robust. Simulation studies and analyses of three real-world cluster-randomized experiments demonstrate that the proposed methods are superior to existing alternatives.
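To illustrate why the choice of estimand matters, the following minimal sketch contrasts the cluster-average and individual-average treatment effects on a toy finite population with known potential outcomes. It is not the estimator proposed in the article: the function names and the toy data are hypothetical, the definitions follow the common usage of the two terms (each cluster weighted equally versus each individual weighted equally), and cluster sizes are assumed to be identical under both arms, which sets aside the treatment-dependent cluster size variation the article addresses.

```python
from statistics import fmean


def cluster_average_effect(y1, y0):
    """Average of within-cluster mean contrasts; every cluster has equal weight.

    y1, y0: lists of per-cluster lists of potential outcomes under
    treatment and control (hypothetical toy inputs).
    """
    return fmean(fmean(a) - fmean(b) for a, b in zip(y1, y0))


def individual_average_effect(y1, y0):
    """Average contrast over all individuals; every individual has equal weight."""
    total_contrast = sum(sum(a) - sum(b) for a, b in zip(y1, y0))
    n_individuals = sum(len(a) for a in y1)
    return total_contrast / n_individuals


# Toy population (assumed): a small cluster with effect 1.0 and a
# large cluster with effect 3.0, so effect size is related to cluster size.
y0 = [[0.0] * 2, [0.0] * 8]
y1 = [[1.0] * 2, [3.0] * 8]

cate = cluster_average_effect(y1, y0)     # (1.0 + 3.0) / 2
iate = individual_average_effect(y1, y0)  # (2 * 1.0 + 8 * 3.0) / 10

print(f"cluster-average effect:    {cate:.2f}")
print(f"individual-average effect: {iate:.2f}")
```

In this toy setting the two estimands differ because the treatment effect varies with cluster size; when effects are unrelated to cluster size, the two quantities coincide. This is one reason an analysis should state which estimand it targets before choosing a working model.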