Covariate adjustment is an approach to improve the precision of trial analyses by adjusting for baseline variables that are prognostic of the primary endpoint. Motivated by the SEARCH Universal HIV Test-and-Treat Trial (2013-2017), we tell our story of developing, evaluating, and implementing a machine learning-based approach for covariate adjustment. We provide the rationale for, and the practical concerns with, using such an approach to estimate marginal effects. Using schematics, we illustrate our procedure: targeted machine learning estimation (TMLE) with Adaptive Pre-specification. Briefly, sample-splitting is used to data-adaptively select the combination of estimators of the outcome regression (i.e., the conditional expectation of the outcome given the trial arm and covariates) and known propensity score (i.e., the conditional probability of being randomized to the intervention given the covariates) that minimizes the cross-validated variance estimate and, thereby, maximizes empirical efficiency. We discuss our approach for evaluating finite sample performance with parametric and plasmode simulations, pre-specifying the Statistical Analysis Plan, and unblinding in real time on a video conference with our colleagues from around the world. We present the results from applying our approach in the primary, pre-specified analysis of 8 recently published trials (2022-2024). We conclude with practical recommendations and an invitation to implement our approach in the primary analysis of your next trial.
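To make the selection step concrete, the following is a minimal sketch on simulated data, not the procedure as implemented in the trials. It rests on several assumptions of ours: the candidate outcome regressions are linear models adjusting for at most one covariate, the propensity score is fixed at its known value of 0.5 rather than selected among candidate estimators, and an augmented inverse-probability-weighted (AIPW) estimator stands in for the TMLE targeting step. All function names, variable names, and the data-generating model are hypothetical.

```python
import numpy as np

def fit(A, W, Y):
    """Least-squares fit of Y on (intercept, arm, covariates); W may have 0 columns."""
    X = np.column_stack([np.ones(len(Y)), A, W])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta

def predict(beta, a, W):
    """Predicted outcome with the arm set to a for everyone."""
    n = W.shape[0]
    X = np.column_stack([np.ones(n), np.full(n, float(a)), W])
    return X @ beta

def aipw_terms(beta, A, W, Y, g=0.5):
    """Per-participant AIPW terms; their mean estimates the marginal effect."""
    q1, q0 = predict(beta, 1, W), predict(beta, 0, W)
    qa = np.where(A == 1, q1, q0)
    h = A / g - (1 - A) / (1 - g)  # known randomization probability g
    return q1 - q0 + h * (Y - qa)

def cv_variance(A, W, Y, folds, g=0.5):
    """Fit on each training split; estimate the variance on the held-out fold."""
    n = len(Y)
    fold_vars = []
    for val in folds:
        train = np.setdiff1d(np.arange(n), val)
        beta = fit(A[train], W[train], Y[train])
        fold_vars.append(np.var(aipw_terms(beta, A[val], W[val], Y[val], g)))
    return float(np.mean(fold_vars))

def select_and_estimate(A, W, Y, candidates, n_folds=5, seed=0):
    """Pick the candidate with the smallest cross-validated variance, then refit."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(Y)), n_folds)
    cols = {k: np.asarray(v, dtype=int) for k, v in candidates.items()}
    cv = {k: cv_variance(A, W[:, c], Y, folds) for k, c in cols.items()}
    best = min(cv, key=cv.get)
    Wb = W[:, cols[best]]
    d = aipw_terms(fit(A, Wb, Y), A, Wb, Y)
    return best, cv, float(d.mean()), float(d.std(ddof=1) / np.sqrt(len(Y)))

# Simulated trial (assumed model): W1 is prognostic, W2 is pure noise.
rng = np.random.default_rng(1)
n = 500
W = rng.normal(size=(n, 2))
A = rng.binomial(1, 0.5, size=n).astype(float)
Y = 1.0 + 0.5 * A + 2.0 * W[:, 0] + rng.normal(size=n)

candidates = {"unadjusted": [], "W1": [0], "W2": [1]}
best, cv, est, se = select_and_estimate(A, W, Y, candidates)
print("selected:", best)
print("effect estimate:", round(est, 3), "standard error:", round(se, 3))
```

In this simulation the prognostic covariate should be selected because adjusting for it shrinks the held-out variance estimate, whereas adjusting for the noise covariate offers no gain over the unadjusted estimator. Because the candidate set always includes the unadjusted estimator, the selection can fall back to it when no covariate helps.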