Heterogeneous treatment effects (HTE) based on patients' genetic or clinical factors are of significant interest to precision medicine. Simultaneously modeling HTE and the corresponding main effects in randomized clinical trials with high-dimensional predictive markers is challenging. Motivated by the modified covariates approach, we propose a two-stage statistical learning procedure for estimating HTE with optimal efficiency augmentation, generalizing to arbitrary interaction models and exploiting powerful extreme gradient boosting trees (XGBoost). Target estimands for HTE are defined on the scale of the mean difference for quantitative outcomes, or the risk ratio for binary outcomes, and are the minimizers of specialized loss functions. The first stage estimates the main-effect equivalency of the baseline markers on the outcome, which is then used as an augmentation term in the second-stage estimation of HTE. The proposed two-stage procedure is robust to mis-specification of the main-effect model and improves efficiency for estimating HTE through nonparametric function estimation, e.g., XGBoost. A permutation test is proposed for global assessment of evidence for HTE. An analysis of a genetic study in the Prostate Cancer Prevention Trial, led by the SWOG Cancer Research Network, is conducted to showcase the properties and utility of the two-stage method.
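The two-stage idea can be illustrated with a minimal sketch for a quantitative outcome. It rests on several assumptions that are not taken from the paper: treatment is coded as +1/-1 with 1:1 randomization, the loss is squared error, and ordinary least squares is swapped in for XGBoost in both stages to keep the example dependency-free. The names `fit_two_stage_hte` and `predict_hte` are hypothetical.

```python
import numpy as np


def fit_two_stage_hte(X, y, t):
    """Two-stage modified-covariates sketch (illustrative only).

    X : (n, p) baseline markers
    y : (n,) quantitative outcome
    t : (n,) treatment coded +1 / -1, assumed 1:1 randomized
    Returns coefficients (intercept first) of a linear HTE model.
    """
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])

    # Stage 1: estimate the main effect of the markers on the outcome,
    # ignoring treatment. Least squares is a stand-in for XGBoost here.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    augmented_y = y - design @ beta  # outcome minus augmentation term

    # Stage 2: regress the augmented outcome on the modified covariates
    # (t / 2) * [1, X]; the fitted function estimates the mean difference
    # E[Y | T=+1, X] - E[Y | T=-1, X].
    modified = design * (t[:, None] / 2.0)
    gamma, *_ = np.linalg.lstsq(modified, augmented_y, rcond=None)
    return gamma


def predict_hte(gamma, X):
    """Predicted treatment effect (mean difference) for each row of X."""
    return gamma[0] + X @ gamma[1:]


# Simulated trial: nonlinear main effect, linear HTE driven by marker 2.
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))
t = rng.choice([-1.0, 1.0], size=n)
true_hte = 1.0 + 2.0 * X[:, 2]
y = 1.0 + X[:, 0] + X[:, 1] ** 2 + (t / 2.0) * true_hte + rng.normal(size=n)

gamma = fit_two_stage_hte(X, y, t)
print("estimated HTE coefficients:", np.round(gamma, 2))
```

In this simulation the stage-one model is deliberately mis-specified (it misses the quadratic term), yet the second stage still targets the treatment-by-marker interaction because randomization makes the modified covariates uncorrelated with any function of the markers alone. A closer stage-one fit would only reduce the variance of the second-stage estimate.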