Augmenting the control arm of a randomized controlled trial (RCT) with external data may increase power at the risk of introducing bias. Existing data fusion estimators generally rely on stringent assumptions or may have decreased coverage or power in the presence of bias. We frame the problem as one of data-adaptive experiment selection, in which the candidate experiments are the RCT alone or the RCT combined with one of several candidate real-world datasets. To select and analyze the experiment with the optimal bias-variance tradeoff, we develop a novel experiment-selector cross-validated targeted maximum likelihood estimator (ES-CVTMLE). The ES-CVTMLE uses two bias estimates: (1) a function of the difference in conditional mean outcome under control between the RCT and combined experiments and (2) an estimate of the average treatment effect on a negative control outcome (NCO). We define the asymptotic distribution of the ES-CVTMLE under varying magnitudes of bias and construct confidence intervals by Monte Carlo simulation. In simulations involving violations of identification assumptions, the ES-CVTMLE had better coverage than test-then-pool approaches and an NCO-based bias adjustment approach, and higher power than one implementation of a Bayesian dynamic borrowing approach. We further demonstrate the ability of the ES-CVTMLE to distinguish biased from unbiased external controls through a re-analysis of the effect of liraglutide on glycemic control from the LEADER trial. The ES-CVTMLE has the potential to improve power while providing relatively robust inference for future hybrid studies that combine RCT and real-world data (RWD).
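
As an illustration of the selection idea described above, the following minimal sketch picks, from a set of candidate experiments, the one with the smallest estimated squared bias plus variance. It is not the ES-CVTMLE itself, which additionally relies on cross-validation, targeted maximum likelihood estimation, and Monte Carlo confidence intervals. The names `Candidate`, `estimated_mse`, and `select_experiment`, the use of squared bias plus variance as the criterion, and all numeric values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """A candidate experiment with hypothetical bias and variance estimates."""
    name: str
    bias_estimate: float      # estimated bias of the effect estimate
    variance_estimate: float  # estimated variance of the effect estimate


def estimated_mse(candidate: Candidate) -> float:
    """Assumed selection criterion: squared bias estimate plus variance estimate."""
    return candidate.bias_estimate ** 2 + candidate.variance_estimate


def select_experiment(candidates: Sequence[Candidate]) -> Candidate:
    """Return the candidate with the smallest estimated squared bias plus variance."""
    if not candidates:
        raise ValueError("at least one candidate experiment is required")
    return min(candidates, key=estimated_mse)


# Hypothetical values, for illustration only.
candidates = [
    Candidate("RCT only", bias_estimate=0.0, variance_estimate=0.04),
    Candidate("RCT + dataset A", bias_estimate=0.05, variance_estimate=0.02),
    Candidate("RCT + dataset B", bias_estimate=0.30, variance_estimate=0.015),
]

selected = select_experiment(candidates)
print(selected.name)
```

With these made-up numbers, dataset A adds a small estimated bias but halves the variance, so the combined experiment is preferred; dataset B's larger estimated bias outweighs its variance reduction, so it is not selected.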