Bayesian Posterior Interval Calibration to Improve the Interpretability of Observational Studies

Observational healthcare data offer the potential to estimate causal effects of medical products on a large scale. However, the confidence intervals and p-values produced by observational studies account only for random error and fail to account for systematic error. As a consequence, operating characteristics such as confidence interval coverage and Type I error rates often deviate sharply from their nominal values, rendering interpretation impossible. While there is longstanding awareness of systematic error in observational studies, analytic approaches that empirically account for it are relatively new. Several authors have proposed approaches using negative controls (also known as "falsification hypotheses") and positive controls. The basic idea is to adjust confidence intervals and p-values in light of the bias (if any) detected in the analyses of the negative and positive controls. In this work, we propose a Bayesian statistical procedure for posterior interval calibration that uses negative and positive controls. We show that the posterior interval calibration procedure restores nominal characteristics, such as 95% coverage of the true effect size by the 95% posterior interval.
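
To make the basic idea of control-based adjustment concrete, the following is a minimal sketch and not the Bayesian procedure proposed in this work. It assumes a deliberately simplified systematic error model in which the bias on the log effect scale is Normal(mu, tau^2) and does not depend on the true effect size, so only negative controls (true log effect of 0) are needed to fit it. The function names `fit_systematic_error` and `calibrated_interval`, the method-of-moments fit, and the example numbers are all hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist, mean


def fit_systematic_error(nc_log_estimates, nc_std_errors):
    """Fit bias ~ Normal(mu, tau^2) from negative-control estimates.

    Assumption: each negative control has a true log effect of 0, so its
    estimate equals systematic error (bias) plus random sampling error.
    This is a simple method-of-moments fit, not a full Bayesian model.
    """
    n = len(nc_log_estimates)
    mu = mean(nc_log_estimates)
    # Total spread of the negative-control estimates around their mean.
    total_var = sum((x - mu) ** 2 for x in nc_log_estimates) / (n - 1)
    # Subtract the average sampling variance to isolate the bias variance.
    mean_sampling_var = mean(se ** 2 for se in nc_std_errors)
    tau2 = max(total_var - mean_sampling_var, 0.0)
    return mu, math.sqrt(tau2)


def calibrated_interval(log_estimate, std_error, mu, tau, level=0.95):
    """Shift the estimate by the mean bias and widen it by the bias spread."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    center = log_estimate - mu
    half_width = z * math.sqrt(std_error ** 2 + tau ** 2)
    return center - half_width, center + half_width


# Hypothetical negative-control log effect estimates and standard errors.
nc_estimates = [0.1, 0.3, 0.2, 0.4, 0.0]
nc_ses = [0.1, 0.1, 0.1, 0.1, 0.1]
mu, tau = fit_systematic_error(nc_estimates, nc_ses)

# Hypothetical estimate of interest: log effect 0.5 with standard error 0.1.
lo, hi = calibrated_interval(0.5, 0.1, mu, tau)
print(f"mean bias={mu:.3f}, bias sd={tau:.3f}, calibrated 95% interval=({lo:.3f}, {hi:.3f})")
```

In this illustration the uncalibrated 95% interval excludes a log effect of 0, while the calibrated interval is both shifted and wider and includes it. Positive controls would additionally allow the bias to vary with the true effect size, which this sketch does not model.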
