Assessing the effectiveness of empirical calibration under different bias scenarios

Background: Estimates of causal effects from observational data are subject to various sources of bias. These biases can be adjusted for by using negative control outcomes, which are not affected by the treatment. The empirical calibration procedure uses negative controls to calibrate p-values, and both negative and positive controls to calibrate the coverage of the 95% confidence interval of the outcome of interest. Although empirical calibration has been used in several large observational studies, its effect under different bias scenarios has not been systematically examined.

Methods: The effect of empirical calibration of confidence intervals was analyzed using simulated datasets with known treatment effects. The simulations used a binary treatment and a binary outcome, with biases arising from unmeasured confounding, model misspecification, measurement error, and lack of positivity. The performance of empirical calibration was evaluated by the change in confidence interval coverage and in the bias of the estimate for the outcome of interest.

Results: Empirical calibration increased the coverage of the 95% confidence interval for the outcome of interest under most settings, but was inconsistent in adjusting the bias of the estimate for the outcome of interest. Empirical calibration was most effective when adjusting for unmeasured confounding bias. The suitability of the negative controls had a large impact on the adjustment made by empirical calibration, but small improvements in coverage were also observed when unsuitable negative controls were used.

Conclusions: This work adds evidence for the efficacy of empirical calibration in calibrating the confidence intervals of treatment effects in observational studies. We recommend empirical calibration of confidence intervals, especially when there is a risk of unmeasured confounding.
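The calibration idea described in the Background can be illustrated with a minimal sketch. It is not the procedure used in the study: it assumes effect estimates on the log scale, fits a Gaussian systematic-error distribution to the negative controls with a simple method-of-moments estimate (rather than a likelihood-based fit), and, because it uses no positive controls, assumes the systematic error does not depend on the true effect size. All function names and data values below are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def fit_systematic_error(nc_estimates, nc_std_errors):
    """Method-of-moments fit of a Gaussian systematic-error distribution.

    nc_estimates: log effect estimates for negative controls (true log effect = 0).
    nc_std_errors: standard errors of those estimates.
    Returns (mu, sigma): mean and standard deviation of the systematic error.
    """
    mu = mean(nc_estimates)
    # Observed spread = systematic variance + average sampling variance,
    # so subtract the sampling part and floor the result at zero.
    sampling_var = mean(se ** 2 for se in nc_std_errors)
    sigma2 = max(0.0, variance(nc_estimates) - sampling_var)
    return mu, math.sqrt(sigma2)


def calibrated_p_value(estimate, std_error, mu, sigma):
    """Two-sided p-value against the empirical null N(mu, sigma^2 + se^2)."""
    total_sd = math.sqrt(sigma ** 2 + std_error ** 2)
    z = (estimate - mu) / total_sd
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def calibrated_interval(estimate, std_error, mu, sigma, level=0.95):
    """Interval shifted by mu and widened by sigma (assumes constant systematic error)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    total_sd = math.sqrt(sigma ** 2 + std_error ** 2)
    center = estimate - mu
    return center - z * total_sd, center + z * total_sd


# Hypothetical negative-control results (log scale); invented for illustration only.
nc_estimates = [0.30, 0.10, 0.25, 0.40, 0.15, 0.35]
nc_std_errors = [0.10] * 6
mu, sigma = fit_systematic_error(nc_estimates, nc_std_errors)

# Hypothetical estimate for the outcome of interest.
estimate, std_error = 0.30, 0.10
p_uncal = 2.0 * (1.0 - NormalDist().cdf(abs(estimate / std_error)))
p_cal = calibrated_p_value(estimate, std_error, mu, sigma)
lo, hi = calibrated_interval(estimate, std_error, mu, sigma)
print(f"uncalibrated p={p_uncal:.4f}, calibrated p={p_cal:.4f}")
print(f"calibrated 95% interval: ({lo:.3f}, {hi:.3f})")
```

In this invented example the negative controls are shifted away from zero, so the calibrated p-value is larger than the uncalibrated one and the calibrated interval is wider than the nominal interval.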
