In epidemiology and social sciences, propensity score methods are popular for estimating treatment effects using observational data, and multiple imputation is popular for handling covariate missingness. However, how to appropriately use multiple imputation for propensity score analysis is not completely clear. This paper aims to bring clarity on the consistency (or lack thereof) of methods that have been proposed, focusing on the within approach (where the effect is estimated separately in each imputed dataset and then the multiple estimates are combined) and the across approach (where typically propensity scores are averaged across imputed datasets before being used for effect estimation). We show that the within method is valid and can be used with any causal effect estimator that is consistent in the full-data setting. Existing across methods are inconsistent, but a different across method that averages the inverse probability weights across imputed datasets is consistent for propensity score weighting. We also comment on methods that rely on imputing a function of the missing covariate rather than the covariate itself, including imputation of the propensity score and of the probability weight. Based on consistency results and practical flexibility, we recommend generally using the standard within method. Throughout, we provide intuition to make the results meaningful to the broad audience of applied researchers.
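To make the distinction between the approaches concrete, the following is a minimal sketch for propensity score weighting. It is an illustration under stated assumptions, not the implementation studied in the paper. The simulated data, the crude stochastic regression imputation (which omits parameter draws and is therefore not proper multiple imputation), the normalized weighting estimator, and all function names (`simulate`, `impute`, `within`, `across_ps`, `across_weights`) are hypothetical choices made for this example.

```python
import numpy as np

def expit(a):
    return 1.0 / (1.0 + np.exp(-a))

def propensity(X, t, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; return fitted P(T=1 | X)."""
    D = np.column_stack([np.ones(len(t)), X])  # add an intercept column
    beta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = expit(D @ beta)
        hess = D.T @ (D * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, D.T @ (t - p))
    return expit(D @ beta)

def ipw_weights(ps, t):
    """Inverse probability weights targeting the average treatment effect."""
    return t / ps + (1 - t) / (1 - ps)

def weighted_ate(y, t, w):
    """Normalized (Hajek-type) weighted difference in mean outcomes."""
    treated = np.sum(w * t * y) / np.sum(w * t)
    control = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    return treated - control

def simulate(n=2000, seed=0):
    """Toy data (assumption): x is partially missing, z is fully observed."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    x = 0.5 * z + rng.normal(size=n)
    t = rng.binomial(1, expit(0.5 * x + 0.5 * z)).astype(float)
    y = 1.0 * t + x + z + rng.normal(size=n)
    miss = rng.random(n) < 0.3  # missing completely at random
    return x, z, t, y, miss, rng

def impute(x, z, t, y, miss, rng, m=5):
    """Crude stochastic regression imputation of x given (z, t, y).
    Simplification: regression parameters are not redrawn per imputation."""
    D = np.column_stack([np.ones(len(x)), z, t, y])
    obs = ~miss
    coef, *_ = np.linalg.lstsq(D[obs], x[obs], rcond=None)
    sd = np.std(x[obs] - D[obs] @ coef)
    imputed = []
    for _ in range(m):
        xi = x.copy()
        xi[miss] = D[miss] @ coef + rng.normal(scale=sd, size=miss.sum())
        imputed.append(np.column_stack([xi, z]))
    return imputed

def within(imputed, t, y):
    """Within: estimate the effect in each imputed dataset, then average."""
    ests = [weighted_ate(y, t, ipw_weights(propensity(X, t), t)) for X in imputed]
    return float(np.mean(ests))

def across_ps(imputed, t, y):
    """Across (existing variant): average propensity scores, then weight."""
    ps_bar = np.mean([propensity(X, t) for X in imputed], axis=0)
    return float(weighted_ate(y, t, ipw_weights(ps_bar, t)))

def across_weights(imputed, t, y):
    """Across (alternative variant): average the weights themselves."""
    w_bar = np.mean([ipw_weights(propensity(X, t), t) for X in imputed], axis=0)
    return float(weighted_ate(y, t, w_bar))

x, z, t, y, miss, rng = simulate()
imputed = impute(x, z, t, y, miss, rng)
for name, est in [("within", within), ("across (PS)", across_ps),
                  ("across (weights)", across_weights)]:
    print(f"{name}: {est(imputed, t, y):.3f}")
```

The three functions differ only in where the averaging happens: over effect estimates, over propensity scores, or over weights. A single simulated dataset of this kind cannot demonstrate consistency or inconsistency; the sketch is meant only to show the mechanics of each approach. Variance estimation (for example, via Rubin's rules for the within approach) is omitted.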