Causal inference often relies on the counterfactual framework, which requires that treatment assignment be independent of the potential outcomes given the observed covariates, a condition known as strong ignorability. Approaches to enforcing strong ignorability in causal analyses of observational data include weighting and matching methods. Effects such as the average treatment effect (ATE) are then estimated as expectations under the reweighted or matched distribution, P. The choice of P matters: it affects both the interpretation and the variance of the effect estimate. In this work, instead of specifying P, we learn a distribution that simultaneously maximizes coverage and minimizes the variance of ATE estimates. To learn this distribution, we propose a generative adversarial network (GAN)-based model called the Counterfactual $\chi$-GAN (cGAN), which also learns feature-balancing weights and supports unbiased causal estimation in the absence of unobserved confounding. Our model minimizes the Pearson $\chi^2$ divergence, which we show simultaneously maximizes coverage and minimizes the variance of importance sampling estimates. To our knowledge, this is the first such application of the Pearson $\chi^2$ divergence. We demonstrate the effectiveness of cGAN in achieving feature balance relative to established weighting methods in simulation and with real-world medical data.
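
The link between the Pearson $\chi^2$ divergence and importance sampling variance can be illustrated with a standard identity: for importance weights w = p/q, the variance of w under Q equals $\chi^2(P \,\|\, Q)$. The sketch below checks this numerically on a toy discrete example. The two distributions and the function names are hypothetical, chosen only for illustration; this is not the cGAN model itself, and the direction of the divergence used in the paper is not assumed here.

```python
def pearson_chi2(p, q):
    """Pearson chi-squared divergence chi^2(p || q) for discrete distributions."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def weight_variance(p, q):
    """Variance under q of the importance weights w = p / q."""
    weights = [pi / qi for pi, qi in zip(p, q)]
    # The mean of the weights under q is sum(p), which is 1 for a proper distribution.
    mean_w = sum(qi * wi for qi, wi in zip(q, weights))
    second_moment = sum(qi * wi ** 2 for qi, wi in zip(q, weights))
    return second_moment - mean_w ** 2


# Hypothetical toy distributions over three covariate strata (assumed values).
target = [0.5, 0.3, 0.2]    # distribution P we want expectations under
sampling = [0.2, 0.3, 0.5]  # distribution Q the data were drawn from

chi2 = pearson_chi2(target, sampling)
var_w = weight_variance(target, sampling)
print(f"chi^2(P || Q) = {chi2:.4f}, Var_Q(w) = {var_w:.4f}")
```

The two quantities agree, so under these assumptions driving the divergence toward zero also shrinks the spread of the weights, which is the quantity that inflates the variance of a weighted estimate.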