Using deep latent variable models in causal inference has attracted considerable interest recently, but an essential open question is their ability to yield consistent causal estimates. While they have demonstrated promising results and theory exists for some simple model formulations, we also know that causal effects are not, in general, identifiable in the presence of latent variables. We investigate this gap between theory and empirical results with analytical considerations and extensive experiments on multiple synthetic and real-world data sets, using the causal effect variational autoencoder (CEVAE) as a case study. While CEVAE seems to work reliably in some simple scenarios, contrary to its original motivation it does not estimate the causal effect correctly when the latent variable is misspecified or the data distribution is complex. Hence, our results show that more attention should be paid to ensuring the correctness of causal estimates produced by deep latent variable models.
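For context, the identification argument that motivates CEVAE (Louizos et al., 2017) can be stated in one line: assuming the latent confounder $Z$ blocks all back-door paths between treatment $t$ and outcome $y$, and the joint $p(Z, X, t, y)$ is correctly recovered, the interventional distribution follows from adjustment over $Z$. The display below is a minimal sketch of that assumption, not a result of this work:
\begin{align}
  p\bigl(y \mid x, \mathrm{do}(t)\bigr)
    &= \int p(y \mid t, z)\, p(z \mid x)\, \mathrm{d}z, \\
  \mathrm{ITE}(x) &= \mathbb{E}\bigl[y \mid x, \mathrm{do}(t=1)\bigr]
                   - \mathbb{E}\bigl[y \mid x, \mathrm{do}(t=0)\bigr], \qquad
  \mathrm{ATE} = \mathbb{E}_{x}\bigl[\mathrm{ITE}(x)\bigr].
\end{align}
The failure modes studied here arise precisely when this assumption breaks: with a misspecified $Z$ or a complex observation model $p(X \mid Z)$, the learned latent no longer corresponds to the true confounder, and the adjustment above no longer recovers the causal effect.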