Recent advances in probabilistic generative modeling have motivated learning Structural Causal Models (SCMs) from observational datasets using deep conditional generative models, also known as Deep Structural Causal Models (DSCMs). If successful, DSCMs can be used for causal estimation tasks, e.g., for answering counterfactual queries. In this work, we warn practitioners about the non-identifiability of counterfactual inference from observational data, even in the absence of unobserved confounding and with a known causal structure. We prove counterfactual identifiability for monotonic generation mechanisms with single-dimensional exogenous variables. For general generation mechanisms with multi-dimensional exogenous variables, we provide an impossibility result for counterfactual identifiability, motivating the need for parametric assumptions. As a practical approach, we propose a method for estimating worst-case errors of a learned DSCM's counterfactual predictions. The size of this error is an essential metric for deciding whether DSCMs are a viable approach for counterfactual inference in a given problem setting. In our evaluation, the method confirms negligible counterfactual errors for an identifiable SCM from prior work, and provides informative bounds on counterfactual errors for a non-identifiable synthetic SCM.