Confidence in Causal Discovery with Linear Causal Models

Structural causal models postulate noisy functional relations among a set of interacting variables. The causal structure underlying each such model is naturally represented by a directed graph whose edges indicate for each variable which other variables it causally depends upon. Under a number of different model assumptions, it has been shown that this causal graph and, thus also, causal effects are identifiable from mere observational data. For these models, practical algorithms have been devised to learn the graph. Moreover, when the graph is known, standard techniques may be used to give estimates and confidence intervals for causal effects. We argue, however, that a two-step method that first learns a graph and then treats the graph as known yields confidence intervals that are overly optimistic and can drastically fail to account for the uncertain causal structure. To address this issue we lay out a framework based on test inversion that allows us to give confidence regions for total causal effects that capture both sources of uncertainty: causal structure and numerical size of nonzero effects. Our ideas are developed in the context of bivariate linear causal models with homoscedastic errors, but as we exemplify they are generalizable to larger systems as well as other settings such as, in particular, linear non-Gaussian models.
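The concern about the two-step method can be illustrated with a small simulation. The sketch below is not the test-inversion framework proposed here; it only shows, under stated assumptions, how a naive two-step procedure can undercover. The assumptions are: a bivariate model X1 -> X2 with unit error variances, a direction chosen by comparing sample variances (in an equal-variance linear model the cause has the smaller variance), and a normal-approximation interval with critical value 1.96. The function name `two_step_coverage` and all parameter values are hypothetical.

```python
import numpy as np


def two_step_coverage(beta, n=100, reps=2000, seed=0):
    """Estimate how often a naive two-step interval covers the true total
    effect `beta` of X1 on X2 (illustrative sketch, assumed setup)."""
    rng = np.random.default_rng(seed)

    # Simulate reps datasets from X1 = e1, X2 = beta * X1 + e2,
    # with independent standard normal errors (equal variances).
    x1 = rng.standard_normal((reps, n))
    x2 = beta * x1 + rng.standard_normal((reps, n))

    # Step 1: learn the graph. Assumed rule: the variable with the
    # smaller sample variance is taken to be the cause.
    picks_1_to_2 = x1.var(axis=1, ddof=1) <= x2.var(axis=1, ddof=1)

    # Step 2: treat the learned graph as known.
    # If X1 -> X2 was picked, use the usual OLS slope interval.
    x1c = x1 - x1.mean(axis=1, keepdims=True)
    x2c = x2 - x2.mean(axis=1, keepdims=True)
    sxx = (x1c ** 2).sum(axis=1)
    slope = (x1c * x2c).sum(axis=1) / sxx
    resid = x2c - slope[:, None] * x1c
    sigma2 = (resid ** 2).sum(axis=1) / (n - 2)
    se = np.sqrt(sigma2 / sxx)
    ols_covers = np.abs(slope - beta) <= 1.96 * se

    # If X2 -> X1 was picked, the reported effect of X1 on X2 is exactly 0,
    # which covers the truth only when beta is 0.
    covers = np.where(picks_1_to_2, ols_covers, beta == 0.0)
    return float(covers.mean())


if __name__ == "__main__":
    for b in (0.1, 0.5, 2.0):
        print(f"beta={b}: two-step coverage ~ {two_step_coverage(b):.3f}")
```

With a strong effect the direction is almost always chosen correctly and the coverage should sit near the nominal 95%. With a weak effect the two sample variances are hard to tell apart, the wrong direction is chosen often, and the coverage should fall far below the nominal level, which is the failure mode that motivates accounting for structure uncertainty.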
