Many real-world decision-making tasks require learning causal relationships among a set of variables. Typical causal discovery methods, however, require that all variables be observed, which is often unrealistic in practice. In the presence of latent confounding, recovering causal relationships from observational data without additional assumptions is an ill-posed problem. Fortunately, additional structure among the confounders can often be expected in practice. One example is pervasive confounding, which has been exploited for consistent causal estimation in the special case of linear causal models. In this paper, we provide a method, with a proof of its consistency, for estimating causal relationships in the non-linear, pervasive confounding setting. At the heart of our procedure is the ability to estimate the confounding variation through a simple spectral decomposition of the observed data matrix. Building on this insight, we derive a DAG score function, prove its consistency in recovering a correct ordering of the DAG, and empirically compare it to existing procedures. By explicitly accounting for both confounders and non-linear effects, our method shows improved performance on both simulated and real datasets.
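As a rough illustration of the spectral step mentioned in the abstract, the sketch below splits a data matrix into a low-rank part and a residual using a truncated SVD. It is not the paper's procedure: the function name `remove_pervasive_confounding`, the simulated linear-plus-noise data, and the assumption that the number of latent confounders `k` is known are all hypothetical choices made for this example.

```python
import numpy as np

def remove_pervasive_confounding(X, k):
    """Split centered X into a rank-k part (a proxy for the confounding
    variation) and a residual, using a truncated SVD.

    Assumption for this sketch: k, the number of latent confounders, is known.
    """
    Xc = X - X.mean(axis=0, keepdims=True)            # center each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    confounding = (U[:, :k] * s[:k]) @ Vt[:k]         # top-k spectral component
    return confounding, Xc - confounding

# Hypothetical simulated data: k latent confounders that affect every
# observed variable (pervasive confounding), plus independent noise.
rng = np.random.default_rng(0)
n, p, k = 500, 20, 2
Z = rng.normal(size=(n, k))              # latent confounders (unobserved)
Gamma = 3.0 * rng.normal(size=(k, p))    # dense loadings on all p variables
E = rng.normal(size=(n, p))              # idiosyncratic variation
X = Z @ Gamma + E

confounding_hat, residual = remove_pervasive_confounding(X, k)
```

The intuition is that when the confounders load on many observed variables, their contribution dominates the leading singular directions of the data matrix, so removing those directions leaves a residual in which the confounding variation is reduced. How the residual then enters a DAG score is specific to the paper and is not reproduced here.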