Causal discovery from data affected by unobserved variables is an important but difficult problem. Unobserved variables affect the relationships between observed variables in more complex ways in nonlinear cases than in linear ones. In this study, we focus on causal additive models in the presence of unobserved variables. Causal additive models have structural equations that are additive in the variables and error terms. We consider not only unobserved common causes but also unobserved intermediate variables. Our theoretical results show that, when the causal relationships are nonlinear and unobserved variables are present, regression and independence tests cannot identify all the causal relationships between observed variables. However, they also show that incorrect inferences can be avoided. We propose a method that identifies all the causal relationships that are theoretically identifiable, without being biased by unobserved variables. Empirical results on artificial data and simulated functional magnetic resonance imaging (fMRI) data show that our method effectively infers causal structures in the presence of unobserved variables.
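For concreteness, the additive structure of a causal additive model can be written out as below. This is a sketch in assumed notation, not notation taken from the source: $x_i$ denotes a variable (observed or unobserved), $\mathrm{pa}(i)$ the index set of its direct causes in an assumed acyclic causal graph, $f_{ij}$ a possibly nonlinear function, and $e_i$ an error term, with the error terms assumed mutually independent.

```latex
% Assumed notation (not from the source):
%   x_i      variable i, observed or unobserved
%   pa(i)    indices of the direct causes of x_i
%   f_{ij}   possibly nonlinear effect of x_j on x_i
%   e_i      error term, assumed independent of the other error terms
x_i = \sum_{j \in \mathrm{pa}(i)} f_{ij}(x_j) + e_i, \qquad i = 1, \dots, p
```

Under these assumptions, when every direct cause of $x_i$ is observed, the residual of an additive regression of $x_i$ on those causes recovers $e_i$ up to a constant, and $e_i$ is independent of the regressors. This is the kind of property that regression and independence tests rely on. When some $x_j$ with $j \in \mathrm{pa}(i)$ is unobserved, it cannot be used as a regressor, so its contribution $f_{ij}(x_j)$ remains in the residual.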