Causal discovery from i.i.d. observational data is known to be generally ill-posed. We demonstrate that, given access to the distribution of a structural causal model and additional data from only two environments that differ sufficiently in their noise statistics, the causal graph is uniquely identifiable. Notably, this is the first result in the literature that guarantees recovery of the entire causal graph with a constant number of environments and arbitrary nonlinear mechanisms. Our only constraint is that the noise terms be Gaussian; however, we propose potential ways to relax this requirement. Of independent interest, we expand on the well-known duality between independent component analysis (ICA) and causal discovery. Recent advances have shown that nonlinear ICA can be solved from multiple environments, provided there are at least as many environments as sources; we show that the same can be achieved for causal discovery with access to much less auxiliary information.
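To make the data setting concrete, the following is a minimal sketch of what "one base distribution plus two environments that differ in noise statistics" could look like. It only simulates data and does not implement the identification procedure. The three-variable graph, the additive-noise form, the specific mechanisms, and the names `sample_environment`, `base`, `env_a`, and `env_b` are all illustrative assumptions, not taken from the paper; the only features borrowed from the abstract are the Gaussian noise and the fact that the mechanisms stay fixed while the noise statistics change.

```python
import numpy as np


def sample_environment(n, noise_scales, rng):
    """Draw n samples from a toy nonlinear SCM with Gaussian noise.

    Hypothetical graph (an assumption for illustration):
    X1 -> X2, X1 -> X3, X2 -> X3.
    Additive noise is used here only for simplicity.
    """
    s1, s2, s3 = noise_scales
    # Independent zero-mean Gaussian noise terms, one per variable.
    n1 = rng.normal(0.0, s1, size=n)
    n2 = rng.normal(0.0, s2, size=n)
    n3 = rng.normal(0.0, s3, size=n)
    # The mechanisms are identical in every environment.
    x1 = n1
    x2 = np.tanh(2.0 * x1) + n2
    x3 = np.sin(x2) + 0.5 * x1**2 + n3
    return np.column_stack([x1, x2, x3])


rng = np.random.default_rng(0)

# Base distribution of the SCM.
base = sample_environment(5000, (1.0, 1.0, 1.0), rng)

# Two auxiliary environments: same graph and mechanisms,
# only the noise standard deviations change.
env_a = sample_environment(5000, (2.0, 1.0, 0.5), rng)
env_b = sample_environment(5000, (0.5, 3.0, 1.5), rng)
```

In this sketch each dataset is an array of shape `(5000, 3)`, and the three datasets differ only through the noise scales passed to `sample_environment`.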