Learning the causal structure behind data is invaluable for improving generalization and obtaining high-quality explanations. We propose a novel framework, Invariant Structure Learning (ISL), designed to improve causal structure discovery by utilizing generalization as an indicator. ISL splits the data into different environments and, by imposing a consistency constraint, learns a structure that is invariant to the target across environments. An aggregation mechanism then selects the optimal classifier based on a graph structure that reflects the causal mechanisms in the data more accurately than the structures learned from individual environments. Furthermore, we extend ISL to a self-supervised setting in which accurate causal structure discovery does not rely on any labels. This self-supervised ISL obtains invariant causality proposals by iteratively setting different nodes as targets. On synthetic and real-world datasets, we demonstrate that ISL accurately discovers the causal structure, outperforms alternative methods, and yields superior generalization on datasets with significant distribution shifts.
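The consistency constraint described above can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's actual objective: the helper `environment_consistency_penalty` and the toy weight matrices are assumptions introduced only to show how a structure that agrees across environments incurs a smaller penalty than one that does not.

```python
import numpy as np

def environment_consistency_penalty(weights_per_env):
    """Mean variance of learned structure (adjacency) weights across
    environments. A structure invariant across environments scores low;
    environment-specific (spurious) edges raise the penalty."""
    W = np.stack(weights_per_env)      # shape: (n_envs, d, d)
    return float(np.mean(np.var(W, axis=0)))

# Toy illustration with two-node graphs (edge weight matrices):
W_e1 = np.array([[0.0, 0.9], [0.0, 0.0]])  # env 1: edge 0 -> 1
W_e2 = np.array([[0.0, 0.8], [0.0, 0.0]])  # env 2: similar structure
W_e3 = np.array([[0.0, 0.0], [0.7, 0.0]])  # env 3: conflicting structure

consistent = environment_consistency_penalty([W_e1, W_e2])
inconsistent = environment_consistency_penalty([W_e1, W_e3])
assert consistent < inconsistent  # invariant structure is preferred
```

In a full training loop, a penalty of this kind would be added to the per-environment fitting loss, steering the learner toward the structure shared by all environments.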