Most existing causal structure learning methods require the data to be independent and identically distributed (i.i.d.), an assumption that often cannot be guaranteed when the data come from different environments. Some previous efforts tackle this problem in two independent stages: first discovering i.i.d. clusters from non-i.i.d. samples, then learning the causal structures of the resulting groups. This straightforward solution ignores the intrinsic connection between the two stages, namely that both the clustering stage and the learning stage should be guided by the same causal mechanism. To this end, we propose a unified Causal Cluster Structures Learning (CCSL) method for causal discovery from non-i.i.d. data. The method jointly performs two tasks: 1) clustering subjects that share the same causal mechanism; 2) learning causal structures from the samples of those subjects. Specifically, for the former, we provide a Causality-related Chinese Restaurant Process that clusters samples based on the similarity of their causal structures; for the latter, we introduce a variational-inference-based approach to learn the causal structures. Theoretical results establish the identifiability of the causal model and the clustering model under the linear non-Gaussian assumption. Experimental results on both simulated and real-world data further validate the correctness and effectiveness of the proposed method.
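For readers unfamiliar with the clustering prior named above, the following is a minimal sketch of the standard Chinese Restaurant Process only, not the Causality-related variant proposed here. In the standard process, a new sample joins an existing cluster with probability proportional to that cluster's size and opens a new cluster with probability proportional to a concentration parameter. The names `crp_seating_probs`, `sample_crp`, and `alpha` are illustrative assumptions; how the proposed method reweights these probabilities by causal-structure similarity is not specified in this abstract and is not modelled in the sketch.

```python
import random


def crp_seating_probs(table_counts, alpha):
    """Standard CRP prior: probability of joining each existing table,
    with the final entry being the probability of opening a new table."""
    total = sum(table_counts) + alpha
    probs = [count / total for count in table_counts]
    probs.append(alpha / total)  # last entry: open a new table
    return probs


def sample_crp(n_customers, alpha, seed=0):
    """Sequentially seat n_customers; returns (assignments, table_counts).

    This is the plain CRP. A causality-aware variant would presumably
    combine these prior weights with a structure-similarity term
    (an assumption, not shown here).
    """
    rng = random.Random(seed)
    table_counts = []
    assignments = []
    for _ in range(n_customers):
        probs = crp_seating_probs(table_counts, alpha)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(table_counts):
            table_counts.append(1)  # new table opened
        else:
            table_counts[k] += 1  # joined an existing table
        assignments.append(k)
    return assignments, table_counts
```

With two existing clusters of sizes 3 and 1 and `alpha = 1.0`, `crp_seating_probs([3, 1], 1.0)` gives weights 3/5, 1/5, and 1/5 for the first cluster, the second cluster, and a new cluster respectively.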