Discovering causal structures among latent factors from observed data is a particularly challenging problem. Despite some efforts to address it, existing methods handle only single-domain data. In this paper, we propose Multi-Domain Linear Non-Gaussian Acyclic Models for Latent Factors (MD-LiNA), in which the causal structure among the latent factors of interest is shared across all domains, and we provide identification results for the model. The model enriches the causal representation of multi-domain data. We propose an integrated two-phase algorithm to estimate it. In the first phase, we locate the latent factors and estimate the factor loading matrix. In the second phase, to uncover the causal structure among the shared latent factors of interest, we derive a score function that characterizes both the independence relations between external influences and the dependence relations between the multi-domain latent factors and the latent factors of interest. We show that the proposed method provides locally consistent estimators. Experimental results on both synthetic and real-world data demonstrate the efficacy and robustness of our approach.
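To make the setting concrete, the following is a minimal sketch of the kind of data such a model describes. It is not the estimation algorithm from the paper. It assumes a simple generative form in which each domain has latent factors f = B f + e with non-Gaussian (here Laplace) external influences e, observed variables x = G f + noise, a strictly lower-triangular matrix B shared across domains, and a domain-specific loading matrix G. The names `simulate_domain`, `B_shared`, and `G_domains`, and all numeric values, are hypothetical.

```python
import random


def sample_laplace(rng):
    # The difference of two i.i.d. exponentials is Laplace-distributed,
    # which gives a simple non-Gaussian external influence.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def simulate_domain(B, G, n_samples, rng, noise_scale=0.1):
    """Draw samples of x = G f + noise, where f = B f + e.

    B is assumed strictly lower triangular (acyclic, rows in causal order),
    so f can be computed by forward substitution.
    """
    q = len(B)  # number of latent factors
    p = len(G)  # number of observed variables
    samples = []
    for _ in range(n_samples):
        f = [0.0] * q
        for i in range(q):
            # Each factor depends only on earlier factors plus its own
            # independent non-Gaussian external influence.
            f[i] = sum(B[i][j] * f[j] for j in range(i)) + sample_laplace(rng)
        x = [
            sum(G[k][j] * f[j] for j in range(q))
            + noise_scale * sample_laplace(rng)  # measurement noise (assumed)
            for k in range(p)
        ]
        samples.append(x)
    return samples


# Hypothetical causal structure shared by all domains: factor 0 -> factor 1.
B_shared = [
    [0.0, 0.0],
    [0.8, 0.0],
]

# Hypothetical domain-specific loadings; each factor has two indicators.
G_domains = [
    [[1.0, 0.0], [0.9, 0.0], [0.0, 1.0], [0.0, 0.7]],
    [[1.0, 0.0], [1.2, 0.0], [0.0, 1.0], [0.0, 0.5]],
]

rng = random.Random(0)
data = [simulate_domain(B_shared, G, 500, rng) for G in G_domains]
```

Here `data[m]` holds the observed samples for domain `m`. An estimator in the spirit of the two-phase procedure would first recover the loadings from each domain's observations and then search for the shared structure among the factors; the sketch stops at data generation.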