Causal discovery from observational data is an important but challenging task in many scientific fields. Recently, NOTEARS, a method with a non-combinatorial directed acyclicity constraint, formulated causal structure learning as a continuous optimization problem using a least-squares loss. Though the least-squares loss is well justified under the standard Gaussian noise assumption, it is limited when that assumption does not hold. In this work, we theoretically show that violating the Gaussian noise assumption hinders causal direction identification: in the linear case, the inferred causal orientation is fully determined by the causal strength together with the noise variances, and in the nonlinear case, by strong non-Gaussian noises. Consequently, we propose a more general entropy-based loss that is theoretically consistent with the likelihood score under any noise distribution. We run extensive empirical evaluations on both synthetic and real-world data to validate the effectiveness of the proposed method, and show that it achieves the best results on the Structural Hamming Distance, False Discovery Rate, and True Positive Rate metrics.
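To make the contrast concrete, here is a minimal, hypothetical sketch (not the paper's implementation) of the two scores on a two-variable linear SEM with uniform, unequal-variance noise. The least-squares score is the NOTEARS objective `1/(2n)·||X − XW||²_F`; the entropy-based score is the sum of estimated residual entropies, which matches the likelihood score up to constants. The `vasicek_entropy` spacing estimator is one assumed choice of nonparametric entropy estimator; coefficients and noise scales below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def vasicek_entropy(x, m=None):
    """Vasicek spacing estimator of differential entropy (one illustrative choice)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if m is None:
        m = max(1, int(round(np.sqrt(n))))
    hi = x[np.minimum(np.arange(n) + m, n - 1)]
    lo = x[np.maximum(np.arange(n) - m, 0)]
    return float(np.mean(np.log(n / (2 * m) * (hi - lo) + 1e-12)))

def least_squares_score(X, W):
    """NOTEARS least-squares score: 1/(2n) * ||X - XW||_F^2."""
    n = X.shape[0]
    return float(0.5 / n * np.sum((X - X @ W) ** 2))

def entropy_score(X, W):
    """Sum of estimated residual entropies -- consistent with the likelihood score."""
    R = X - X @ W
    return sum(vasicek_entropy(R[:, j]) for j in range(X.shape[1]))

# Toy SEM x1 -> x2 with uniform (non-Gaussian) noise of unequal variance:
# weak causal strength (0.5) and small noise on x2 -- a regime where the
# least-squares score is misled by the noise variances.
n = 5000
e1 = rng.uniform(-np.sqrt(3), np.sqrt(3), n)          # Var = 1
e2 = rng.uniform(-0.1 * np.sqrt(3), 0.1 * np.sqrt(3), n)  # Var = 0.01
x1 = e1
x2 = 0.5 * x1 + e2
X = np.column_stack([x1, x2])

W_fwd = np.array([[0.0, 0.5], [0.0, 0.0]])  # true orientation x1 -> x2
b = float(x2 @ x1 / (x2 @ x2))              # best-fit reverse coefficient
W_rev = np.array([[0.0, 0.0], [b, 0.0]])    # reversed orientation x2 -> x1

print("LS  fwd/rev:", least_squares_score(X, W_fwd), least_squares_score(X, W_rev))
print("Ent fwd/rev:", entropy_score(X, W_fwd), entropy_score(X, W_rev))
```

In this regime the least-squares score is lower for the reversed graph (orientation driven by causal strength and noise variances), while the entropy-based score correctly favors the true direction under the non-Gaussian noise.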