Inferring causal relationships from observational data is a fundamental yet highly complex problem when the number of variables is large. Recent work has made considerable progress in learning causal structural equation models (SEMs) but still faces challenges in scalability. This paper aims to discover causal DAGs efficiently from high-dimensional data. We investigate a way of recovering causal DAGs from an estimate of the inverse covariance matrix of the observational data. The proposed algorithm, called ICID (inverse covariance estimation and {\it independence-based} decomposition), searches for a decomposition of the inverse covariance matrix that preserves its nonzero pattern. The algorithm relies on properties of positive definite matrices supported on {\it chordal} graphs and on the preservation of nonzero patterns in their Cholesky decomposition. We find an exact correspondence between the support-preserving property and the independence-preserving property of our decomposition method, which explains its effectiveness in identifying causal structures from the data distribution. We show that the proposed algorithm recovers causal DAGs with a complexity of $O(d^2)$ in the context of sparse SEMs. This low complexity is reflected in the good scalability of our algorithm in thorough experiments and comparisons with state-of-the-art algorithms.
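To make the link between the inverse covariance matrix and the DAG concrete, the following is a minimal sketch, not the ICID algorithm itself. It assumes a linear SEM $X = B^\top X + N$ with independent noise, an exact (population) inverse covariance matrix rather than an estimate, and variables already listed in a topological order; ICID has to search for the decomposition without that last assumption. Under these assumptions the inverse covariance matrix factors as $\Theta = (I - B)\,D\,(I - B)^\top$ with $D$ diagonal, so a Cholesky factorization of the reversed-order matrix returns $B$. The helper names `random_sparse_dag`, `precision_from_sem`, and `dag_from_precision` are hypothetical and chosen for illustration only.

```python
import numpy as np


def random_sparse_dag(d, edge_prob, rng):
    """Hypothetical helper: strictly upper-triangular weight matrix.

    B[i, j] != 0 encodes the edge i -> j, so 0, 1, ..., d-1 is a topological order.
    """
    mask = np.triu(rng.random((d, d)) < edge_prob, k=1)
    signs = rng.choice([-1.0, 1.0], size=(d, d))
    weights = rng.uniform(0.5, 2.0, size=(d, d)) * signs
    return mask * weights


def precision_from_sem(B, noise_var):
    """Population inverse covariance of X = B^T X + N, Cov(N) = diag(noise_var)."""
    identity = np.eye(B.shape[0])
    return (identity - B) @ np.diag(1.0 / noise_var) @ (identity - B).T


def dag_from_precision(theta):
    """Recover B from theta = (I - B) D (I - B)^T.

    Assumption: the variable order is topological, so I - B is unit upper
    triangular. Reversing rows and columns turns it into a unit lower-triangular
    factor, which the standard (lower) Cholesky factorization exposes.
    """
    d = theta.shape[0]
    lower = np.linalg.cholesky(theta[::-1, ::-1])
    unit_lower = lower / np.diag(lower)   # divide column j by lower[j, j]
    unit_upper = unit_lower[::-1, ::-1]   # undo the reversal: equals I - B
    return np.eye(d) - unit_upper


rng = np.random.default_rng(0)
d = 8
B = random_sparse_dag(d, edge_prob=0.3, rng=rng)
noise_var = rng.uniform(0.5, 1.5, size=d)

theta = precision_from_sem(B, noise_var)
B_hat = dag_from_precision(theta)

print("max abs error:", np.max(np.abs(B_hat - B)))
```

With real data, `theta` would be replaced by a sparse estimate of the inverse covariance matrix, and the recovered factor would only approximate $B$.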