While learning the graphical structure of Bayesian networks from observational data is key to describing and helping understand data generating processes in complex applications, the task poses considerable challenges due to its computational complexity. The directed acyclic graph (DAG) representing a Bayesian network model is generally not identifiable from observational data, and a variety of methods exist to estimate its equivalence class instead. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by testing for conditional independence (CI), starting from marginal independence relationships and progressively expanding the conditioning set. Here, we propose the dual PC algorithm, a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. Notably, the elements of the precision matrix coincide with partial correlations for Gaussian data. Our algorithm then exploits block matrix inversions on the covariance and precision matrices to simultaneously perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm, therefore, proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies indicate that the dual PC algorithm outperforms the classical PC algorithm both in terms of run time and in recovering the underlying network structure.
翻译:虽然从观测数据中学习巴伊西亚网络的图形结构是描述和帮助理解复杂应用中的数据生成过程的关键,但这项任务因其计算的复杂性而构成相当大的挑战。代表巴伊西亚网络模型的定向环环图(DAG)一般无法从观测数据中识别,而且有各种方法来估计其等值等级。根据某些假设,流行的PC算法可以通过测试有条件独立(CI),从边际独立关系开始,逐步扩大调控组,不断恢复正确的等值等级。在这里,我们提出双重PC算法,这是一个在PC算法中进行CI测试的新办法,即利用常变和精确矩阵之间的反关系,在PC算法中进行CI测试。值得注意的是,精确矩阵的元素与高斯网络数据的部分相关性相吻合。我们的算法随后利用了组合矩阵的变换版和精确矩阵,同时进行互补(或双重)调控件组合部分相关性的测试。因此,我们首先考虑边际和全序CI关系,然后逐步进入中央测算系统。在CRAV中,模拟了双轨算法结构。