Learning the graphical structure of Bayesian networks is key to describing data generating mechanisms in many complex applications but poses considerable computational challenges. Observational data can only identify the equivalence class of the directed acyclic graph underlying a Bayesian network model, and a variety of methods exist to tackle the problem. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by reverse-engineering the conditional independence (CI) relationships holding in the variable distribution. Here, we propose the dual PC algorithm, a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. By exploiting block matrix inversions we can simultaneously perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies show that the dual PC algorithm outperforms the classic PC algorithm both in terms of run time and in recovering the underlying network structure, even in the presence of deviations from Gaussianity.
翻译:学习Bayesian网络的图形结构是描述在许多复杂应用中生成数据机制的关键,但具有相当大的计算挑战。观察数据只能确定作为Bayesian网络模型基础的定向单程图的等值等级,并存在着解决这一问题的各种方法。根据某些假设,流行的PC算法可以始终通过反向工程来恢复正确的等值等级,即在变量分布中保持的有条件独立(CI)关系。在这里,我们提议了双重的PC算法,这是一个在PC算法中进行CI测试的新办法,它利用共变和精确矩阵之间的反向关系。通过利用区块矩阵反向矩阵,我们可以同时进行互补(或双向)调节组合部分相关性的测试。双重PC算法的多重CI测试首先考虑边际和全序的CI关系,然后逐步转向中央顺序。模拟研究表明,双重的PC算法在运行时间和修复基本网络结构方面都超越了典型的PC算法,即使存在偏离高斯提提的特征。