In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks from data. Defining such a distribution is very challenging because the sample space is combinatorially large, and approximations based on MCMC are often required. Recently, a novel class of probabilistic models called Generative Flow Networks (GFlowNets) has been introduced as a general framework for generative modeling of discrete and composite objects, such as graphs. In this work, we propose to use a GFlowNet as an alternative to MCMC for approximating the posterior distribution over the structure of Bayesian networks, given a dataset of observations. Generating a sample DAG from this approximate distribution is viewed as a sequential decision problem, where the graph is constructed one edge at a time, based on learned transition probabilities. Through evaluation on both simulated and real data, we show that our approach, called DAG-GFlowNet, provides an accurate approximation of the posterior over DAGs and compares favorably against other methods based on MCMC or variational inference.
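The sequential view of DAG generation can be illustrated with a minimal sketch. It is not the DAG-GFlowNet implementation: the learned transition probabilities are replaced by a uniform choice over valid edges, and the explicit stop action with a fixed `stop_prob` is an assumption made here for illustration. All function names (`has_path`, `valid_edges`, `sample_dag`, `is_acyclic`) are hypothetical. The sketch only shows how adding one edge at a time, while masking out edges that would close a cycle, keeps every intermediate graph a DAG.

```python
import random


def has_path(adj, src, dst):
    """Return True if dst is reachable from src (depth-first search)."""
    stack, seen = [src], set()
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        if u in seen:
            continue
        seen.add(u)
        stack.extend(v for v in range(len(adj)) if adj[u][v])
    return False


def valid_edges(adj):
    """Edges u -> v that are absent and would not create a cycle.

    Adding u -> v closes a cycle exactly when v already reaches u.
    """
    n = len(adj)
    return [
        (u, v)
        for u in range(n)
        for v in range(n)
        if u != v and not adj[u][v] and not has_path(adj, v, u)
    ]


def sample_dag(n, stop_prob=0.2, rng=None):
    """Build a DAG on n nodes one edge at a time.

    Assumption: a uniform policy and a fixed stop probability stand in
    for the learned transition probabilities described in the text.
    """
    rng = rng or random.Random()
    adj = [[0] * n for _ in range(n)]  # start from the empty graph
    while True:
        actions = valid_edges(adj)
        if not actions or rng.random() < stop_prob:
            return adj
        u, v = rng.choice(actions)
        adj[u][v] = 1


def is_acyclic(adj):
    """A graph has a cycle iff some edge u -> v has a path from v back to u."""
    n = len(adj)
    return not any(
        adj[u][v] and has_path(adj, v, u) for u in range(n) for v in range(n)
    )
```

Calling `sample_dag(5, rng=random.Random(0))` returns a 5 x 5 adjacency matrix that is acyclic by construction. In a GFlowNet, the uniform `rng.choice` step would be replaced by sampling from a learned distribution over the same masked action set.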