Learning the structure of dependence relations between variables is a pervasive problem in the statistical literature. A directed acyclic graph (DAG) can represent a set of conditional independences, but different DAGs may encode the same set of relations and are then indistinguishable using observational data. Equivalent DAGs can be collected into classes, each represented by a partially directed graph known as an essential graph (EG). Structure learning conducted directly on the EG space, rather than on the allied space of DAGs, leads to theoretical and computational benefits. Still, most efforts in the literature have been dedicated to Gaussian data, with less attention paid to methods designed for multivariate categorical data. We therefore propose a Bayesian methodology for structure learning of categorical EGs. By combining a constructive parameter prior elicitation with a graph-driven likelihood decomposition, we derive a closed-form expression for the marginal likelihood of a categorical EG model. We study asymptotic properties and develop an MCMC sampling scheme for approximate posterior inference. We evaluate our methodology on both simulated scenarios and real data, showing appreciable performance in comparison with state-of-the-art methods.
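The claim that different DAGs may encode the same conditional independences can be illustrated with a small check. The sketch below is not the methodology proposed here; it relies on the standard characterization that two DAGs over the same variables are Markov equivalent exactly when they share the same skeleton and the same v-structures. The function names and the edge-list representation (pairs `(parent, child)` with sortable labels) are assumptions made for illustration, and both graphs are assumed to be valid DAGs on the same node set.

```python
from itertools import combinations


def skeleton(edges):
    """Return the undirected edge set of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Return all triples (a, c, b) with a -> c <- b and a, b not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pa in parents.items():
        # Sorting makes each unordered parent pair appear in one fixed order.
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same v-structures (assumes a shared node set)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


# Hypothetical three-variable examples.
chain = [("X", "Y"), ("Y", "Z")]      # X -> Y -> Z
fork = [("Y", "X"), ("Y", "Z")]       # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]   # X -> Y <- Z

print(markov_equivalent(chain, fork))
print(markov_equivalent(chain, collider))
```

The chain and the fork both imply that X and Z are independent given Y, so they fall in the same equivalence class and would be represented by a single EG, whereas the collider forms a class of its own.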