Bayesian Networks (BNs) have become increasingly popular over the last few decades as a tool for reasoning under uncertainty in fields as diverse as medicine, biology, epidemiology, economics and the social sciences. This is especially true in real-world areas where we seek to answer complex questions based on hypothetical evidence in order to determine actions for intervention. However, determining the graphical structure of a BN remains a major challenge, especially when modelling a problem under causal assumptions. Solutions to this problem include discovering BN graphs automatically from data, constructing them from expert knowledge, or combining the two. This paper provides a comprehensive review of combinatoric algorithms proposed for learning BN structure from data, describing 74 algorithms including prototypical, well-established and state-of-the-art approaches. The basic approach of each algorithm is described in consistent terms, and the similarities and differences between them are highlighted. Methods of evaluating algorithms and their comparative performance are discussed, including the consistency of claims made in the literature. Approaches for dealing with data noise in real-world datasets and for incorporating expert knowledge into the learning process are also covered.