Bayesian networks are popular probabilistic models that capture the conditional dependencies among a set of variables. Inference in Bayesian networks is a fundamental task for answering probabilistic queries over a subset of variables in the data. However, exact inference in Bayesian networks is NP-hard, which has prompted the development of many practical inference methods. In this paper, we focus on improving the performance of the junction-tree algorithm, a well-known method for exact inference in Bayesian networks. In particular, we seek to leverage information in the workload of probabilistic queries to obtain an optimal workload-aware materialization of junction trees, with the aim of accelerating the processing of inference queries. We devise an optimal pseudo-polynomial algorithm to tackle this problem and discuss approximation schemes. Compared to state-of-the-art approaches for efficient processing of inference queries via junction trees, our methods are the first to exploit the information provided in query workloads. Our experiments on several real-world Bayesian networks confirm the effectiveness of our techniques in speeding up query processing.