A Bayesian network is a multivariate (potentially high-dimensional) probabilistic model, often formed by combining lower-dimensional components. Inference (the computation of conditional probabilities) is based on message-passing algorithms that exploit conditional independence structures. Bayesian networks for discrete variables with finite state spaces face a fundamental problem in high dimensions: a discrete distribution is represented by a table of values, and in high dimensions such tables can become prohibitively large. During inference, these tables must be multiplied, which can lead to even larger tables. The sparta package addresses this challenge by introducing new methods that efficiently handle multiplication and marginalization of sparse tables. The package is written in the R programming language and is freely available from the Comprehensive R Archive Network (CRAN). The companion package jti, also on CRAN, was developed to showcase the potential of sparta in connection with the Junction Tree Algorithm. We show that sparta outperforms existing methods for high-dimensional sparse probability tables. Furthermore, we demonstrate that sparta can handle problems that are otherwise infeasible due to lack of computer memory.
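To make the idea of sparse table multiplication and marginalization concrete, the following is a minimal Python sketch. It is not the sparta API or its internal representation; the functions `multiply` and `marginalize` and the dictionary-based storage are assumptions made purely for illustration. Each table stores only its nonzero cells, so the cost of both operations depends on the number of nonzero cells rather than on the size of the full state space.

```python
from collections import defaultdict

# Hypothetical representation (not sparta's): a table is a pair
# (variable names, {tuple of states: value}) holding only nonzero cells.

def multiply(vars_a, tab_a, vars_b, tab_b):
    """Multiply two sparse tables by matching cells on shared variables."""
    shared = [v for v in vars_a if v in vars_b]
    extra_b = [v for v in vars_b if v not in shared]
    ia = [vars_a.index(v) for v in shared]
    ib = [vars_b.index(v) for v in shared]
    ib_extra = [vars_b.index(v) for v in extra_b]

    # Group the cells of table b by their states on the shared variables.
    index = defaultdict(list)
    for cell_b, val_b in tab_b.items():
        key = tuple(cell_b[i] for i in ib)
        index[key].append((tuple(cell_b[i] for i in ib_extra), val_b))

    # A product cell is nonzero only if both factors are nonzero,
    # so cells of a without a match in b are simply skipped.
    out = {}
    for cell_a, val_a in tab_a.items():
        key = tuple(cell_a[i] for i in ia)
        for rest_b, val_b in index.get(key, []):
            out[cell_a + rest_b] = val_a * val_b
    return vars_a + extra_b, out

def marginalize(vars_, tab, sum_out):
    """Sum out the variables in `sum_out`, touching only stored cells."""
    keep = [i for i, v in enumerate(vars_) if v not in sum_out]
    out = defaultdict(float)
    for cell, val in tab.items():
        out[tuple(cell[i] for i in keep)] += val
    return [vars_[i] for i in keep], dict(out)

# Toy example: p(A) and a sparse p(B | A) where p(b2 | a1) = 0 is not stored.
p_a = {("a1",): 0.5, ("a2",): 0.5}
p_b_given_a = {("a1", "b1"): 1.0, ("a2", "b1"): 0.25, ("a2", "b2"): 0.75}

joint_vars, joint = multiply(["A"], p_a, ["A", "B"], p_b_given_a)
marg_vars, p_b = marginalize(joint_vars, joint, {"A"})
```

In this toy example the joint table has three stored cells instead of four, because the zero cell (a1, b2) is never materialized. In high dimensions, where most cells are zero, this is the kind of saving that the abstract refers to.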