We describe algorithms for learning Bayesian networks from a combination of user knowledge and statistical data. The algorithms have two components: a scoring metric and a search procedure. The scoring metric takes a network structure, statistical data, and a user's prior knowledge, and returns a score proportional to the posterior probability of the network structure given the data. The search procedure generates networks for evaluation by the scoring metric. Previous work has concentrated on metrics for domains containing only discrete variables, under the assumption that the data represent a multinomial sample. In this paper, we extend this work by developing scoring metrics for domains containing only continuous variables or a mixture of discrete and continuous variables, under the assumption that continuous data are sampled from a multivariate normal distribution. Our work also extends traditional statistical approaches for identifying vanishing regression coefficients: we identify two important assumptions, called event equivalence and parameter modularity, that together allow prior distributions for multivariate normal parameters to be constructed from a single prior Bayesian network specified by a user.
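The split between a scoring metric and a search procedure can be illustrated with a minimal sketch for all-continuous data. It is not the metric developed in this paper: as a stand-in it assumes a BIC-style score (Gaussian log-likelihood minus a complexity penalty) with no user prior, and a greedy search that only adds edges. All function names (`covariance`, `det`, `local_score`, `network_score`, `creates_cycle`, `greedy_search`) and the synthetic data are hypothetical and chosen for illustration.

```python
import itertools
import math
import random


def covariance(data):
    """Maximum-likelihood sample covariance (divides by n) of a list of rows."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / n
             for j in range(d)] for i in range(d)]


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size, result = len(m), 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def local_score(cov, n, child, parents):
    """Assumed BIC-style score of one node given its parents.

    The residual variance of the child regressed on its parents equals
    det(cov[parents + child]) / det(cov[parents]); this sketch assumes the
    parents' covariance is nondegenerate.
    """
    def sub(idx):
        return [[cov[i][j] for j in idx] for i in idx]
    resid_var = max(det(sub(parents + [child])) / det(sub(parents)), 1e-12)
    loglik = -0.5 * n * (math.log(2 * math.pi * resid_var) + 1.0)
    n_params = len(parents) + 2  # intercept, one coefficient per parent, variance
    return loglik - 0.5 * n_params * math.log(n)


def network_score(cov, n, parents):
    """Scoring metric: the sum of local scores over all nodes."""
    return sum(local_score(cov, n, c, ps) for c, ps in parents.items())


def creates_cycle(parents, src, dst):
    """True if adding src -> dst would close a cycle (dst is an ancestor of src)."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def greedy_search(data):
    """Search procedure: repeatedly add the single edge that most improves the score."""
    n, d = len(data), len(data[0])
    cov = covariance(data)
    parents = {i: [] for i in range(d)}
    while True:
        best_gain, best_edge = 0.0, None
        for src, dst in itertools.permutations(range(d), 2):
            if src in parents[dst] or creates_cycle(parents, src, dst):
                continue
            gain = (local_score(cov, n, dst, parents[dst] + [src])
                    - local_score(cov, n, dst, parents[dst]))
            if gain > best_gain:
                best_gain, best_edge = gain, (src, dst)
        if best_edge is None:
            return parents
        parents[best_edge[1]].append(best_edge[0])


# Synthetic example: x1 depends linearly on x0, x2 is independent noise.
random.seed(0)
rows = []
for _ in range(500):
    x0 = random.gauss(0, 1)
    rows.append([x0, 2 * x0 + random.gauss(0, 0.5), random.gauss(0, 1)])
learned = greedy_search(rows)
print(learned)  # maps each node index to its list of parent indices
```

Because the score decomposes into one term per node, the search only needs to rescore the node whose parent set changes. The two orientations of a single edge between `x0` and `x1` receive the same likelihood under this stand-in score, so which direction the sketch picks is arbitrary.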