In this paper we derive an explicit formula for calculating the marginal likelihood of a given factorization of a categorical dataset. Since the marginal likelihood is proportional to the posterior probability of the factorization, these likelihoods can be used to rank all possible factorizations and select the "best" way to factor the overall distribution from which the dataset is drawn. The best factorization can then be used to construct a Bayes classifier that benefits from factoring out mutually independent sets of variables.
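To make the ranking idea concrete, the following is a minimal sketch, not the formula derived in the paper. It assumes that a factorization is a partition of the variables into mutually independent blocks, that each block's joint distribution has a symmetric Dirichlet prior with concentration `alpha` per cell, and that all factorizations are equally probable a priori, so that comparing marginal likelihoods is the same as comparing posteriors. The function names `block_log_ml` and `factorization_log_ml`, and the toy data, are hypothetical.

```python
from collections import Counter
from math import lgamma, prod


def block_log_ml(rows, block, cards, alpha=1.0):
    """Log marginal likelihood of one block of variables.

    Assumption: the block's joint distribution is categorical over all
    combinations of its variables, with a symmetric Dirichlet(alpha) prior.
    """
    k = prod(cards[j] for j in block)  # number of joint states in the block
    counts = Counter(tuple(r[j] for j in block) for r in rows)
    n = len(rows)
    out = lgamma(k * alpha) - lgamma(n + k * alpha)
    # Unobserved cells contribute lgamma(alpha) - lgamma(alpha) = 0.
    out += sum(lgamma(c + alpha) - lgamma(alpha) for c in counts.values())
    return out


def factorization_log_ml(rows, blocks, cards, alpha=1.0):
    """Blocks are assumed independent, so their log likelihoods add."""
    return sum(block_log_ml(rows, b, cards, alpha) for b in blocks)


# Hypothetical toy data: variables 0 and 1 always agree, variable 2 is free.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
cards = [2, 2, 2]  # number of categories per variable

candidates = {
    "independent": [(0,), (1,), (2,)],
    "pair+single": [(0, 1), (2,)],
    "joint": [(0, 1, 2)],
}
scores = {
    name: factorization_log_ml(rows, blocks, cards)
    for name, blocks in candidates.items()
}
best = max(scores, key=scores.get)
print(best)
```

On this toy dataset the highest-scoring candidate is the partition that groups the two dependent variables and separates the third. Working in log space with `lgamma` avoids overflow in the factorial-like terms as the dataset grows.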