The recursive and hierarchical structure of full rooted trees is used in various areas such as data compression, image processing, and machine learning. In most of these studies, the full rooted tree is not treated as a random variable, so a model selection problem arises when trying to avoid overfitting. One way to address this is to assume a prior distribution on the full rooted trees, which lets us avoid overfitting on the basis of Bayes decision theory. For example, by assigning a low prior probability to complex models, the MAP estimator guards against overfitting. Overfitting can also be avoided by averaging over all the models, weighted by their posterior probabilities. In this paper, we propose a probability distribution on a set of full rooted trees. Its parametric representation is well suited to computing properties of the distribution, such as the mode, the expectation, and the posterior distribution, by recursive functions. Although some previous studies have proposed such distributions, they are tailored to specific applications. Therefore, we extract their mathematically essential part and derive new generalized methods to calculate the expectation, the posterior distribution, and other properties.
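
The abstract does not state the parametric form of the distribution, so the following is only an illustrative sketch under an assumed construction, not the method of the paper: every node independently has children with a fixed probability `G`, except at a depth limit where it must be a leaf. All names here (`K`, `MAX_DEPTH`, `G`, `enumerate_trees`, `prior`, `count_leaves`, `expected_leaves`) are hypothetical. The sketch shows how such a prior can sum to one over all full rooted trees and how an expectation might be obtained by a recursive function rather than by enumeration.

```python
from itertools import product

K = 2          # children per inner node (assumption: full binary trees)
MAX_DEPTH = 3  # hypothetical depth limit
G = 0.4        # hypothetical probability that a node has children


def enumerate_trees(depth=0):
    """Yield every full rooted tree that can hang below a node at `depth`.

    A tree is either the string "leaf" or a tuple of K subtrees.
    """
    yield "leaf"
    if depth < MAX_DEPTH:
        subtrees = list(enumerate_trees(depth + 1))
        for children in product(subtrees, repeat=K):
            yield children


def prior(tree, depth=0):
    """Prior probability of `tree`, computed recursively node by node."""
    if tree == "leaf":
        # A node at the depth limit is a leaf with probability 1.
        return 1.0 if depth == MAX_DEPTH else 1.0 - G
    p = G
    for child in tree:
        p *= prior(child, depth + 1)
    return p


def count_leaves(tree):
    """Number of leaf nodes in `tree`."""
    if tree == "leaf":
        return 1
    return sum(count_leaves(child) for child in tree)


def expected_leaves(depth=0):
    """Expected number of leaves below a node at `depth`, without enumeration."""
    if depth == MAX_DEPTH:
        return 1.0
    return (1.0 - G) + G * K * expected_leaves(depth + 1)


if __name__ == "__main__":
    trees = list(enumerate_trees())
    print("total prior mass:", sum(prior(t) for t in trees))
    print("expected leaves (recursive):", expected_leaves())
```

Under this assumed prior, deeper trees receive smaller probability whenever `G` is small, which is the mechanism by which a MAP estimator would penalize complex models. The recursion in `expected_leaves` visits one level per call, whereas brute-force enumeration grows very quickly with `MAX_DEPTH`.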