How to use KL-divergence to construct conjugate priors, with well-defined non-informative limits, for the multivariate Gaussian

The Wishart distribution is the standard conjugate prior for the precision of the multivariate Gaussian likelihood when the mean is known, while the normal-Wishart can be used when the mean is also unknown. It is, however, not obvious how to assign values to the hyperparameters of these distributions. In particular, when forming non-informative limits of these distributions, the shape (or degrees of freedom) parameter of the Wishart must be handled with care. The intuitive solution of directly interpreting the shape as a pseudocount and letting it go to zero, as proposed by some authors, violates the restrictions on the shape parameter. We show how to use the scaled KL-divergence between multivariate Gaussians as an energy function to construct Wishart and normal-Wishart conjugate priors. When used as informative priors, the salient feature of these distributions is the mode, while the KL scaling factor serves as the pseudocount. The scale factor can be taken to the limit at zero to form non-informative priors that do not violate the restrictions on the Wishart shape parameter. This limit is non-informative in the sense that the posterior mode is identical to the maximum likelihood estimate of the Gaussian likelihood parameters.
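The known-mean case can be illustrated with a minimal sketch. It rests on assumptions not spelled out in the abstract: the prior is taken as p(Lambda) proportional to exp(-nu * KL(N(mu, Sigma0) || N(mu, Lambda^-1))), which works out to a Wishart with nu + d + 1 degrees of freedom and inverse scale nu * Sigma0. The function names, the variables `sigma0` and `nu`, the restriction to 2x2 matrices, and the sample data are all hypothetical choices made for illustration, not the paper's notation.

```python
# Assumed construction (a sketch, not taken verbatim from the paper):
#   p(Lambda) ∝ exp(-nu * KL(N(mu, Sigma0) || N(mu, Lambda^-1)))
#             ∝ |Lambda|^(nu/2) * exp(-(nu/2) * tr(Sigma0 Lambda)),
# which is a Wishart with dof = nu + d + 1 and inverse scale nu * Sigma0.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def scatter(xs, mu):
    """Sum of outer products (x - mu)(x - mu)^T over 2-D points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in xs:
        r = [x[0] - mu[0], x[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += r[i] * r[j]
    return s

def posterior_mode(xs, mu, sigma0, nu):
    """Mode of the Wishart posterior over the precision:
    (nu + N) * (nu * Sigma0 + S)^-1, with S the scatter matrix."""
    n = len(xs)
    s = scatter(xs, mu)
    a = [[nu * sigma0[i][j] + s[i][j] for j in range(2)] for i in range(2)]
    a_inv = inv2(a)
    return [[(nu + n) * a_inv[i][j] for j in range(2)] for i in range(2)]

def ml_precision(xs, mu):
    """Maximum likelihood precision for known mean: N * S^-1."""
    n = len(xs)
    s_inv = inv2(scatter(xs, mu))
    return [[n * s_inv[i][j] for j in range(2)] for i in range(2)]

def wishart_dof(nu, d=2):
    """Degrees of freedom implied by the assumed KL construction."""
    return nu + d + 1

if __name__ == "__main__":
    xs = [(1.0, 0.5), (-0.5, 1.5), (0.3, -1.2), (-1.1, -0.4)]  # made-up data
    mu = (0.0, 0.0)
    sigma0 = [[2.0, 0.0], [0.0, 0.5]]  # covariance at the prior mode
    print(posterior_mode(xs, mu, sigma0, 0.0))  # limit nu -> 0
    print(ml_precision(xs, mu))
    print(wishart_dof(0.0))  # stays above the Wishart bound d - 1
```

Under these assumptions, setting `nu` to zero leaves the degrees of freedom at d + 1, which still satisfies the Wishart requirement of exceeding d - 1, and the posterior mode coincides with the maximum likelihood precision. With no data and any positive `nu`, the mode is the inverse of `sigma0`, so `nu` acts only as a pseudocount weighting the prior mode against the data.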
