Variational inference is a technique for approximating intractable posterior distributions in order to quantify uncertainty in machine learning. Although a unimodal Gaussian distribution is usually chosen as the parametric distribution, it can hardly approximate multimodal posteriors. In this paper, we employ the Gaussian mixture distribution as the parametric distribution. The main difficulty of variational inference with a Gaussian mixture is how to approximate the entropy of the mixture, which has no closed form. We approximate the entropy of the Gaussian mixture by the sum of the entropies of its unimodal Gaussian components, each of which can be calculated analytically. In addition, we theoretically analyze the approximation error between the true entropy and the approximated one in order to reveal when our approximation works well. Specifically, the approximation error is controlled by the ratios of the distances between the component means to the sums of the component variances, and it converges to zero as these ratios go to infinity. This situation seems more likely to occur in higher-dimensional parameter spaces because of the curse of dimensionality. Our result therefore guarantees that the approximation works well, for example, in neural networks with a large number of weights.
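As a numerical illustration of the claim above, the sketch below compares a component-wise entropy approximation against a Monte Carlo estimate of the true mixture entropy as the component means move apart. Each component entropy is the analytic Gaussian entropy $\frac{1}{2}\log\bigl((2\pi e)^d \lvert\Sigma_k\rvert\bigr)$. Note two assumptions that are not taken from the abstract: the approximation here adds the mixing-weight entropy term $-\sum_k \pi_k \log \pi_k$ (without some such term the error for well-separated equal-weight components would plateau at $\log 2$ rather than vanish), and all function names are hypothetical; the paper's exact estimator may differ.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

def gaussian_entropy(cov):
    """Analytic entropy of N(mu, cov): 0.5 * log((2*pi*e)^d * det(cov))."""
    d = cov.shape[0]
    return 0.5 * (d * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])

def approx_mixture_entropy(weights, covs):
    """Weighted sum of component entropies, plus the mixing-weight entropy
    (the latter is an assumption here, so the error can vanish for
    well-separated components)."""
    component_part = sum(w * gaussian_entropy(c) for w, c in zip(weights, covs))
    mixing_part = -sum(w * np.log(w) for w in weights)
    return component_part + mixing_part

def mc_mixture_entropy(weights, means, covs, n=100_000):
    """Monte Carlo estimate of the true mixture entropy H = -E_q[log q(x)]."""
    ks = rng.choice(len(weights), size=n, p=weights)
    xs = np.empty((n, len(means[0])))
    for k, (m, c) in enumerate(zip(means, covs)):
        idx = ks == k
        xs[idx] = rng.multivariate_normal(m, c, size=idx.sum())
    density = sum(w * multivariate_normal(m, c).pdf(xs)
                  for w, m, c in zip(weights, means, covs))
    return -np.mean(np.log(density))

# Two-component mixture with unit covariances; vary the distance between means.
d = 2
weights = [0.5, 0.5]
covs = [np.eye(d), np.eye(d)]
for sep in [1.0, 3.0, 10.0]:  # ratio of mean distance to standard deviation
    means = [np.zeros(d), np.full(d, sep / np.sqrt(d))]
    h_true = mc_mixture_entropy(weights, means, covs)
    h_approx = approx_mixture_entropy(weights, covs)
    print(f"separation {sep:5.1f}: |H_true - H_approx| = {abs(h_true - h_approx):.4f}")
```

Running this shows the gap shrinking as the separation grows, consistent with the abstract's statement that the error is controlled by the ratio of the distance between the means to the variances.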