This paper introduces the $(\alpha, \Gamma)$-descent, an iterative algorithm that operates on measures and performs $\alpha$-divergence minimisation in a Bayesian framework. This gradient-based procedure extends the commonly used variational approximation by adding a prior on the variational parameters in the form of a measure. We prove that for a rich family of functions $\Gamma$, this algorithm leads at each step to a systematic decrease in the $\alpha$-divergence, and we derive convergence results. Our framework recovers the Entropic Mirror Descent algorithm and provides an alternative algorithm that we call the Power Descent. Moreover, in its stochastic formulation, the $(\alpha, \Gamma)$-descent allows the mixture weights of any given mixture model to be optimised without any information on the underlying distribution of the variational parameters. This makes our method compatible with many choices of parameter updates and applicable to a wide range of Machine Learning tasks. We demonstrate empirically, on both toy and real-world examples, the benefit of using the Power Descent and going beyond the Entropic Mirror Descent framework, which fails as the dimension grows.
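As a concrete illustration of the mixture-weight optimisation described above, the sketch below applies a $\Gamma$-based multiplicative reweighting to the weights of a Gaussian-kernel mixture. It is an assumed instantiation, not the paper's reference implementation: the names (`weight_update`, `log_p`, `gauss_kernel`), the convention chosen for $f_\alpha'$, the exponential and power-type choices of $\Gamma$, and the importance-sampling estimator of the per-component quantities $b_j$ are all assumptions made for the example.

```python
import numpy as np

def log_p(y):
    # Hypothetical target for illustration: a standard Gaussian log-density in d dimensions.
    d = y.shape[-1]
    return -0.5 * np.sum(y ** 2, axis=-1) - 0.5 * d * np.log(2.0 * np.pi)

def gauss_kernel(y, theta, sigma):
    # Isotropic Gaussian kernel density k(theta, y) with bandwidth sigma.
    d = y.shape[-1]
    sq = np.sum((y - theta) ** 2, axis=-1)
    return np.exp(-0.5 * sq / sigma ** 2) / (2.0 * np.pi * sigma ** 2) ** (d / 2.0)

def weight_update(weights, thetas, sigma, alpha, eta, kappa=0.0,
                  n_samples=500, gamma="power", rng=None):
    """One stochastic reweighting step: w_j <- w_j * Gamma(b_j + kappa), then renormalise."""
    rng = np.random.default_rng(rng)
    J, d = thetas.shape
    # Sample from the current mixture q(y) = sum_j w_j k(theta_j, y).
    comp = rng.choice(J, size=n_samples, p=weights)
    ys = thetas[comp] + sigma * rng.standard_normal((n_samples, d))
    K = np.stack([gauss_kernel(ys, th, sigma) for th in thetas])   # shape (J, M)
    q = weights @ K                                                # shape (M,)
    # f_alpha'(q/p), assuming the convention f_alpha(u) = (u^alpha - 1) / (alpha (alpha - 1)),
    # so that f_alpha'(u) = u^(alpha - 1) / (alpha - 1).
    ratio = q / np.exp(log_p(ys))
    fprime = ratio ** (alpha - 1.0) / (alpha - 1.0)
    # Importance-sampling estimate of b_j = int k(theta_j, y) f_alpha'(q(y)/p(y)) dy,
    # reusing the mixture samples with ratios k(theta_j, y) / q(y).
    b = np.mean(K / q * fprime, axis=1)                            # shape (J,)
    # Gamma transform: an exponential choice (mirror-descent style) or a power-type choice.
    if gamma == "exponential":
        g = np.exp(-eta * (b + kappa))
    else:
        g = ((alpha - 1.0) * (b + kappa) + 1.0) ** (eta / (1.0 - alpha))
    new_w = weights * g
    return new_w / new_w.sum()

# Toy usage: 10 fixed component locations in dimension 2, uniform initial weights.
rng = np.random.default_rng(0)
thetas = 3.0 * rng.standard_normal((10, 2))
w = np.full(10, 0.1)
for _ in range(20):
    w = weight_update(w, thetas, sigma=1.0, alpha=0.5, eta=0.3, rng=rng)
print(np.round(w, 3))
```

Switching `gamma` between `"exponential"` and `"power"` contrasts, on the same samples, the two kinds of update discussed in the abstract; the component locations `thetas` are kept fixed here precisely because the weight update requires no information on how they are produced.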