Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions. Of key interest in statistics and machine learning is Gaussian VI, which approximates $\pi$ by minimizing the Kullback-Leibler (KL) divergence to $\pi$ over the space of Gaussians. In this work, we develop the (Stochastic) Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI. Our approach exploits the composite structure of the KL divergence, which can be written as the sum of a smooth term (the potential) and a non-smooth term (the entropy) over the Bures-Wasserstein (BW) space of Gaussians endowed with the Wasserstein distance. For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave, as well as the first convergence guarantees to first-order stationary solutions when $\pi$ is only log-smooth.
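To make the composite structure referred to above concrete, here is a minimal sketch under the standard assumption that $\pi \propto e^{-V}$ for a smooth potential $V$; the symbols $\mathcal{E}$, $\mathcal{H}$, and the step size $h$ are illustrative notation, not taken from the abstract. The KL objective over the BW space decomposes as
$$
\mathrm{KL}(\mu \,\|\, \pi) \;=\; \underbrace{\int V \,\mathrm{d}\mu}_{\mathcal{E}(\mu)\ \text{(smooth potential term)}} \;+\; \underbrace{\int \mu \log \mu \,\mathrm{d}x}_{\mathcal{H}(\mu)\ \text{(non-smooth negative entropy)}} \;+\; \mathrm{const},
$$
and a forward-backward splitting step, restricted to the Gaussian family, alternates a Wasserstein gradient (forward) step on $\mathcal{E}$ with a proximal (backward) step on $\mathcal{H}$, schematically
$$
\mu_{k+1} \;=\; \operatorname{prox}^{\mathrm{BW}}_{h\mathcal{H}}\!\big( (\mathrm{id} - h \nabla V)_{\#}\, \mu_k \big).
$$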