Along with Markov chain Monte Carlo (MCMC) methods, variational inference (VI) has emerged as a central computational approach to large-scale Bayesian inference. Rather than sampling from the true posterior $\pi$, VI aims at producing a simple but effective approximation $\hat \pi$ to $\pi$ for which summary statistics are easy to compute. However, unlike the well-studied MCMC methodology, algorithmic guarantees for VI are still relatively poorly understood. In this work, we propose principled methods for VI, in which $\hat \pi$ is taken to be a Gaussian or a mixture of Gaussians, resting upon the theory of gradient flows on the Bures--Wasserstein space of Gaussian measures. Akin to MCMC, these methods come with strong theoretical guarantees when $\pi$ is log-concave.
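For concreteness, here is a sketch of one natural instantiation, under the assumption $\pi \propto e^{-V}$ with smooth potential $V$; the forward discretization and step size $h$ below are illustrative choices, not details fixed by the abstract. Writing $\hat \pi_k = \mathcal{N}(m_k, \Sigma_k)$, a discretized Bures--Wasserstein gradient step on $\mathrm{KL}(\hat \pi \,\|\, \pi)$ takes the form
\[
m_{k+1} = m_k - h\, \mathbb{E}_{\hat \pi_k}[\nabla V(X)], \qquad
\Sigma_{k+1} = (I - h M_k)\, \Sigma_k\, (I - h M_k),
\]
where $M_k := \mathbb{E}_{\hat \pi_k}[\nabla^2 V(X)] - \Sigma_k^{-1}$. The expectations may be estimated by Monte Carlo over samples from $\hat \pi_k$, and the symmetric form of the covariance update keeps $\Sigma_{k+1}$ positive semidefinite.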