Particle-based variational inference (VI) minimizes the KL divergence between the particle distribution and the target posterior using gradient flow estimates. Since the rise of Stein variational gradient descent (SVGD), particle-based VI algorithms have focused on functions in a Reproducing Kernel Hilbert Space (RKHS) to approximate the gradient flow. However, the RKHS requirement restricts the function class and limits algorithmic flexibility. This paper offers a general solution to this problem by introducing a functional regularization term that encompasses the RKHS norm as a special case. This allows us to propose a new particle-based VI algorithm called preconditioned functional gradient flow (PFG). Compared to SVGD, PFG has several advantages: a larger function class, improved scalability with large particle sizes, better adaptation to ill-conditioned distributions, and provable continuous-time convergence in KL divergence. Additionally, non-linear function classes such as neural networks can be incorporated to estimate the gradient flow. Our theory and experiments demonstrate the effectiveness of the proposed framework.
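For context, the SVGD baseline that the abstract contrasts against can be sketched in a few lines. This is a generic illustration with a fixed-bandwidth RBF kernel and a standard-Gaussian target, not the paper's PFG algorithm; the function names and hyperparameters are illustrative choices.

```python
import numpy as np

def svgd_step(x, grad_logp, step=0.1, h=1.0):
    """One SVGD update with an RBF kernel of bandwidth h.

    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad_logp(x_j)   # drift toward high density
                             + grad_{x_j} k(x_j, x_i) ]   # repulsion between particles
    """
    diff = x[:, None, :] - x[None, :, :]            # diff[i, j] = x_i - x_j
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    drift = K @ grad_logp(x)                        # kernel-weighted score term
    repulsion = (diff * K[:, :, None]).sum(axis=1) / h ** 2
    return x + step * (drift + repulsion) / x.shape[0]

# Target: standard 2-D Gaussian, so grad log p(x) = -x.
grad_logp = lambda x: -x
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=1.0, size=(200, 2))   # particles start far from the target
for _ in range(500):
    x = svgd_step(x, grad_logp)
```

After the loop the particles have drifted toward the target mean while the kernel repulsion keeps them spread out. The paper's point is that the update direction here is confined to an RKHS; PFG replaces it with a gradient-flow estimate from a richer, regularized function class (e.g., a neural network).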