Among dissimilarities between probability distributions, the Kernel Stein Discrepancy (KSD) has received much interest recently. We investigate the properties of its Wasserstein gradient flow to approximate a target probability distribution $\pi$ on $\mathbb{R}^d$, known up to a normalization constant. This leads to a straightforwardly implementable, deterministic score-based method to sample from $\pi$, named KSD Descent, which uses a set of particles to approximate $\pi$. Remarkably, owing to a tractable loss function, KSD Descent can leverage robust parameter-free optimization schemes such as L-BFGS; this contrasts with other popular particle-based schemes such as the Stein Variational Gradient Descent algorithm. We study the convergence properties of KSD Descent and demonstrate its practical relevance. However, we also highlight failure cases by showing that the algorithm can get stuck in spurious local minima.
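To make the method concrete, below is a minimal sketch of KSD Descent under stated assumptions: the target $\pi$ is a standard Gaussian (so its score $\nabla \log \pi$ is available in closed form), the base kernel is a Gaussian RBF, and particle positions are optimized with SciPy's L-BFGS. The function and variable names (`stein_kernel`, `ksd_loss`, `run_ksd_descent`) are illustrative, not the authors' API.

```python
# Sketch of KSD Descent: minimize the squared KSD of the particle cloud's
# empirical measure against pi, using L-BFGS on the particle positions.
# Assumes a standard Gaussian target and an RBF base kernel (hypothetical choices).
import jax
import jax.numpy as jnp
import numpy as np
from scipy.optimize import minimize

def score(x):
    # Score of a standard Gaussian target: grad log pi(x) = -x.
    return -x

def rbf(x, y, sigma=1.0):
    # Gaussian RBF base kernel between two points.
    return jnp.exp(-jnp.sum((x - y) ** 2) / (2 * sigma ** 2))

def stein_kernel(x, y):
    # Stein kernel k_pi built from the base kernel k and the score s:
    # k_pi(x,y) = s(x)^T s(y) k(x,y) + s(x)^T grad_y k + s(y)^T grad_x k
    #             + trace(grad_x grad_y k).
    k = rbf(x, y)
    gx = jax.grad(rbf, argnums=0)(x, y)
    gy = jax.grad(rbf, argnums=1)(x, y)
    trace = jnp.trace(jax.jacfwd(jax.grad(rbf, argnums=0), argnums=1)(x, y))
    return score(x) @ score(y) * k + score(x) @ gy + score(y) @ gx + trace

def ksd_loss(flat_particles, n, d):
    # Squared KSD of the empirical measure of n particles in R^d:
    # (1/n^2) sum_{i,j} k_pi(x_i, x_j).
    x = flat_particles.reshape(n, d)
    pairwise = jax.vmap(lambda a: jax.vmap(lambda b: stein_kernel(a, b))(x))(x)
    return jnp.sum(pairwise) / n ** 2

# Loss value and gradient w.r.t. particle positions, for L-BFGS.
loss_and_grad = jax.jit(jax.value_and_grad(ksd_loss), static_argnums=(1, 2))

def run_ksd_descent(x0):
    n, d = x0.shape
    def fun(z):
        val, g = loss_and_grad(jnp.asarray(z), n, d)
        return float(val), np.asarray(g, dtype=np.float64)
    res = minimize(fun, x0.ravel(), jac=True, method="L-BFGS-B")
    return res.x.reshape(n, d)

particles = run_ksd_descent(np.random.randn(50, 2))  # 50 particles in 2-D
```

Because the loss is an explicit double sum over particle pairs, its value and gradient are directly available, which is what allows a parameter-free quasi-Newton scheme such as L-BFGS to be used instead of hand-tuned step sizes.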