The Wasserstein Proximal Gradient Algorithm

Wasserstein gradient flows are continuous-time dynamics that define curves of steepest descent to minimize an objective function over the space of probability measures (i.e., the Wasserstein space). This objective is typically a divergence with respect to a fixed target distribution. In recent years, these continuous-time dynamics have been used to study the convergence of machine learning algorithms that aim to approximate a probability distribution. However, the discrete-time behavior of these algorithms might differ from the continuous-time dynamics. Besides, although discretized gradient flows have been proposed in the literature, little is known about their minimization power. In this work, we propose a Forward Backward (FB) discretization scheme that can tackle the case where the objective function is the sum of a smooth term and a nonsmooth geodesically convex term. Using techniques from convex optimization and optimal transport, we analyze the FB scheme as a minimization algorithm on the Wasserstein space. More precisely, we show under mild assumptions that the FB scheme has convergence guarantees similar to those of the proximal gradient algorithm in Euclidean spaces.
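
For readers unfamiliar with the Euclidean algorithm that the abstract uses as its point of comparison, the following is a minimal sketch of the proximal gradient (forward-backward) iteration in Euclidean space. It is not the Wasserstein scheme proposed in the paper. The toy objective (a quadratic plus an l1 penalty), the function names `proximal_gradient` and `soft_threshold`, and the values of `target`, `lam`, and the step size are all assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]


def soft_threshold(v: Vector, tau: float) -> Vector:
    """Proximal operator of tau * ||.||_1: shrink each coordinate toward 0."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def proximal_gradient(
    grad_f: Callable[[Vector], Vector],
    prox_g: Callable[[Vector, float], Vector],
    x0: Vector,
    step: float,
    n_iters: int,
) -> Vector:
    """Minimize f + g with f smooth and g nonsmooth but prox-friendly."""
    x = list(x0)
    for _ in range(n_iters):
        # Forward step: explicit gradient step on the smooth term f.
        y = [xi - step * gi for xi, gi in zip(x, grad_f(x))]
        # Backward step: implicit (proximal) step on the nonsmooth term g.
        x = prox_g(y, step)
    return x


# Hypothetical toy problem: f(x) = 0.5 * ||x - target||^2, g(x) = lam * ||x||_1.
target = [3.0, 0.5, -2.0]
lam = 1.0


def grad_f(x: Vector) -> Vector:
    return [xi - ti for xi, ti in zip(x, target)]


def prox_g(y: Vector, step: float) -> Vector:
    return soft_threshold(y, step * lam)


x_hat = proximal_gradient(grad_f, prox_g, [0.0, 0.0, 0.0], step=0.5, n_iters=100)
print(x_hat)  # approaches the closed-form minimizer soft_threshold(target, lam)
```

For this toy problem the minimizer is known in closed form (soft-thresholding of `target` at level `lam`), which makes it easy to check the iteration. The gradient of the smooth term is 1-Lipschitz here, so a step size of 0.5 stays within the usual range. In the Wasserstein setting described in the abstract, the iterate is a probability measure rather than a vector, and the two steps are replaced by their optimal-transport counterparts.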

