This paper approaches the unsupervised learning problem by minimizing the second-order Wasserstein loss (the $W_2$ loss) through a distribution-dependent ordinary differential equation (ODE), whose dynamics involve the Kantorovich potential associated with the true data distribution and a current estimate of it. A main result shows that the time-marginal laws of the ODE form a gradient flow for the $W_2$ loss, which converges exponentially to the true data distribution. An Euler scheme for the ODE is proposed and shown to recover the gradient flow for the $W_2$ loss in the limit. An algorithm is designed by following the scheme and applying persistent training, which fits naturally into our gradient-flow approach. In both low- and high-dimensional experiments, our algorithm outperforms Wasserstein generative adversarial networks when the level of persistent training is increased appropriately.
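To make the Euler scheme concrete, the following is a minimal one-dimensional sketch, not the algorithm of the paper. It assumes equally sized empirical samples, for which the optimal transport map for the quadratic cost is obtained by matching sorted particles to sorted data points, and it assumes the normalization in which the gradient of the Kantorovich potential is $x - T(x)$, with $T$ the optimal map from the current estimate to the data distribution. The function names `w2_1d`, `transport_targets_1d`, and `euler_flow_1d`, the step size, and the Gaussian samples are all hypothetical choices made for illustration; the neural-network generator and persistent training are omitted.

```python
import numpy as np


def w2_1d(x, y):
    """Empirical W_2 distance between two equally sized 1-D samples."""
    return float(np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2)))


def transport_targets_1d(x, y):
    """Image of each particle in x under the sorted (monotone) matching to y.

    Assumption: in 1-D with equal sample sizes, the optimal map for the
    quadratic cost sends the i-th smallest particle to the i-th smallest
    data point.
    """
    ranks = np.argsort(np.argsort(x))  # rank of each particle within x
    return np.sort(y)[ranks]


def euler_flow_1d(x0, y, step=0.1, n_steps=50):
    """Explicit Euler steps x <- x - step * (x - T(x)).

    Here x - T(x) plays the role of the gradient of the Kantorovich
    potential (assumed normalization, see the lead-in).
    """
    x = np.asarray(x0, dtype=float).copy()
    history = [w2_1d(x, y)]
    for _ in range(n_steps):
        grad_phi = x - transport_targets_1d(x, y)
        x = x - step * grad_phi
        history.append(w2_1d(x, y))
    return x, history


rng = np.random.default_rng(0)
data = rng.normal(3.0, 0.5, size=500)  # stand-in for the true data distribution
init = rng.normal(0.0, 1.0, size=500)  # initial estimate
particles, losses = euler_flow_1d(init, data)
```

In this simplified setting each step moves every particle a fraction `step` of the way to its matched data point, which preserves the ordering of the particles, so the recorded $W_2$ loss shrinks by the factor `1 - step` per iteration. This is a discrete analogue of the exponential convergence stated above, under the assumptions listed.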