Minimizing functionals over the space of probability distributions can be done with Wasserstein gradient flows. To solve them numerically, a possible approach is to rely on the Jordan-Kinderlehrer-Otto (JKO) scheme, which is analogous to the proximal scheme in Euclidean spaces. However, it requires solving a nested optimization problem at each iteration and is known for its computational challenges, especially in high dimension. To alleviate this, very recent works propose to approximate the JKO scheme by leveraging Brenier's theorem and using gradients of Input Convex Neural Networks to parameterize the density (JKO-ICNN). However, this method comes with a high computational cost and stability issues. Instead, this work proposes to use gradient flows in the space of probability measures endowed with the sliced-Wasserstein (SW) distance. We argue that this method is more flexible than JKO-ICNN, since SW enjoys a closed-form differentiable approximation. Thus, the density at each step can be parameterized by any generative model, which reduces the computational burden and makes the method tractable in higher dimensions.
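The "closed-form differentiable approximation" mentioned above refers to the fact that the 1D Wasserstein distance between empirical measures reduces to matching sorted samples, so SW can be estimated by averaging over random projections. The sketch below is illustrative only (not the paper's implementation): the function name, defaults, and the assumption of equal-size, equally-weighted samples are ours.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=None):
    """Monte Carlo estimate of the sliced-Wasserstein distance between
    two empirical distributions given as samples X, Y of shape (n, d).

    Assumes equal sample sizes and uniform weights; illustrative sketch.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions uniformly on the unit sphere S^{d-1}.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto each direction: shape (n_projections, n).
    X_proj = theta @ X.T
    Y_proj = theta @ Y.T
    # In 1D, optimal transport between equally weighted empirical
    # measures simply matches sorted samples (the closed form).
    X_sorted = np.sort(X_proj, axis=1)
    Y_sorted = np.sort(Y_proj, axis=1)
    return np.mean(np.abs(X_sorted - Y_sorted) ** p) ** (1.0 / p)
```

Because each step (projection, sorting via a permutation, and the p-th power) is differentiable almost everywhere, the same estimator written in an autodiff framework yields gradients with respect to the samples, which is what allows the density to be parameterized by an arbitrary generative model.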