Wasserstein GANs with Gradient Penalty (WGAN-GP) are a very popular method for training generative models to produce high-quality synthetic data. While WGAN-GP were initially developed to calculate the Wasserstein-1 distance between generated and real data, recent works (e.g. [23]) have provided empirical evidence that this does not occur, and have argued that WGAN-GP perform well not in spite of this issue, but because of it. In this paper we show for the first time that WGAN-GP compute the minimum of a different optimal transport problem, the so-called congested transport [7]. Congested transport determines the cost of moving one distribution to another under a transport model that penalizes congestion. For WGAN-GP, we find that the congestion penalty has a spatially varying component determined by the sampling strategy used in [12], which acts like a local speed limit, making congestion cost less in some regions than in others. This aspect of the congested transport problem is new: the congestion penalty turns out to be unbounded and to depend on the distributions being transported, so we provide the necessary mathematical proofs for this setting. One facet of our discovery is a formula connecting the gradient of solutions to the optimization problem in WGAN-GP to the time-averaged momentum of the optimal mass flow. This is in contrast to the gradient of Kantorovich potentials for the Wasserstein-1 distance, which is just the normalized direction of flow. Based on this and other considerations, we speculate on how our results explain the observed performance of WGAN-GP. Beyond applications to GANs, our theorems also point to the possibility of approximately solving large-scale congested transport problems using neural network techniques.
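For orientation, the optimization problem in WGAN-GP referred to above is commonly written in the form below. This is a sketch in the standard notation of the original WGAN-GP formulation, assumed here to be the work cited as [12]. The symbols $D$ (critic), $\mu$ (real data distribution), $\nu$ (generated data distribution), $\lambda$ (penalty weight), and $\sigma$ (sampling distribution for the penalty) are illustrative labels, not notation fixed by this abstract.

```latex
% Critic objective of WGAN-GP in its commonly stated form (assumed notation).
\min_{D} \;
  \mathbb{E}_{\tilde{x} \sim \nu}\bigl[ D(\tilde{x}) \bigr]
  - \mathbb{E}_{x \sim \mu}\bigl[ D(x) \bigr]
  + \lambda \, \mathbb{E}_{\hat{x} \sim \sigma}
      \Bigl[ \bigl( \lVert \nabla D(\hat{x}) \rVert_2 - 1 \bigr)^2 \Bigr]

% Sampling strategy: \sigma is the law of random interpolates
% between independent real and generated samples.
\hat{x} = t\,x + (1 - t)\,\tilde{x},
\qquad t \sim \mathrm{Uniform}[0, 1], \quad x \sim \mu, \quad \tilde{x} \sim \nu
```

Under this reading, $\sigma$ places mass only on line segments joining real and generated samples, so the gradient penalty is enforced unevenly across space. This is presumably the sampling strategy that the abstract identifies as the source of the spatially varying congestion penalty.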