This paper approaches the unsupervised learning problem by gradient descent in the space of probability density functions. Our main result shows that along the gradient flow induced by a distribution-dependent ordinary differential equation (ODE), the unknown data distribution emerges as the long-time limit of this flow of densities. That is, one can uncover the data distribution by simulating the distribution-dependent ODE. Intriguingly, we find that the simulation of the ODE is equivalent to the training of generative adversarial networks (GANs). The GAN framework, by definition a non-cooperative game between a generator and a discriminator, can therefore alternatively be viewed as a cooperative game between a navigator and a calibrator (collaborating to simulate the ODE). At the theoretical level, this new perspective simplifies the analysis of GANs and gives new insight into their performance. To construct a solution to the distribution-dependent ODE, we first show that the associated nonlinear Fokker-Planck equation has a unique weak solution, using the Crandall-Liggett theorem for differential equations in Banach spaces. From this solution to the Fokker-Planck equation, we construct a unique solution to the ODE, relying on Trevisan's superposition principle. The convergence of the induced gradient flow to the data distribution is obtained by analyzing the Fokker-Planck equation.
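The abstract does not specify the exact drift of the distribution-dependent ODE, so the following is only a minimal illustrative sketch of the general idea: particles ("generated samples") evolve under a drift that depends on their own current distribution, and the empirical distribution of the particles drifts toward the data distribution over time. The sketch assumes a one-dimensional Kullback-Leibler gradient-flow drift ∇ log(p_data/ρ_t), with Gaussian kernel density estimates standing in for the calibrator and a forward-Euler step for the navigator; the bandwidth, step size, and distributions are all illustrative assumptions, not the paper's construction.

```python
import numpy as np

def kde_log_density_grad(x, samples, bandwidth):
    """Gradient of the log of a Gaussian KDE, evaluated at the points x."""
    diffs = x[:, None] - samples[None, :]                      # (n_query, n_samples)
    weights = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    norm = bandwidth * np.sqrt(2.0 * np.pi)
    dens = weights.mean(axis=1) / norm                          # KDE density estimate
    grad_dens = (-(diffs / bandwidth**2) * weights).mean(axis=1) / norm
    return grad_dens / (dens + 1e-12)                           # d/dx log p(x)

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=2000)       # unknown data distribution (illustrative)
particles = rng.normal(loc=-3.0, scale=1.0, size=2000) # initial generated samples

dt, bandwidth = 0.1, 0.3
for _ in range(200):
    # "Calibrator": estimate the gradient of log(p_data / rho_t) at each particle via KDE.
    drift = (kde_log_density_grad(particles, data, bandwidth)
             - kde_log_density_grad(particles, particles, bandwidth))
    # "Navigator": move each particle along the distribution-dependent drift (forward Euler).
    particles = particles + dt * drift

print("particle mean/std:", particles.mean(), particles.std())
print("data     mean/std:", data.mean(), data.std())
```

Running the loop, the particle statistics approach those of the data, mirroring the abstract's claim that the data distribution is the long-time limit of the induced flow of densities; the actual paper works with a different, rigorously constructed ODE and proves convergence via the nonlinear Fokker-Planck equation.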