As an example of a nonlinear Fokker-Planck equation, the mean field Langevin dynamics has attracted attention due to its connection to (noisy) gradient descent on infinitely wide neural networks in the mean field regime; hence the convergence properties of the dynamics are of great theoretical interest. In this work, we give a simple and self-contained convergence rate analysis of the mean field Langevin dynamics with respect to the (regularized) objective function in both continuous- and discrete-time settings. The key ingredient of our proof is a proximal Gibbs distribution $p_q$ associated with the dynamics, which, in combination with techniques from [Vempala and Wibisono (2019)], allows us to develop a convergence theory parallel to classical results in convex optimization. Furthermore, we show that $p_q$ connects to the duality gap in the empirical risk minimization setting, which enables efficient empirical evaluation of the convergence of the algorithm.
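For concreteness, the following is a minimal sketch of the objects the abstract refers to, under the standard setup for this line of work; the objective $F$, loss functional $L$, regularization strength $\lambda$, and first variation $\delta L/\delta q$ below are assumed notation, not taken from the abstract itself.

```latex
% Setting (assumed): minimize an entropy-regularized objective over
% probability densities q, where L is a functional of q and \lambda > 0.
\[
  F(q) \;=\; L(q) \;+\; \lambda\,\mathrm{Ent}(q),
  \qquad
  \mathrm{Ent}(q) = \int q(x)\log q(x)\,dx .
\]
% Mean field Langevin dynamics: the particle X_t is driven by the first
% variation of L evaluated at its own law q_t = Law(X_t); W_t is a
% standard Brownian motion. The law q_t solves a nonlinear
% Fokker-Planck equation because the drift depends on q_t itself.
\[
  dX_t \;=\; -\,\nabla \frac{\delta L}{\delta q}(q_t)(X_t)\,dt
  \;+\; \sqrt{2\lambda}\,dW_t .
\]
% Proximal Gibbs distribution associated with the current law q:
\[
  p_q(x) \;\propto\; \exp\!\Bigl(-\tfrac{1}{\lambda}\,
  \frac{\delta L}{\delta q}(q)(x)\Bigr).
\]
```

Under this setup, $q$ is a stationary point of the dynamics exactly when $q = p_q$, which is why quantities measuring the discrepancy between $q$ and $p_q$ can certify convergence.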