The convergence rates of convex and non-convex optimization methods depend on the choice of a host of constants, including step sizes, Lyapunov function constants, and momentum constants. In this work we propose the use of factorial powers as a flexible tool for defining constants that appear in convergence proofs. We list a number of remarkable properties that these sequences enjoy, and show how they can be applied to convergence proofs to simplify or improve the convergence rates of the momentum method, the accelerated gradient method, and the stochastic variance reduced gradient method (SVRG).
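As context, the falling factorial power is standardly defined as $x^{\underline{k}} = x(x-1)\cdots(x-k+1)$. A minimal sketch of this quantity, assuming the paper's factorial powers follow the standard falling-factorial convention:

```python
def falling_factorial(x, k):
    """Falling factorial power: x * (x-1) * ... * (x-k+1).

    For k = 0 the empty product gives 1. Note this is an
    illustrative helper, not taken from the paper itself.
    """
    result = 1
    for i in range(k):
        result *= (x - i)
    return result


# Example: 5 * 4 * 3 = 60
print(falling_factorial(5, 3))
```

Sequences built from such products telescope cleanly under summation, which is the kind of algebraic property that makes them convenient for choosing step-size and averaging constants in convergence arguments.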