Recently, many machine learning optimizers have been analysed by viewing them as the asymptotic limit of differential equations as the step size goes to zero. In other words, the optimizers can be seen as finite difference schemes applied to a continuous dynamical system. However, most results in the literature concern constant step size algorithms. The main aim of this paper is to investigate the guarantees of their adaptive step size counterparts. Indeed, this dynamical point of view can be used to design step size update rules by choosing a discretization of the continuous equation that preserves its most relevant features. In this work, we analyse this class of adaptive optimizers and prove their Lyapunov stability and convergence properties for any choice of hyperparameters. To the best of our knowledge, this paper is the first to use continuous selection theory from general topology to overcome some of the intrinsic difficulties caused by non-constant and non-regular step size policies. The general framework we develop yields many new results on adaptive and constant step size Momentum/Heavy-Ball and p-GD algorithms.
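As a standard illustration of this viewpoint (the notation $f$, $h$, $a$ is ours, not the paper's), gradient descent with step size $h$ is the explicit Euler discretization of the gradient flow $\dot{x}(t) = -\nabla f(x(t))$, namely $x_{k+1} = x_k - h\,\nabla f(x_k)$. Similarly, the Heavy-Ball method can be recovered from a finite difference discretization of the damped second-order equation
\[
\ddot{x}(t) + a\,\dot{x}(t) + \nabla f(x(t)) = 0,
\]
since replacing $\ddot{x}$ by $(x_{k+1} - 2x_k + x_{k-1})/h^2$ and $\dot{x}$ by $(x_k - x_{k-1})/h$ gives
\[
x_{k+1} = x_k + (1 - a h)\,(x_k - x_{k-1}) - h^2\,\nabla f(x_k),
\]
i.e. a momentum update with momentum coefficient $\beta = 1 - a h$ and learning rate $\gamma = h^2$. Allowing $h$ to vary with $k$ yields the adaptive step size schemes studied here.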