Iterative gradient-based algorithms are increasingly used to train a broad variety of machine learning models, including large neural networks. In particular, momentum-based methods have received considerable attention due to their provable guarantees of accelerated learning in certain classes of problems, and multiple such algorithms have been derived. However, these guarantees hold only for constant regressors. When time-varying regressors occur, which is commonplace in dynamic systems, many of these momentum-based methods cannot guarantee stability. Recently, a new High-order Tuner (HT) was developed and shown to have 1) stability and asymptotic convergence for time-varying regressors and 2) non-asymptotic accelerated learning guarantees for constant regressors. These results were derived for a linear regression framework, which produces a quadratic loss function. In this paper, we extend and discuss the results of this same HT for general convex loss functions. By exploiting the definitions of convexity and smoothness, we establish similar stability and asymptotic convergence guarantees. Additionally, we conjecture that the HT has an accelerated convergence rate. Finally, we provide numerical simulations that support both the satisfactory behavior of the HT algorithm and the conjecture of accelerated learning.
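To illustrate the accelerated learning that momentum-based methods offer on constant regressors (the setting in which their guarantees hold), the sketch below compares plain gradient descent against Nesterov's accelerated gradient method, the canonical momentum method, on an ill-conditioned quadratic loss. This is a generic illustration and not the HT recursion itself; the function names and problem parameters here are chosen for the example.

```python
import numpy as np

def grad(A, b, x):
    """Gradient of the quadratic loss f(x) = 0.5 x^T A x - b^T x."""
    return A @ x - b

def gradient_descent(A, b, x0, L, iters):
    """Plain gradient descent with the standard step size 1/L."""
    x = x0.copy()
    for _ in range(iters):
        x = x - (1.0 / L) * grad(A, b, x)
    return x

def nesterov(A, b, x0, L, mu, iters):
    """Nesterov's method for a mu-strongly-convex, L-smooth loss."""
    kappa = L / mu
    beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)  # momentum coefficient
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        y = x + beta * (x - x_prev)          # extrapolation (momentum) step
        x_prev, x = x, y - (1.0 / L) * grad(A, b, y)  # gradient step at y
    return x

# Ill-conditioned quadratic: condition number kappa = L/mu = 100.
A = np.diag([1.0, 100.0])
b = np.zeros(2)
x0 = np.ones(2)
f = lambda x: 0.5 * x @ A @ x - b @ x  # minimum value is 0 at x = 0

x_gd = gradient_descent(A, b, x0, L=100.0, iters=200)
x_nag = nesterov(A, b, x0, L=100.0, mu=1.0, iters=200)
```

After the same 200 iterations, the momentum iterate is far closer to the optimum: gradient descent contracts at rate 1 - 1/kappa per step, while Nesterov's method contracts at roughly 1 - 1/sqrt(kappa), which is the accelerated rate referenced above. For time-varying regressors (A changing over time), this vanilla momentum scheme loses its stability guarantees, which is the gap the HT addresses.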