We present a new class of adaptive stochastic optimization algorithms, which overcomes many of the known shortcomings of popular adaptive optimizers that are currently used for the fine-tuning of artificial neural networks (ANNs). Its underpinning theory relies on advances in Euler's polygonal approximations for stochastic differential equations (SDEs) with monotone coefficients. As a result, it inherits the stability properties of tamed algorithms, while also addressing other known issues, e.g. vanishing gradients in ANNs. In particular, we provide a nonasymptotic analysis and full theoretical guarantees for the convergence properties of an algorithm of this novel class, which we named TH$\varepsilon$O POULA (or, simply, TheoPouLa). Finally, several experiments are presented with different types of ANNs, which show the superior performance of TheoPouLa over many popular adaptive optimization algorithms.