Accelerated gradient methods are a powerful optimization tool in machine learning and statistics, but their development has traditionally been driven by heuristic arguments. Recent research, however, has demonstrated that these methods can be derived as discretizations of dynamical systems, which in turn has provided a basis for more systematic investigations, especially into the structure of those dynamical systems and their structure-preserving discretizations. In this work we introduce dynamical systems defined through contact geometry which are not only naturally suited to the optimization task but also subsume all previous methods based on geometric dynamical systems. These contact dynamical systems also admit a natural, robust discretization through geometric contact integrators. We illustrate these features in paradigmatic examples, showing that we can indeed obtain optimization algorithms that achieve oracle lower bounds on convergence rates while also improving on previous proposals in terms of stability.
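As a minimal illustration of the discretization viewpoint (not of the contact-geometric construction introduced in this work), plain gradient descent can be recovered as the explicit Euler discretization of the gradient flow x'(t) = -∇f(x). The sketch below assumes a hypothetical toy objective f(x) = x²/2:

```python
def grad_f(x):
    # Assumed toy objective f(x) = 0.5 * x**2, so f'(x) = x.
    return x

def euler_gradient_flow(x0, h=0.1, steps=100):
    """Explicit Euler discretization of the gradient flow x' = -f'(x).

    Each Euler step x_{k+1} = x_k - h * f'(x_k) is exactly one step
    of gradient descent with learning rate h.
    """
    x = x0
    for _ in range(steps):
        x = x - h * grad_f(x)
    return x

x_final = euler_gradient_flow(1.0)  # converges toward the minimizer x* = 0
```

Accelerated methods arise analogously from second-order dynamics; the choice of discretization scheme is what structure-preserving (here, contact) integrators are designed to make principled.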