In this book chapter, we briefly describe the main components of the gradient descent method and its accelerated and stochastic variants. We aim to explain these components from a mathematical point of view, covering both theoretical and practical aspects, at an elementary level. We first focus on basic variants of the gradient descent method and then extend our view to recent variants, especially variance-reduced stochastic gradient descent (SGD) schemes. Our approach relies on revealing the structures present in the problem and the assumptions imposed on the objective function. Our convergence analysis unifies several known results and relies on a general but elementary recursive expression. We illustrate this analysis on several common schemes.