We study the widely known Cubic-Newton method in the stochastic setting and propose a general framework for using variance reduction, which we call the helper framework. In all previous work, these methods were proposed with very large batches (in both gradients and Hessians) and under various, often strong, assumptions. In this work, we investigate the possibility of using such methods without large batches, relying on very simple assumptions that suffice for all our methods to work. In addition, we study these methods applied to gradient-dominated functions. In the general case, we show improved convergence (compared to first-order methods) to an approximate local minimum, and for gradient-dominated functions, we show convergence to an approximate global minimum.
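For context, a minimal sketch of one deterministic cubic-regularized Newton step (full gradient and Hessian, not the stochastic helper framework of this paper) is shown below. The step minimizes the cubic model g^T s + (1/2) s^T H s + (M/6)||s||^3; solving the subproblem by bisection on the step radius is one standard choice, and the regularization constant M is an assumed input.

```python
import numpy as np

def cubic_newton_step(g, H, M, tol=1e-10, max_iter=200):
    """One cubic-regularized Newton step:
        s = argmin_s  g^T s + 0.5 s^T H s + (M/6) * ||s||^3.
    The minimizer satisfies (H + (M*r/2) I) s = -g with r = ||s||,
    so we bisect on r until ||s(r)|| = r. Assumes H + (M*r/2) I is
    positive definite on the bracketed interval (true for M large enough)."""
    n = len(g)

    def step_for(r):
        # Regularized Newton system at trial radius r.
        return np.linalg.solve(H + 0.5 * M * r * np.eye(n), -g)

    # Bracket the fixed point: phi(r) = ||s(r)|| - r is decreasing in r.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(step_for(hi)) > hi:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(step_for(mid)) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return step_for(hi)

# Usage sketch: minimize a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = np.zeros(2)
for _ in range(30):
    x = x + cubic_newton_step(A @ x - b, A, M=1.0)  # gradient is A x - b, Hessian is A
```

On this quadratic the iterates contract toward the exact minimizer `np.linalg.solve(A, b)`; in the stochastic setting studied here, the exact gradient and Hessian would be replaced by variance-reduced estimates.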