In this work, we study variance reduction applied to adaptive stochastic mirror descent algorithms for nonsmooth nonconvex finite-sum optimization problems. We propose a simple yet general variance-reduced adaptive mirror descent algorithm, named SVRAMD, and provide its convergence analysis in different settings. We prove that variance reduction lowers the SFO (stochastic first-order oracle) complexity of most adaptive mirror descent algorithms and thus accelerates their convergence. In particular, our general theory implies that variance reduction can be applied to algorithms with time-varying step sizes and to self-adaptive algorithms such as AdaGrad and RMSProp. Moreover, the convergence rates of SVRAMD recover the best existing rates of non-adaptive variance-reduced mirror descent algorithms. We validate our claims with deep learning experiments.
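As a rough illustration of the kind of variance-reduced update the abstract refers to (not the paper's SVRAMD pseudocode), the sketch below shows an SVRG-style variance-reduced estimator plugged into a mirror descent step with the Euclidean mirror map; the function names (`grads`, `svr_mirror_descent`) and hyperparameters are assumptions for illustration, and the adaptive mirror maps (e.g., AdaGrad-style rescaling) studied in the paper are omitted.

```python
import numpy as np

def svr_mirror_descent(grads, x0, lr=0.1, epochs=10, inner_steps=50, rng=None):
    """Illustrative SVRG-style variance-reduced mirror descent (Euclidean mirror map).

    grads: list of per-sample gradient functions g_i(x) for f = (1/n) * sum_i f_i.
    This is a sketch under simplifying assumptions, not the paper's SVRAMD algorithm.
    """
    rng = rng or np.random.default_rng(0)
    n = len(grads)
    x = x0.copy()
    for _ in range(epochs):
        x_snap = x.copy()
        # full gradient at the snapshot point (computed once per epoch)
        full_grad = sum(g(x_snap) for g in grads) / n
        for _ in range(inner_steps):
            i = rng.integers(n)
            # variance-reduced gradient estimator: unbiased, with variance that
            # shrinks as x approaches the snapshot
            v = grads[i](x) - grads[i](x_snap) + full_grad
            # with the Euclidean mirror map the mirror step reduces to a gradient step;
            # an adaptive mirror map (e.g., AdaGrad's) would rescale v coordinate-wise here
            x = x - lr * v
    return x
```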