In this work, we investigate variance reduction by studying its properties when combined with general adaptive mirror descent algorithms in nonsmooth nonconvex finite-sum optimization problems. We propose a simple yet general framework for variance reduced adaptive mirror descent algorithms, named SVRAMD, and provide its convergence analysis in both the nonsmooth nonconvex setting and the Polyak-Łojasiewicz (P-L) conditioned setting. We prove that variance reduction reduces the stochastic first-order oracle (SFO) complexity of adaptive mirror descent algorithms and thus accelerates their convergence. In particular, our general theory implies that variance reduction can be applied to algorithms with time-varying step sizes as well as to self-adaptive algorithms such as AdaGrad and RMSProp. Moreover, the convergence rates of SVRAMD match the best known rates of non-adaptive variance reduced mirror descent algorithms without requiring complicated algorithmic components. Extensive deep learning experiments validate our theoretical findings.
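To make the idea concrete, the following is a minimal sketch, not the paper's implementation, of how an SVRG-style variance-reduced gradient estimator can be combined with an RMSProp-style adaptive update, one member of the adaptive mirror descent family that SVRAMD covers. All function and parameter names here (e.g., `svramd_rmsprop_sketch`, `grad_fn`) are illustrative assumptions rather than the paper's API.

```python
import numpy as np

def svramd_rmsprop_sketch(grad_fn, x0, data, n_epochs=10, batch_size=32,
                          lr=0.01, beta=0.9, eps=1e-8, rng=None):
    """Sketch: SVRG-style variance reduction + RMSProp-style adaptive scaling.

    grad_fn(x, samples) is assumed to return the average gradient of the
    finite-sum objective over the given samples at point x.
    """
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    v = np.zeros_like(x)                      # second-moment accumulator (RMSProp-style)
    n = len(data)
    for _ in range(n_epochs):
        x_snap = x.copy()                     # snapshot point for this epoch
        full_grad = grad_fn(x_snap, data)     # full gradient at the snapshot
        for _ in range(max(1, n // batch_size)):
            idx = rng.choice(n, size=batch_size, replace=False)
            batch = [data[i] for i in idx]
            # SVRG-style variance-reduced gradient estimator
            g = grad_fn(x, batch) - grad_fn(x_snap, batch) + full_grad
            # RMSProp-style adaptive scaling, one choice of adaptive proximal term
            v = beta * v + (1.0 - beta) * g**2
            x = x - lr * g / (np.sqrt(v) + eps)
    return x
```

Swapping the adaptive scaling (the update of `v` and the scaled step) for an AdaGrad-style accumulator or a time-varying step size yields other instances of the same variance-reduced adaptive scheme analyzed in the paper.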