Distributed stochastic inertial methods with delayed derivatives

Stochastic gradient methods (SGMs) are the predominant approaches for solving stochastic optimization problems. On smooth nonconvex problems, a few acceleration techniques have been applied to improve the convergence rate of SGMs. However, little work has explored applying acceleration techniques to a stochastic subgradient method (SsGM) for nonsmooth nonconvex problems. In addition, few efforts have been made to analyze an (accelerated) SsGM with delayed derivatives. Information delay arises naturally in a distributed system in which computing workers do not coordinate with each other. In this paper, we propose an inertial proximal SsGM for solving nonsmooth nonconvex stochastic optimization problems. The proposed method has guaranteed convergence even with delayed derivative information in a distributed environment. Convergence rate results are established for three classes of nonconvex problems: weakly convex nonsmooth problems with a convex regularizer, composite nonconvex problems with a nonsmooth convex regularizer, and smooth nonconvex problems. For each problem class, the convergence rate is $O(1/K^{\frac{1}{2}})$ in the expected value of the squared gradient norm, for $K$ iterations. In a distributed environment, the information delay slows down the convergence of the proposed method. Nevertheless, the slow-down effect decays with the number of iterations for the latter two problem classes. We test the proposed method on three applications. The numerical results clearly demonstrate the advantages of inertial-based acceleration. Furthermore, we observe higher parallelization speed-up with asynchronous updates than with their synchronous counterpart, even though the former use delayed derivatives.
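The abstract does not state the update rule, so the following is only a minimal sketch of what an inertial proximal stochastic (sub)gradient step with a delayed derivative might look like. It assumes a generic heavy-ball form, $x_{k+1} = \mathrm{prox}_{\alpha r}\big(x_k - \alpha g_{k-\tau} + \beta (x_k - x_{k-1})\big)$, where $g_{k-\tau}$ is a stochastic gradient evaluated at an iterate that is $\tau$ steps old. The test problem (least squares with an $\ell_1$ regularizer), the fixed delay, and all names and parameter values (`inertial_prox_sgm`, `alpha`, `beta`, `delay`, `lam`) are illustrative assumptions, not the paper's algorithm or settings.

```python
import random
from collections import deque


def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, applied coordinate-wise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def inertial_prox_sgm(samples, dim, lam=0.01, alpha=0.05, beta=0.5,
                      delay=3, iters=2000, seed=0):
    """Hypothetical sketch: heavy-ball proximal SGM with a stale gradient.

    samples: list of (a, b) pairs defining the loss 0.5 * (a.x - b)^2.
    delay:   fixed staleness (in iterations) of the point where the
             gradient is evaluated; a real asynchronous system would
             have a varying delay (assumption made for simplicity).
    """
    rng = random.Random(seed)
    x_prev = [0.0] * dim
    x = [0.0] * dim
    # Buffer of the most recent iterates, so a stale one can be read back.
    history = deque([x[:]], maxlen=delay + 1)
    for _ in range(iters):
        stale = history[0]  # iterate from up to `delay` steps ago
        a, b = rng.choice(samples)
        resid = sum(ai * si for ai, si in zip(a, stale)) - b
        grad = [resid * ai for ai in a]  # stochastic gradient at the stale point
        # Gradient step plus inertial (momentum) term, then the proximal map.
        y = [xi - alpha * gi + beta * (xi - xpi)
             for xi, gi, xpi in zip(x, grad, x_prev)]
        x_prev, x = x, soft_threshold(y, alpha * lam)
        history.append(x[:])
    return x
```

Calling `inertial_prox_sgm(samples, dim=3)` on noiseless synthetic data should return a point close to the generating vector, up to the small shrinkage introduced by the $\ell_1$ term. Setting `delay=0` gives the synchronous variant and `beta=0` removes the inertial term, which makes it easy to compare the effect of each ingredient on a toy problem.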
