We develop a general framework that unifies several gradient-based stochastic optimization methods for empirical risk minimization problems in both centralized and distributed scenarios. The framework hinges on the introduction of an augmented graph whose nodes model the samples and whose edges model both inter-device communication and intra-device stochastic gradient computation. By properly designing the topology of the augmented graph, we recover as special cases the renowned Local-SGD and DSGD algorithms, and provide a unified perspective on variance-reduction (VR) and gradient-tracking (GT) methods such as SAGA, Local-SVRG, and GT-SAGA. We also provide a unified convergence analysis for smooth and (strongly) convex objectives based on a properly structured Lyapunov function, and the resulting rate recovers the best known results for many existing algorithms. The rate results further reveal that VR and GT methods can effectively eliminate data heterogeneity within and across devices, respectively, enabling exact convergence of the algorithm to the optimal solution. Numerical experiments confirm the findings in this paper.