This paper proposes a distributed stochastic algorithm with variance reduction for general smooth non-convex finite-sum optimization, which has wide applications in the signal processing and machine learning communities. In the distributed setting, a large number of samples is allocated to multiple agents in a network. Each agent computes local stochastic gradients and communicates with its neighbors to seek the global optimum. In this paper, we develop a modified variance reduction technique to deal with the variance introduced by stochastic gradients. Combining gradient tracking and variance reduction techniques, this paper proposes a distributed stochastic algorithm, GT-VR, to solve large-scale non-convex finite-sum optimization over multi-agent networks. A complete and rigorous proof shows that the GT-VR algorithm converges to first-order stationary points with an $\mathcal{O}(\frac{1}{k})$ convergence rate. In addition, we provide a complexity analysis of the proposed algorithm. Compared with some existing first-order methods, the proposed algorithm achieves a lower gradient complexity of $\mathcal{O}(PM\epsilon^{-1})$ under mild conditions. By comparing GT-VR with state-of-the-art algorithms in experimental simulations, we verify the efficiency of the proposed algorithm.
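To make the combination of gradient tracking and variance reduction concrete, the following is a minimal sketch of one iteration in this style, not the exact GT-VR recursion from the paper. It assumes a doubly stochastic mixing matrix $W$ over a ring network, an SVRG-style local gradient estimator, and synthetic least-squares local objectives; all variable names and the snapshot schedule are illustrative choices.

```python
# Illustrative sketch of a gradient-tracking + variance-reduction update
# over a multi-agent network (not the paper's exact GT-VR algorithm).
import numpy as np

rng = np.random.default_rng(0)

P, M, d = 5, 50, 10             # agents, samples per agent, dimension
alpha = 0.01                    # step size (assumed constant)
A = rng.normal(size=(P, M, d))  # local data matrices (synthetic)
b = rng.normal(size=(P, M))     # local targets (synthetic)

# Ring-topology doubly stochastic mixing matrix.
W = np.zeros((P, P))
for i in range(P):
    W[i, i] = W[i, (i - 1) % P] = W[i, (i + 1) % P] = 1 / 3

def local_grad(i, x, idx):
    """Stochastic gradient of agent i's least-squares loss on samples idx."""
    Ai, bi = A[i, idx], b[i, idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x = np.zeros((P, d))                                     # local iterates
snap = x.copy()                                          # SVRG snapshots
full = np.stack([local_grad(i, snap[i], np.arange(M)) for i in range(P)])
v = full.copy()                                          # variance-reduced estimates
y = v.copy()                                             # gradient trackers

for k in range(200):
    x_new = W @ x - alpha * y                            # consensus + descent step
    if k % 20 == 0:                                      # refresh SVRG snapshots
        snap = x_new.copy()
        full = np.stack([local_grad(i, snap[i], np.arange(M)) for i in range(P)])
    v_new = np.empty_like(v)
    for i in range(P):
        idx = rng.integers(M, size=1)                    # sampled mini-batch
        # SVRG-style variance-reduced estimate of the local gradient
        v_new[i] = local_grad(i, x_new[i], idx) - local_grad(i, snap[i], idx) + full[i]
    y = W @ y + v_new - v                                # gradient-tracking recursion
    x, v = x_new, v_new

print("consensus error:", np.linalg.norm(x - x.mean(0)))
```

In this sketch each agent mixes its iterate and tracker with its neighbors via $W$, descends along the tracked direction, and corrects the stochastic gradient with a snapshot-based control variate; the paper's algorithm section specifies the actual GT-VR estimator and step-size conditions.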