Emerging applications in multi-agent environments, such as the internet-of-things, networked sensing, autonomous systems, and federated learning, call for decentralized algorithms for finite-sum optimization that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS), for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points, while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that DESTRESS improves upon prior decentralized algorithms in resource efficiency over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas: stochastic recursive gradient updates with mini-batches for local computation, gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, and careful choices of hyper-parameters together with new analysis frameworks, to provably achieve a desirable computation-communication trade-off.
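To make these two ingredients concrete, a minimal sketch of the per-iteration updates is given below, under notation assumed here rather than fixed by the abstract: $\mathbf{x}_t$ denotes the stacked local iterates, $W$ the network mixing matrix, $K$ the number of gossiping rounds per communication step, $\eta$ the step size, and $\nabla f_{\mathcal{B}_t}$ a mini-batch stochastic gradient. A simplified form of the recursive gradient estimator and the tracked search direction is
\begin{align*}
\mathbf{v}_t &= \nabla f_{\mathcal{B}_t}(\mathbf{x}_t) - \nabla f_{\mathcal{B}_t}(\mathbf{x}_{t-1}) + \mathbf{v}_{t-1}, && \text{(stochastic recursive gradient update)}\\
\mathbf{s}_t &= W^K\big(\mathbf{s}_{t-1} + \mathbf{v}_t - \mathbf{v}_{t-1}\big), && \text{(gradient tracking with extra mixing)}\\
\mathbf{x}_{t+1} &= W^K\big(\mathbf{x}_t - \eta\,\mathbf{s}_t\big), && \text{(mixed local descent step)}
\end{align*}
where applying $W^K$ corresponds to $K$ rounds of gossip with neighbors. This sketch only illustrates how the two components interact; the actual DESTRESS recursion differs in details such as its inner-outer loop structure and hyper-parameter choices.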