Emerging applications in multi-agent environments such as the internet-of-things, networked sensing, autonomous systems, and federated learning call for decentralized algorithms for finite-sum optimization that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS), for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiency of DESTRESS improves upon that of prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas, including randomly activated mini-batch stochastic recursive gradient updates for local computation and gradient tracking with extra mixing (i.e., multiple gossip rounds) for per-iteration communication, together with careful choices of hyper-parameters and new analysis frameworks, to provably achieve a desirable computation-communication trade-off.
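To make the two main ingredients concrete, here is a minimal numerical sketch (not the authors' implementation) of how a SARAH-style recursive mini-batch gradient estimator can be combined with gradient tracking and extra mixing, i.e., applying the mixing matrix W several times per iteration. The ring topology, Metropolis weights, least-squares losses, and all names (`gossip`, `local_grad`, the step size `eta`, the number of gossip rounds `K`) are illustrative assumptions; the random activation of updates and the paper's specific hyper-parameter choices and restart schedule are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem: n agents, each holding m local least-squares samples in d dims.
n, m, d = 8, 50, 5
A = rng.normal(size=(n, m, d))   # local design matrices
b = rng.normal(size=(n, m))      # local targets

def local_grad(i, x, idx):
    """Mini-batch gradient of agent i's local least-squares loss at x."""
    Ai, bi = A[i, idx], b[i, idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

# Ring topology with Metropolis weights: a doubly stochastic mixing matrix W.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1 / 3

def gossip(X, K):
    """Extra mixing: K gossip rounds, i.e., left-multiply by W^K."""
    for _ in range(K):
        X = W @ X
    return X

eta, K, batch, T = 0.05, 3, 8, 200
X = np.zeros((n, d))  # one iterate per agent (rows)
V = np.stack([local_grad(i, X[i], np.arange(m)) for i in range(n)])  # full local gradients
S = gossip(V, K)      # gradient-tracking variable

for t in range(T):
    # Gradient-tracking descent step, followed by K rounds of gossip mixing.
    X_new = gossip(X - eta * S, K)
    # SARAH-style recursive mini-batch update of each agent's gradient estimator:
    # v_{t+1} = grad(x_{t+1}; batch) - grad(x_t; batch) + v_t.
    V_new = np.empty_like(V)
    for i in range(n):
        idx = rng.choice(m, size=batch, replace=False)
        V_new[i] = local_grad(i, X_new[i], idx) - local_grad(i, X[i], idx) + V[i]
    # Track the average gradient estimator across the network.
    S = gossip(S + V_new - V, K)
    X, V = X_new, V_new

print("consensus error:", np.linalg.norm(X - X.mean(0)))
```

In this sketch, the doubly stochastic mixing preserves the network-wide average of the tracking variable S, so each agent's S[i] follows the average of the local estimators, while increasing K trades extra communication per iteration for faster consensus.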