We consider the problem of decentralized deep learning where multiple agents collaborate to learn from a distributed dataset. While there exist several decentralized deep learning approaches, the majority consider a central parameter-server topology for aggregating the model parameters from the agents. However, such a topology may be inapplicable in networked systems such as ad-hoc mobile networks, field robotics, and power network systems where direct communication with the central parameter server may be inefficient. In this context, we propose and analyze a novel decentralized deep learning algorithm where the agents interact over a fixed communication topology (without a central server). Our algorithm is based on the heavy-ball acceleration method used in gradient-based optimization. We propose a novel consensus protocol where each agent shares with its neighbors its model parameters as well as gradient-momentum values during the optimization process. We consider both strongly convex and non-convex objective functions and theoretically analyze our algorithm's performance. We present several empirical comparisons with competing decentralized learning methods to demonstrate the efficacy of our approach under different communication topologies.
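The abstract describes the update only informally; the sketch below illustrates one plausible form of a consensus-based heavy-ball step in which each agent mixes both parameters and momentum received from its neighbors before taking a local gradient step. The mixing matrix `W`, step size `alpha`, momentum coefficient `beta`, and the function name are illustrative assumptions and do not reproduce the paper's exact update.

```python
import numpy as np

# Hypothetical sketch of one synchronous round of a consensus-based
# heavy-ball update. W is a doubly stochastic mixing matrix encoding the
# fixed communication topology (W[i, j] > 0 only if j is a neighbor of i);
# x[i] and v[i] are agent i's model parameters and momentum vectors.
def decentralized_heavy_ball_step(x, v, W, grads, alpha=0.1, beta=0.9):
    """Each agent averages neighbors' parameters and momentum (consensus),
    then applies a local heavy-ball step with its own stochastic gradient."""
    n = len(x)
    x_new, v_new = [None] * n, [None] * n
    for i in range(n):
        # Consensus step: weighted average over neighbors.
        x_mix = sum(W[i, j] * x[j] for j in range(n))
        v_mix = sum(W[i, j] * v[j] for j in range(n))
        # Local heavy-ball update using agent i's gradient of its own data.
        v_new[i] = beta * v_mix - alpha * grads[i]
        x_new[i] = x_mix + v_new[i]
    return x_new, v_new
```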