In federated learning, a large number of users collaborate on a global learning task. They alternate local computations and two-way communication with a distant orchestrating server. Communication, which can be slow and costly, is the main bottleneck in this setting. To reduce the communication load and thereby accelerate distributed gradient descent, two strategies are popular: 1) communicate less frequently, that is, perform several iterations of local computations between communication rounds; and 2) communicate compressed information instead of full-dimensional vectors. We propose the first algorithm for distributed optimization and federated learning that harnesses these two strategies jointly and converges linearly to an exact solution in the strongly convex setting, with a doubly accelerated rate: our algorithm benefits from the two acceleration mechanisms provided by local training and compression, namely a better dependency on the condition number of the functions and on the dimension of the model, respectively.
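To make the two communication-reduction strategies concrete, here is a minimal toy sketch (not the paper's algorithm) of their naive combination: each client runs several local gradient steps between communication rounds and sends only a randomly sparsified update to the server. All names, the quadratic objectives, and the step sizes below are illustrative assumptions; unlike the proposed method, this naive variant is not guaranteed to converge to the exact solution.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_clients = 20, 5                      # model dimension, number of clients
A = [rng.standard_normal((30, d)) for _ in range(n_clients)]
b = [rng.standard_normal(30) for _ in range(n_clients)]

def grad(i, x):
    """Gradient of the i-th client's strongly convex least-squares loss."""
    return A[i].T @ (A[i] @ x - b[i]) / len(b[i])

def rand_k(v, k):
    """Unbiased random-k sparsification: keep k coordinates, rescale by d/k."""
    idx = rng.choice(len(v), size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (len(v) / k)
    return out

x = np.zeros(d)                           # server model
lr, local_steps, k, rounds = 0.05, 10, 5, 200

for r in range(rounds):
    updates = []
    for i in range(n_clients):
        y = x.copy()
        for _ in range(local_steps):      # strategy 1: local computations
            y -= lr * grad(i, y)
        updates.append(rand_k(y - x, k))  # strategy 2: compressed communication
    x += np.mean(updates, axis=0)         # server aggregates compressed updates

full_grad = np.mean([grad(i, x) for i in range(n_clients)], axis=0)
print("gradient norm after training:", np.linalg.norm(full_grad))
```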