In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets held by resource-constrained, heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which, due to the heterogeneity of the data across devices, may not be helpful to every device participating in the training. Recently, Hanzely and Richt\'{a}rik (2020) proposed a new formulation for training personalized FL models, aimed at balancing the trade-off between the traditional global model and the purely local models that individual devices could train using their private data alone. They derived a new algorithm, called Loopless Local Gradient Descent (L2GD), to solve it, and showed that this algorithm leads to improved communication complexity guarantees in regimes where more personalization is required. In this paper, we equip their L2GD algorithm with a bidirectional compression mechanism to further reduce the communication bottleneck between the local devices and the server. Unlike other compression-based algorithms used in the FL setting, our compressed L2GD algorithm operates on a probabilistic communication protocol, in which communication does not happen on a fixed schedule. Moreover, our compressed L2GD algorithm maintains a convergence rate similar to that of vanilla SGD without compression. To empirically validate the efficiency of our algorithm, we perform diverse numerical experiments on both convex and non-convex problems, using various compression techniques.
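To make the mechanism described above concrete, the following is a minimal, hypothetical sketch of an L2GD-style iteration with bidirectional compression. It assumes the standard personalized-FL objective of Hanzely and Richt\'{a}rik (2020), where with probability $p$ a communication (penalty-gradient) step toward the average model is taken, and otherwise each device takes a local gradient step; the compressor here is an unbiased rand-$k$ sparsifier chosen purely for illustration. The function names, step sizes, and compressor are our assumptions, not the paper's exact implementation.

```python
import numpy as np

def rand_k_compress(v, k, rng):
    """Unbiased rand-k sparsification: keep k random coordinates,
    rescale by d/k so the compressor is unbiased in expectation."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def compressed_l2gd(grads, x0, n, lam=0.1, p=0.2, gamma=0.05,
                    k=2, steps=500, seed=0):
    """Sketch of L2GD with bidirectional rand-k compression (assumed names).

    grads: list of n per-device gradient functions grad_i(x_i).
    With probability p, devices upload compressed models, the server
    averages them and broadcasts a compressed average; otherwise each
    device takes a local gradient step. Scalings 1/(n(1-p)) and
    lam/(n p) follow the unbiased two-branch estimator of L2GD.
    """
    rng = np.random.default_rng(seed)
    X = np.tile(np.asarray(x0, dtype=float), (n, 1))  # one row per device
    for _ in range(steps):
        if rng.random() < p:
            # Communication round: compressed uplink, then compressed downlink.
            up = np.array([rand_k_compress(X[i], k, rng) for i in range(n)])
            xbar = up.mean(axis=0)                    # server-side average
            down = rand_k_compress(xbar, k, rng)      # compressed broadcast
            X -= gamma * lam / (n * p) * (X - down)   # penalty-gradient step
        else:
            # Local round: no communication, plain gradient steps.
            for i in range(n):
                X[i] -= gamma / (n * (1 - p)) * grads[i](X[i])
    return X
```

For example, with quadratic local losses $f_i(x) = \tfrac12\|x - a_i\|^2$ (so `grads[i]` returns `x - a_i`), each personalized model drifts toward its own `a_i` while the probabilistic compressed averaging keeps the models loosely coupled; larger `lam` pushes toward a single global model, smaller `lam` toward purely local ones.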