A significant bottleneck in federated learning (FL) is the network communication cost of sending model updates from client devices to the central server. We present a comprehensive empirical study of the statistics of model updates in FL, as well as the role and benefits of various compression techniques. Motivated by these observations, we propose a novel method to reduce the average communication cost, which is near-optimal in many use cases, and outperforms Top-K, DRIVE, 3LC and QSGD on Stack Overflow next-word prediction, a realistic and challenging FL benchmark. This is achieved by examining the problem using rate-distortion theory, and proposing distortion as a reliable proxy for model accuracy. Distortion can be more effectively used for optimizing the trade-off between model performance and communication cost across clients. We demonstrate empirically that in spite of the non-i.i.d. nature of federated learning, the rate-distortion frontier is consistent across datasets, optimizers, clients and training rounds.
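To make the rate-distortion framing concrete, the following is a minimal illustrative sketch, not the paper's actual compression scheme: it measures distortion as the mean squared error between a client's model update and a uniformly quantized version of it, then sweeps the bit budget to trace a toy rate-distortion curve. The helper names (uniform_quantize, distortion) and the uniform quantizer itself are assumptions chosen for illustration only.

```python
import numpy as np

def uniform_quantize(update, num_bits):
    """Uniformly quantize a model update to 2**num_bits levels (illustrative quantizer)."""
    levels = 2 ** num_bits
    lo, hi = update.min(), update.max()
    if hi == lo:
        return np.full_like(update, lo)
    step = (hi - lo) / (levels - 1)
    # Map each value to the nearest level, then back to that level's value.
    q = np.round((update - lo) / step)
    return lo + q * step

def distortion(update, quantized):
    """Mean squared error between the original and the compressed update."""
    return float(np.mean((update - quantized) ** 2))

# Sweep the per-parameter bit budget to trace a toy rate-distortion curve
# for a single synthetic client update (stand-in for a real model delta).
rng = np.random.default_rng(0)
update = rng.normal(scale=0.01, size=10_000)
for bits in (1, 2, 4, 8):
    q = uniform_quantize(update, bits)
    print(f"rate = {bits} bits/param, distortion = {distortion(update, q):.3e}")
```

Under this sketch, lower distortion requires a higher rate; the paper's contribution is to use such a distortion measure, rather than accuracy directly, to choose the operating point on this trade-off per client and per round.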