Federated learning enables cooperative training among massively distributed clients by sharing their locally learned model parameters. However, as model sizes grow, deploying federated learning requires large communication bandwidth, which limits its deployment in wireless networks. To address this bottleneck, we introduce a residual-based federated learning framework (ResFed), in which residuals rather than model parameters are transmitted over the communication network for training. In particular, we integrate two pairs of shared predictors for model prediction in both server-to-client and client-to-server communication. By employing a common prediction rule, both locally and globally updated models remain fully recoverable at the clients and the server. We highlight that the residuals only indicate the quasi-update of a model within a single inter-round, and hence carry denser information and have lower entropy than model weights or gradients. Based on this property, we further apply lossy compression to the residuals by sparsification and quantization, and encode them for efficient communication. The experimental evaluation shows that ResFed incurs remarkably lower communication costs and achieves better accuracy by leveraging less sensitive residuals, compared to standard federated learning. For instance, to train a 4.08 MB CNN model on CIFAR-10 with 10 clients under a non-independent and identically distributed (Non-IID) setting, our approach achieves a compression ratio of over 700X in each communication round with minimal impact on accuracy. To reach an accuracy of 70%, it saves around 99% of the total communication volume, reducing it from 587.61 Mb to 6.79 Mb in up-streaming and to 4.61 Mb in down-streaming on average across all clients.
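To make the pipeline concrete, the following is a minimal sketch of one communication round, assuming a simple predictor that reuses the last synchronized model, together with top-k sparsification and 8-bit uniform quantization as stand-ins for the concrete compression scheme; the function names and parameter choices are illustrative placeholders, not the paper's exact design.

```python
# Illustrative sketch only: the prediction rule, sparsity level, and
# quantizer below are assumptions, not ResFed's actual components.
import numpy as np


def residual(local_weights: np.ndarray, predicted_weights: np.ndarray) -> np.ndarray:
    """Residual = locally updated model minus the shared prediction.

    Because client and server apply the same prediction rule, transmitting
    only this residual lets the receiver fully recover the updated model.
    """
    return local_weights - predicted_weights


def sparsify_top_k(r: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude residual entries and zero the rest."""
    out = np.zeros_like(r)
    idx = np.argpartition(np.abs(r), -k)[-k:]
    out[idx] = r[idx]
    return out


def quantize_uniform(r: np.ndarray, bits: int = 8):
    """Uniform scalar quantization of the (sparse) residual to signed ints."""
    scale = float(np.max(np.abs(r)))
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for an all-zero residual
    levels = 2 ** (bits - 1) - 1
    q = np.round(r / scale * levels).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float, bits: int = 8) -> np.ndarray:
    """Invert the uniform quantizer on the receiver side."""
    levels = 2 ** (bits - 1) - 1
    return q.astype(np.float32) * scale / levels


# Example round: the client compresses its residual against the shared
# prediction, and the server recovers the updated model from it.
rng = np.random.default_rng(0)
predicted = rng.normal(size=1000).astype(np.float32)  # shared prediction
local = predicted + 0.01 * rng.normal(size=1000).astype(np.float32)

r = residual(local, predicted)
r_sparse = sparsify_top_k(r, k=50)             # e.g. keep 5% of entries
q, scale = quantize_uniform(r_sparse, bits=8)  # transmit q and scale only
recovered = predicted + dequantize(q, scale)   # server-side recovery
```

Since the residual is sparse and low-entropy, the quantized indices and values can additionally be entropy-coded before transmission, which is where the large per-round compression ratio comes from.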