Federated learning enables cooperative training among massively distributed clients by sharing their locally learned model parameters. However, as model sizes grow, deploying federated learning requires large communication bandwidth, which limits its deployment in wireless networks. To address this bottleneck, we introduce a residual-based federated learning framework (ResFed), in which residuals rather than model parameters are transmitted over the communication network for training. In particular, we integrate two pairs of shared predictors for model prediction in both server-to-client and client-to-server communication. By employing a common prediction rule, both locally and globally updated models remain fully recoverable at the clients and the server. We highlight that the residuals only indicate the quasi-update of a model within a single inter-round, and hence carry denser information and have lower entropy than model weights or gradients. Based on this property, we further apply lossy compression to the residuals by sparsification and quantization, and encode them for efficient communication. The experimental evaluation shows that ResFed incurs remarkably lower communication costs and achieves better accuracy by leveraging less sensitive residuals, compared to standard federated learning. For instance, to train a 4.08 MB CNN model on CIFAR-10 with 10 clients under a non-independent and identically distributed (Non-IID) setting, our approach achieves a compression ratio of over 700X in each communication round with minimal impact on accuracy. To reach an accuracy of 70%, it saves around 99% of the total communication volume, reducing it from 587.61 Mb to 6.79 Mb in up-streaming and to 4.61 Mb in down-streaming on average across all clients.
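To make the pipeline concrete, the following is a minimal sketch of one communication round, assuming a simple predictor that reuses the last synchronized model, together with top-k sparsification and 8-bit uniform quantization as stand-ins for the concrete compression scheme; the function names and parameter choices are illustrative placeholders, not the paper's exact design.

```python
# Illustrative sketch only: the prediction rule, sparsity level, and
# quantizer below are assumptions, not ResFed's actual components.
import numpy as np


def residual(local_weights: np.ndarray, predicted_weights: np.ndarray) -> np.ndarray:
    """Residual = locally updated model minus the shared prediction.

    Because client and server apply the same prediction rule, transmitting
    only this residual lets the receiver fully recover the updated model.
    """
    return local_weights - predicted_weights


def sparsify_top_k(r: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude residual entries and zero the rest."""
    out = np.zeros_like(r)
    idx = np.argpartition(np.abs(r), -k)[-k:]
    out[idx] = r[idx]
    return out


def quantize_uniform(r: np.ndarray, bits: int = 8):
    """Uniform scalar quantization of the (sparse) residual to signed ints."""
    scale = float(np.max(np.abs(r)))
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for an all-zero residual
    levels = 2 ** (bits - 1) - 1
    q = np.round(r / scale * levels).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float, bits: int = 8) -> np.ndarray:
    """Invert the uniform quantizer on the receiver side."""
    levels = 2 ** (bits - 1) - 1
    return q.astype(np.float32) * scale / levels


# Example round: the client compresses its residual against the shared
# prediction, and the server recovers the updated model from it.
rng = np.random.default_rng(0)
predicted = rng.normal(size=1000).astype(np.float32)  # shared prediction
local = predicted + 0.01 * rng.normal(size=1000).astype(np.float32)

r = residual(local, predicted)
r_sparse = sparsify_top_k(r, k=50)             # e.g. keep 5% of entries
q, scale = quantize_uniform(r_sparse, bits=8)  # transmit q and scale only
recovered = predicted + dequantize(q, scale)   # server-side recovery
```

Since the residual is sparse and low-entropy, the quantized indices and values can additionally be entropy-coded before transmission, which is where the large per-round compression ratio comes from.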