Federated learning is a rapidly growing area of research that enables a large number of clients to jointly train a machine learning model on privately held data. One of the largest barriers to wider adoption of federated learning is the communication cost of sending model updates to and from the clients, which is accentuated by the fact that many of these devices are bandwidth-constrained. In this paper, we aim to address this issue by optimizing networks within a subspace of their full parameter space, an idea known as intrinsic dimension in the machine learning theory community. We use a correspondence between the notion of intrinsic dimension and gradient compressibility to derive a family of low-bandwidth optimization algorithms, which we call intrinsic gradient compression algorithms. Specifically, we present three algorithms in this family with different levels of upload and download bandwidth for use in various federated settings, along with theoretical guarantees on their performance. Finally, in large-scale federated learning experiments with models containing up to 100M parameters, we show that our algorithms perform extremely well compared to current state-of-the-art gradient compression methods.
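As a minimal sketch of the intrinsic-dimension idea referenced above (assuming the standard low-dimensional subspace reparametrization; the symbols here are illustrative and need not match the paper's notation), the $D$-dimensional model parameters $\theta$ are restricted to an affine subspace of the full parameter space:
$$
\theta \;=\; \theta_0 + P\,\theta_d, \qquad P \in \mathbb{R}^{D \times d},\quad d \ll D,
$$
where $\theta_0$ is the shared initialization, $P$ is a fixed projection matrix known to all parties, and only the low-dimensional vector $\theta_d \in \mathbb{R}^{d}$ is optimized. Under this parametrization, clients only need to communicate $d$-dimensional quantities rather than full $D$-dimensional gradients, which is the source of the compression exploited by the algorithms described in the abstract.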