Federated learning (FL) is a privacy-preserving paradigm in which multiple participants jointly solve a machine learning problem without sharing raw data. Unlike traditional distributed learning, FL is characterized by statistical heterogeneity: the data distributions across participants differ from one another. Meanwhile, recent advances in the interpretation of neural networks have led to the wide use of neural tangent kernels (NTKs) for convergence analysis. In this paper, we propose a novel FL paradigm built on the NTK framework. The paradigm addresses the challenge of statistical heterogeneity by transmitting update data that are more expressive than those of conventional FL paradigms. Specifically, participants upload sample-wise Jacobian matrices rather than model weights or gradients. The server then constructs an empirical kernel matrix to update the global model without explicitly performing gradient descent. We further develop a variant with improved communication efficiency and enhanced privacy. Numerical results show that the proposed paradigm matches the accuracy of federated averaging while reducing the number of communication rounds by an order of magnitude.
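To make the server-side update concrete, the following is a minimal Python/NumPy sketch of how a server could use uploaded sample-wise Jacobians to evolve a linearized (NTK) model in closed form. The function name ntk_federated_round, the squared-loss assumption, the scalar-output setting, and the single-round structure are illustrative assumptions, not the paper's exact algorithm.

import numpy as np

def ntk_federated_round(w0, client_jacobians, client_preds, client_labels,
                        lr=0.1, steps=100):
    # Hypothetical server-side round: each client uploads its per-sample
    # Jacobian J_i (n_i x p), current predictions f_i, and labels y_i.
    J = np.vstack(client_jacobians)      # stacked Jacobians, shape (n, p)
    f = np.concatenate(client_preds)     # current predictions, shape (n,)
    y = np.concatenate(client_labels)    # labels, shape (n,)

    K = J @ J.T                          # empirical kernel matrix, shape (n, n)
    r = f - y                            # residuals of the linearized model
    delta_w = np.zeros_like(w0)

    # Simulate gradient descent on the linearized model f(w) = f0 + J (w - w0)
    # with squared loss, using only K, J, and the residuals: no backprop needed.
    for _ in range(steps):
        delta_w -= lr * (J.T @ r)        # weight update implied by current residuals
        r = r - lr * (K @ r)             # residual evolution r <- (I - lr*K) r

    return w0 + delta_w

In this sketch the server replays many descent steps locally from a single upload, which is one plausible reading of why the paradigm can cut the number of communication rounds relative to federated averaging.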