Federated learning (FL) is a privacy-preserving machine learning paradigm in which the server periodically aggregates local model parameters from clients without assembling their private data. Communication constraints and personalization requirements pose severe challenges to FL. Federated distillation (FD) has been proposed to simultaneously address these two problems: it exchanges knowledge between the server and clients, supporting heterogeneous local models while significantly reducing communication overhead. However, most existing FD methods require a proxy dataset, which is often unavailable in reality. A few recent proxy-data-free FD approaches eliminate the need for additional public data, but they suffer from a significant discrepancy among clients' local knowledge due to client-side model heterogeneity, leading to ambiguous representations on the server and inevitable accuracy degradation. To tackle this issue, we propose a proxy-data-free FD algorithm based on distributed knowledge congruence (FedDKC). FedDKC leverages well-designed refinement strategies to narrow local knowledge differences within an acceptable upper bound, so as to mitigate the negative effects of knowledge incongruence. Specifically, from the perspectives of the peak probability and the Shannon entropy of local knowledge, we design kernel-based knowledge refinement (KKR) and searching-based knowledge refinement (SKR), respectively, and theoretically guarantee that the refined local knowledge satisfies an approximately similar distribution and can be regarded as congruent. Extensive experiments conducted on three common datasets demonstrate that FedDKC significantly outperforms state-of-the-art methods (boosting Top-1 accuracy by up to 4.38% and Top-5 accuracy by up to 10.31%) in various heterogeneous settings, while also markedly improving convergence speed.
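To make the idea of entropy-based knowledge refinement concrete, the sketch below illustrates one plausible reading of the searching-based strategy described above: each client's softened prediction is re-temperatured via binary search until its Shannon entropy approximately matches a shared target, so that heterogeneous clients' knowledge for the same sample becomes comparably "peaked" before server-side aggregation. This is a minimal illustrative sketch, not the paper's exact KKR/SKR formulation; the function names, search bounds, and the specific entropy target are hypothetical.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy of a discrete distribution (natural log)."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax."""
    z = logits / temperature
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def refine_by_entropy(logits, target_entropy, lo=1e-3, hi=100.0, iters=50):
    """Binary-search a temperature so the softened prediction's entropy
    approximately matches a shared target (entropy increases with T)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        h = shannon_entropy(softmax(logits, mid))
        if h < target_entropy:
            lo = mid   # too peaked: raise the temperature
        else:
            hi = mid   # too flat: lower the temperature
    return softmax(logits, 0.5 * (lo + hi))

# Example: two heterogeneous clients' logits for the same sample are
# refined toward a common entropy level before aggregation on the server.
client_a = np.array([8.0, 1.0, 0.5, 0.2])   # over-confident local model
client_b = np.array([1.2, 1.0, 0.9, 0.8])   # under-confident local model
target = 0.6 * np.log(4)                     # hypothetical shared entropy target
print(refine_by_entropy(client_a, target))
print(refine_by_entropy(client_b, target))
```

After refinement, both clients' predictions carry a comparable level of certainty, which is the sense in which their knowledge can be treated as approximately congruent by the server.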