Knowledge distillation has recently attracted considerable attention in Federated Learning (FL), since it allows FL to train heterogeneous clients that differ in data size and data structure. However, data samples across devices are usually not independent and identically distributed (non-i.i.d.), which poses additional challenges to the convergence and speed of federated learning. Because FL selects clients at random in each round and every client learns only from its local non-i.i.d. data, training becomes even slower. An intuitive way to address this problem is to use the global model to guide local training. In this paper, we propose a novel global knowledge distillation method, named FedGKD, which learns knowledge from past global models to mitigate the bias of local training. By distilling global knowledge while remaining consistent with the current local models, FedGKD learns a global knowledge model in FL. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments on CV datasets (CIFAR-10/100) under various non-i.i.d. settings. The evaluation results show that FedGKD outperforms previous state-of-the-art methods.
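To make the idea of guiding local training with past global models concrete, the following is a minimal sketch, not the paper's exact implementation: it assumes the teacher is built by averaging the parameters of a small buffer of recent global models and that local training adds a temperature-scaled KL distillation term with weight gamma; the names build_teacher, local_update, gamma, and temperature are illustrative placeholders.

import copy
import torch
import torch.nn.functional as F

def build_teacher(past_global_models):
    """Average the parameters of buffered past global models into one
    teacher network (an illustrative choice for 'global knowledge')."""
    teacher = copy.deepcopy(past_global_models[-1])
    avg_state = copy.deepcopy(past_global_models[0].state_dict())
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in past_global_models])
        avg_state[key] = stacked.mean(dim=0).to(avg_state[key].dtype)
    teacher.load_state_dict(avg_state)
    teacher.eval()
    return teacher

def local_update(local_model, teacher, loader, optimizer, gamma=0.2, temperature=2.0):
    """One epoch of local training regularized by distillation from the
    averaged past global models (hypothetical hyperparameter names)."""
    local_model.train()
    for x, y in loader:
        logits = local_model(x)
        with torch.no_grad():
            teacher_logits = teacher(x)
        # Standard supervised loss on local (possibly non-i.i.d.) data.
        ce = F.cross_entropy(logits, y)
        # Distillation term keeps the local model consistent with global knowledge.
        kd = F.kl_div(
            F.log_softmax(logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * (temperature ** 2)
        loss = ce + gamma * kd
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

In a federated round, the server would send the current global model together with (or already averaged over) its recent predecessors; each selected client runs local_update and returns its updated weights for aggregation.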