The performance of federated learning on neural networks is generally influenced by the heterogeneity of the data distribution across clients. Taking a weighted average of the local model parameters, as most existing federated learning algorithms do, may not yield a global model that is consistent with the local models in the space of neural network maps. In this paper, we propose FedDKD, a novel federated learning framework equipped with a decentralized knowledge distillation process that requires no data on the server. FedDKD introduces a decentralized knowledge distillation (DKD) module that distills the knowledge of the local models into the global model by driving it toward the average of the local neural network maps, measured by a divergence defined in the loss function, rather than merely averaging parameters as in the literature. Numerical experiments on various heterogeneous datasets show that FedDKD outperforms state-of-the-art methods with more efficient communication and training in only a few DKD steps, especially on some extremely heterogeneous datasets.
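To make the contrast concrete, the sketch below is a minimal, hypothetical illustration (not the authors' exact algorithm): it shows FedAvg-style parameter averaging next to a DKD-style step in which the averaged global model is further trained so that its outputs approach the weighted average of the local models' outputs under a KL divergence, standing in for the divergence defined in the loss function. All names (`average_parameters`, `dkd_step`, `local_models`, `batch`) are illustrative assumptions; in the decentralized setting the distillation batches would come from client data, since the server holds none.

```python
# Minimal sketch, assuming PyTorch classifiers with identical architectures
# and floating-point parameters. It contrasts parameter averaging with a
# hypothetical distillation step that pulls the global model toward the
# average of the local neural network maps (i.e., their output distributions).
import copy
import torch
import torch.nn.functional as F


def average_parameters(local_models, weights):
    """FedAvg-style weighted average of local model parameters."""
    global_model = copy.deepcopy(local_models[0])
    avg_state = global_model.state_dict()
    for key in avg_state:
        avg_state[key] = sum(
            w * m.state_dict()[key] for w, m in zip(weights, local_models)
        )
    global_model.load_state_dict(avg_state)
    return global_model


def dkd_step(global_model, local_models, weights, batch, lr=1e-3):
    """One hypothetical DKD step: match the averaged local model outputs."""
    optimizer = torch.optim.SGD(global_model.parameters(), lr=lr)
    with torch.no_grad():
        # Neural-network-map average: weighted average of local predictions.
        teacher_probs = sum(
            w * F.softmax(m(batch), dim=-1)
            for w, m in zip(weights, local_models)
        )
    student_log_probs = F.log_softmax(global_model(batch), dim=-1)
    # KL divergence stands in for the divergence metric in the loss function.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, parameter averaging initializes the global model, and a few distillation steps refine it toward consistency with the local models in function space rather than only in parameter space.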