Federated learning (FL) enables edge devices to cooperatively train a globally shared model while keeping the training data local and private. However, a common but impractical assumption in FL is that the participating edge devices possess identical resources and share the same global model architecture. In this study, we propose a novel FL method called Federated Intermediate Layers Learning (FedIN), which supports heterogeneous models without relying on any public dataset. Each training model in FedIN is divided into three parts: an extractor, the intermediate layers, and a classifier. The extractor and classifier share the same architecture across all devices to keep the intermediate-layer features consistent, while the architectures of the intermediate layers can vary across heterogeneous devices according to their resource capacities. To exploit the knowledge carried by these features, we propose IN training, which trains the intermediate layers to align with the features from other clients. Additionally, we formulate and solve a convex optimization problem to mitigate the gradient divergence caused by conflicts between IN training and local training. The experimental results show that FedIN achieves the best performance in the heterogeneous model environment compared with state-of-the-art algorithms. Furthermore, our ablation study demonstrates the effectiveness of both IN training and our solution to the convex optimization problem.
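To make the three-part split concrete, below is a minimal sketch in PyTorch of how a FedIN client model could be organized. The module names, layer sizes, and the MSE alignment loss are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FedINModel(nn.Module):
    """Three-part client model: extractor -> intermediate layers -> classifier."""

    def __init__(self, hidden_dim: int, num_classes: int, depth: int):
        super().__init__()
        # Extractor: identical architecture on every client, so the features
        # entering the intermediate layers stay comparable across clients.
        self.extractor = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, hidden_dim),
        )
        # Intermediate layers: the depth can differ per client according to
        # its resource capacity (the heterogeneous part of the model).
        self.intermediate = nn.Sequential(
            *[nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU())
              for _ in range(depth)]
        )
        # Classifier: identical architecture on every client.
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f_in = self.extractor(x)         # feature entering the intermediate layers
        f_out = self.intermediate(f_in)  # feature leaving the intermediate layers
        return self.classifier(f_out)

def in_training_step(model: FedINModel, feats_in: torch.Tensor,
                     feats_out: torch.Tensor,
                     optimizer: torch.optim.Optimizer) -> float:
    # IN training sketch: fit the local intermediate layers to (input, output)
    # feature pairs collected from other clients, so the intermediate layers
    # absorb knowledge the local data alone does not provide.
    optimizer.zero_grad()
    loss = F.mse_loss(model.intermediate(feats_in), feats_out)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, clients exchange intermediate-layer feature pairs rather than a public dataset, which is what allows the intermediate architectures to differ across devices.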
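The abstract does not specify the convex optimization problem used to mitigate gradient divergence, so the following is only a hedged sketch of one common closed-form instance of such a problem: when the local-training gradient and the IN-training gradient conflict (negative inner product), the IN gradient is projected onto the plane orthogonal to the local gradient, which solves a small least-squares problem. The paper's actual formulation may differ.

```python
import torch

def combine_gradients(g_local: torch.Tensor, g_in: torch.Tensor) -> torch.Tensor:
    """Merge flattened local-training and IN-training gradients.

    If the two gradients conflict (negative inner product), g_in is projected
    onto the plane orthogonal to g_local; this projection is the closed-form
    solution of a simple convex least-squares problem. Otherwise the two
    gradients are summed unchanged.
    """
    dot = torch.dot(g_local, g_in)
    if dot < 0:
        # Remove the component of g_in that opposes g_local.
        g_in = g_in - (dot / g_local.dot(g_local).clamp_min(1e-12)) * g_local
    return g_local + g_in
```

A client would flatten the gradients of both losses over the intermediate-layer parameters, merge them with `combine_gradients`, and write the result back before the optimizer step.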