With increasing concern about user data privacy, federated learning (FL) has emerged as a distinctive training paradigm for training machine learning models on edge devices without direct access to sensitive data. Traditional FL and most existing methods require every edge device to train a model of the same architecture and directly aggregate the model weights from all edges at a cloud server. Although these methods protect data privacy, they cannot accommodate model heterogeneity, ignore the heterogeneous computing power of edge devices, and incur steep communication costs. In this paper, we propose a resource-aware FL framework that aggregates an ensemble of local knowledge extracted from edge models, instead of aggregating the weights of each local model, and then distills this ensemble into robust global knowledge that serves as the server model through knowledge distillation. On each client, the local model and the global knowledge are distilled into a tiny knowledge network via deep mutual learning. Such knowledge extraction allows each edge client to deploy a resource-aware model and perform multi-model knowledge fusion while maintaining communication efficiency and model heterogeneity. Empirical results show that our approach significantly improves over existing FL algorithms in terms of communication cost and generalization performance under heterogeneous data and models. When ResNet-20 is used as the knowledge network, our approach reduces the communication cost of VGG-11 by up to 102$\times$ and of ResNet-32 by up to 30$\times$.
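To make the deep-mutual-learning step concrete, the following is a minimal PyTorch sketch of a bidirectional objective of the kind the abstract describes: the client model and the tiny knowledge network each fit the client's private labels while mimicking the other's softened predictions. The function name, temperature, and equal weighting of the terms are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_local, logits_knowledge, targets, temperature=3.0):
    """Cross-entropy plus bidirectional KL terms, as in deep mutual learning.

    logits_local:     outputs of the (possibly large, heterogeneous) client model
    logits_knowledge: outputs of the tiny knowledge network shared with the server
    targets:          ground-truth labels for the client's private batch
    """
    # Supervised losses on the private data.
    ce_local = F.cross_entropy(logits_local, targets)
    ce_knowledge = F.cross_entropy(logits_knowledge, targets)

    # Soften the logits so the KL terms transfer richer class-similarity information.
    log_p_local = F.log_softmax(logits_local / temperature, dim=1)
    log_p_knowledge = F.log_softmax(logits_knowledge / temperature, dim=1)

    # Each network mimics the other's predictions (bidirectional KL),
    # scaled by T^2 to keep gradient magnitudes comparable to the CE terms.
    kl_local = F.kl_div(log_p_local, log_p_knowledge.exp(),
                        reduction="batchmean") * temperature ** 2
    kl_knowledge = F.kl_div(log_p_knowledge, log_p_local.exp(),
                            reduction="batchmean") * temperature ** 2

    # Returned losses are applied to the local model and the knowledge network, respectively.
    return ce_local + kl_local, ce_knowledge + kl_knowledge
```

Under this sketch, only the tiny knowledge network's parameters (or its predictions) need to travel to the server, which is what allows large heterogeneous backbones such as VGG-11 or ResNet-32 to stay on the device while communication stays small.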