The heterogeneity across devices usually hinders the optimization convergence and generalization performance of federated learning (FL) when the aggregation of devices' knowledge occurs in the gradient space. For example, devices may differ in terms of data distribution, network latency, input/output space, and/or model architecture, which can easily lead to the misalignment of their local gradients. To improve the tolerance to heterogeneity, we propose a novel federated prototype learning (FedProto) framework in which the devices and server communicate class prototypes instead of gradients. FedProto aggregates the local prototypes collected from different devices, and then sends the global prototypes back to all devices to regularize the training of local models. The training on each device aims to minimize the classification error on the local data while keeping the resulting local prototypes sufficiently close to the corresponding global ones. We further propose a benchmark setting tailored for heterogeneous FL, on which experiments show FedProto outperforming several recent FL approaches across multiple datasets.
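To make the prototype-exchange idea concrete, the following is a minimal sketch (not the authors' implementation) of the workflow described above: each client computes per-class mean embeddings ("local prototypes"), trains with a cross-entropy loss plus a regularizer pulling embeddings toward the server's global prototypes, and the server averages local prototypes across clients. The split of the model into an `embed` feature extractor and a `head` classifier, the weight `lam`, and all function and variable names are illustrative assumptions, not part of the paper.

```python
import torch
import torch.nn.functional as F


def local_prototypes(embed, loader, device="cpu"):
    """Compute the mean embedding (prototype) of each class seen locally."""
    sums, counts = {}, {}
    embed.eval()
    with torch.no_grad():
        for x, y in loader:
            z = embed(x.to(device))
            for c in y.unique().tolist():
                zc = z[y.to(device) == c]
                sums[c] = sums.get(c, 0) + zc.sum(dim=0)
                counts[c] = counts.get(c, 0) + zc.shape[0]
    return {c: sums[c] / counts[c] for c in sums}


def local_train_step(embed, head, batch, global_protos, optimizer, lam=1.0):
    """One local update: classification loss + prototype-alignment regularizer."""
    x, y = batch
    z = embed(x)                         # local feature embeddings
    loss = F.cross_entropy(head(z), y)   # supervised loss on local data
    # Pull each sample's embedding toward the global prototype of its class,
    # when the server has already provided one for that class.
    reg = torch.zeros((), device=z.device)
    for i, c in enumerate(y.tolist()):
        if c in global_protos:
            reg = reg + F.mse_loss(z[i], global_protos[c].to(z.device))
    loss = loss + lam * reg / max(len(y), 1)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def aggregate_prototypes(all_client_protos):
    """Server side: average each class's local prototypes across clients."""
    merged = {}
    for protos in all_client_protos:
        for c, p in protos.items():
            merged.setdefault(c, []).append(p)
    return {c: torch.stack(ps).mean(dim=0) for c, ps in merged.items()}
```

Because only class-level prototypes (one embedding vector per class) are exchanged, clients with different model architectures or input spaces can still participate as long as their embedding dimensions agree, which is the tolerance to heterogeneity the abstract refers to.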