In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models. Although this precludes the need to access clients' data directly, the global model's convergence often suffers from data heterogeneity. This study starts from an analogy to continual learning and suggests that forgetting could be the bottleneck of federated learning. We observe that the global model forgets the knowledge from previous rounds, and that local training induces forgetting of knowledge outside the local distribution. Based on our findings, we hypothesize that tackling forgetting will relieve the data heterogeneity problem. To this end, we propose a novel and effective algorithm, Federated Not-True Distillation (FedNTD), which preserves the global perspective on locally available data only for the not-true classes. In the experiments, FedNTD shows state-of-the-art performance on various setups without compromising data privacy or incurring additional communication costs.
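The core mechanism, preserving the global model's view of the not-true classes, can be sketched as a distillation loss computed only over the non-label classes. The following is a minimal NumPy illustration, not the authors' implementation; the function name `not_true_distillation_loss`, the KL direction, and the temperature handling are assumptions for exposition:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def not_true_distillation_loss(local_logits, global_logits, labels, tau=1.0):
    """Sketch of a not-true distillation loss (illustrative, not the paper's code).

    Masks out each sample's true-class logit, renormalizes the remaining
    (not-true) classes, and penalizes divergence between the local model's
    and the global model's not-true distributions.
    """
    n, c = local_logits.shape
    mask = np.ones((n, c), dtype=bool)
    mask[np.arange(n), labels] = False  # drop the true class per sample

    # Softmax restricted to the c-1 not-true classes, with temperature tau.
    local_nt = softmax(local_logits[mask].reshape(n, c - 1) / tau)
    global_nt = softmax(global_logits[mask].reshape(n, c - 1) / tau)

    # KL(global_nt || local_nt): keeps local not-true predictions
    # close to the global model's, preserving the global perspective.
    kl = np.sum(global_nt * (np.log(global_nt) - np.log(local_nt)), axis=1)
    return float(np.mean(kl))
```

Because the true class is excluded before renormalization, the local model remains free to fit the local label while being regularized only on the knowledge the global model holds about the other classes.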