Federated learning is a distributed machine learning paradigm in which multiple data owners (clients) collaboratively train a shared model while keeping their data on their own devices. The heterogeneity of client datasets is one of the key challenges for federated learning algorithms: studies have found that standard federated algorithms, such as FedAvg, suffer noticeable performance degradation on non-IID data. Most existing approaches to handling non-IID data adopt the same aggregation framework as FedAvg and focus on improving the model updates, either on the server side or on the clients. In this work, we tackle the challenge from a different angle by introducing redistribution rounds that delay aggregation. Experiments on multiple tasks show that the proposed framework significantly improves performance on non-IID data.
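Since the abstract only names the idea of "redistribution rounds that delay the aggregation" without giving pseudocode, the following is a minimal hypothetical sketch of one way such a scheme could look, not the authors' actual algorithm. The assumption made here is that, between local training and server-side averaging, un-aggregated client models are redistributed (here, shuffled) across clients and trained further for several rounds; all function names, the shuffle-based redistribution, and the toy non-IID data are illustrative choices.

```python
# Hypothetical sketch: FedAvg-style training with "redistribution rounds"
# inserted before aggregation. The redistribution scheme (a random shuffle
# of models across clients) is an assumption for illustration only.
import random
import numpy as np

def local_train(w, X, y, lr=0.1, epochs=5):
    """One client's local gradient descent on a least-squares objective."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def train(clients, dim, comm_rounds=10, redistribution_rounds=3, seed=0):
    rng = random.Random(seed)
    w_global = np.zeros(dim)
    for _ in range(comm_rounds):
        # Each client starts from the current global model, as in FedAvg.
        models = [local_train(w_global, X, y) for X, y in clients]
        # Redistribution: shuffle the un-aggregated models across clients
        # and keep training, so each model sees several local distributions
        # before the server finally averages.
        for _ in range(redistribution_rounds):
            rng.shuffle(models)
            models = [local_train(w, X, y) for w, (X, y) in zip(models, clients)]
        # Aggregation is delayed until after the redistribution rounds
        # (equal client weights, since the toy clients have equal data sizes).
        w_global = np.mean(models, axis=0)
    return w_global

# Toy non-IID setup: each client's features come from a shifted distribution.
data_rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for shift in (-2.0, 0.0, 2.0):
    X = data_rng.normal(shift, 1.0, size=(50, 2))
    y = X @ true_w + data_rng.normal(0, 0.1, size=50)
    clients.append((X, y))

print(train(clients, dim=2))  # should approach true_w despite the skew
```

Setting redistribution_rounds=0 recovers plain FedAvg in this sketch, which makes the delayed-aggregation idea easy to compare against the baseline.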