In most of the literature on federated learning (FL), neural networks are initialized with random weights. In this paper, we present an empirical study on the effect of pre-training on FL. Specifically, we aim to investigate whether pre-training can alleviate the drastic accuracy drop that occurs when clients' decentralized data are non-IID. We focus on FedAvg, the fundamental and most widely used FL algorithm. We find that pre-training largely closes the gap between FedAvg and centralized learning under non-IID data, but this does not come from alleviating the well-known model drifting problem in FedAvg's local training. Instead, pre-training helps FedAvg by making its global aggregation more stable. When pre-training with real data is not feasible for FL, we propose a novel approach to pre-train with synthetic data. On various image datasets (including one for segmentation), our approach with synthetic pre-training leads to a notable gain, taking an essential step toward scaling up federated learning for real-world applications.
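For readers unfamiliar with the two FedAvg steps the abstract refers to (local training on each client, followed by global aggregation on the server), the sketch below illustrates the standard FedAvg loop on a toy numpy linear model. It is only a minimal illustration under assumed settings; the client data, model, and hyperparameters here are hypothetical placeholders and not the paper's experimental setup or pre-training procedure.

```python
# Minimal FedAvg sketch: each client runs local SGD from the current global
# model, then the server aggregates client models weighted by dataset size.
# Toy numpy linear-regression model with hypothetical non-IID clients.
import numpy as np

rng = np.random.default_rng(0)

def local_train(w, X, y, lr=0.01, epochs=5):
    """Local training: a few epochs of full-batch gradient descent on MSE."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w_global, client_data):
    """One communication round: local training + weighted global aggregation."""
    client_models, sizes = [], []
    for X, y in client_data:
        client_models.append(local_train(w_global, X, y))
        sizes.append(len(y))
    # Global aggregation: average client models weighted by local dataset size.
    return np.average(client_models, axis=0, weights=np.asarray(sizes, float))

# Hypothetical non-IID clients: each sees a shifted slice of the feature space.
d = 5
w_true = rng.normal(size=d)
clients = []
for k in range(4):
    X = rng.normal(loc=k, size=(50, d))            # client-specific distribution
    y = X @ w_true + 0.1 * rng.normal(size=50)
    clients.append((X, y))

w = np.zeros(d)   # starting point; a pre-trained init would replace this
for _ in range(20):
    w = fedavg_round(w, clients)
print("distance to w_true:", np.linalg.norm(w - w_true))
```

In this sketch, the choice of initialization for `w` is the knob the paper studies: a random (here zero) start versus an initialization obtained by pre-training, with the aggregation step left unchanged.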