This paper proposes a client selection (CS) method to tackle the communication bottleneck of federated learning (FL) while concurrently coping with FL's data heterogeneity issue. Specifically, we first analyze the effect of CS in FL and show that FL training can be accelerated by adequately choosing participants to diversify the training dataset in each round of training. Based on this, we leverage data profiling and determinantal point process (DPP) sampling techniques to develop an algorithm termed Federated Learning with DPP-based Participant Selection (FL-DP$^3$S). This algorithm effectively diversifies the participants' datasets in each round of training while preserving their data privacy. We conduct extensive experiments to examine the efficacy of our proposed method. The results show that our scheme attains a faster convergence rate and lower communication overhead than several baselines.
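To illustrate the core idea of DPP-based diverse participant selection, below is a minimal sketch in Python. It assumes each client is summarized by a label-distribution profile (one form of data profiling) and uses a greedy MAP approximation to DPP sampling, maximizing the log-determinant of the kernel submatrix over the selected set. The function name `greedy_dpp_select`, the kernel construction, and the profile format are illustrative assumptions, not the paper's exact FL-DP$^3$S algorithm.

```python
import numpy as np

def greedy_dpp_select(profiles, k):
    """Greedily pick k diverse clients (MAP approximation to DPP sampling).

    profiles: (n_clients, n_classes) array of per-client label distributions
              (an assumed, illustrative form of data profiling).
    Returns the indices of the k selected clients.
    """
    n = len(profiles)
    # Similarity kernel between client profiles; small ridge keeps it PSD.
    L = profiles @ profiles.T + 1e-6 * np.eye(n)
    selected = []
    for _ in range(k):
        best, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            # Marginal diversity gain via log-det of the kernel submatrix.
            _, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if logdet > best_logdet:
                best, best_logdet = i, logdet
        selected.append(best)
    return selected

# Toy example: 6 clients, each profiled by a 10-class label histogram.
rng = np.random.default_rng(0)
profiles = rng.dirichlet(np.ones(10), size=6)
chosen = greedy_dpp_select(profiles, 3)
print(chosen)
```

Because the determinant of the kernel submatrix grows when the chosen rows are dissimilar, this greedy rule favors clients whose data profiles differ, which is the diversification effect the abstract attributes to DPP sampling.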