Federated Learning (FL) has shown great potential as a privacy-preserving solution for learning from decentralized data that are accessible only to end devices (i.e., clients). In many scenarios, however, a large proportion of the clients may hold low-quality data that are biased, noisy, or even irrelevant. Such clients can significantly slow the convergence of the global model and compromise its quality. In light of this, we propose FedProf, a novel algorithm for optimizing FL under such circumstances without breaching data privacy. The key to our approach is a data representation profiling and matching scheme that uses the global model to dynamically profile data representations and enables low-cost, lightweight representation matching. Based on this scheme, we adaptively score each client and adjust its participation probability so as to mitigate the impact of low-value clients on the training process. We have conducted extensive experiments on public datasets under various FL settings. The results show that FedProf effectively reduces the number of communication rounds and the overall time (up to 4.5x speedup) for the global model to converge, and also yields accuracy gains.
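To make the client-scoring idea concrete, the following is a minimal, illustrative sketch of score-weighted client sampling. It assumes each client's data representations are summarized as a mean activation vector compared against a global reference profile, and uses a softmax to turn scores into participation probabilities; the functions representation_score and participation_probs are hypothetical names, and the exact profiling and weighting in FedProf may differ.

import numpy as np

def representation_score(local_profile, global_profile):
    """Score a client by the (negative) divergence between its local
    representation profile and the global reference profile.
    Profiles here are mean activation vectors (an assumption)."""
    divergence = np.linalg.norm(local_profile - global_profile)
    return -divergence  # closer match => higher score

def participation_probs(scores, temperature=1.0):
    """Convert raw scores into sampling probabilities via a softmax."""
    z = np.array(scores) / temperature
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

def sample_clients(probs, num_selected, rng=None):
    """Sample a subset of clients for this round, weighted by score."""
    rng = rng or np.random.default_rng()
    return rng.choice(len(probs), size=num_selected, replace=False, p=probs)

# Example: 10 clients, of which the last three hold low-quality data
# whose representations drift far from the global profile.
global_profile = np.zeros(8)
local_profiles = [np.random.normal(0, 0.1, 8) for _ in range(7)] + \
                 [np.random.normal(2.0, 1.0, 8) for _ in range(3)]
scores = [representation_score(lp, global_profile) for lp in local_profiles]
probs = participation_probs(scores, temperature=0.5)
selected = sample_clients(probs, num_selected=4)
print("selection probabilities:", np.round(probs, 3))
print("clients selected this round:", selected)

Under this sketch, clients whose representations diverge from the global profile receive lower scores and are sampled less often, which mirrors the stated goal of mitigating the impact of low-value clients on training.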