Federated learning (FL) is an emerging machine learning (ML) paradigm that enables heterogeneous edge devices to collaboratively train ML models without revealing their raw data to a logically centralized server. Heterogeneity across participants is a fundamental challenge in FL, both in their data, which is not independent and identically distributed (non-IID), and in their device capabilities. Many existing works present point solutions to issues such as slow convergence, low final accuracy, and bias in FL, all of which stem from client heterogeneity. We observe that, in a large population, there exist groups of clients with statistically similar data distributions (cohorts). In this paper, we propose Auxo to gradually identify cohorts in large-scale, low-participation, and resource-constrained FL populations. Auxo then adaptively determines how to train cohort-specific models in order to achieve better model performance and ensure resource efficiency. By identifying cohorts with lower internal heterogeneity and performing efficient cohort-based training, Auxo substantially improves on state-of-the-art solutions in terms of final accuracy, convergence time, and model bias, as our extensive evaluations show.
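To make the notion of a cohort concrete, the following toy sketch groups clients whose label distributions are statistically similar. It is not Auxo's algorithm: it assumes, purely for illustration, that each client's label histogram is directly available (a real FL system would not expose this) and uses plain k-means as a stand-in clustering method. All function names and the example data are hypothetical.

```python
import math
from collections import Counter


def label_distribution(labels, num_classes):
    """Normalized label histogram for one client's local data."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def distance(p, q):
    """Euclidean distance between two distributions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def assign_cohorts(dists, k, iters=10):
    """Plain k-means over client label distributions (illustrative only)."""
    # Deterministic farthest-point initialization, so the result is reproducible.
    centers = [dists[0]]
    while len(centers) < k:
        centers.append(
            max(dists, key=lambda d: min(distance(d, c) for c in centers))
        )
    assignment = [0] * len(dists)
    for _ in range(iters):
        # Assign each client to its nearest cohort center.
        assignment = [
            min(range(k), key=lambda j: distance(d, centers[j])) for d in dists
        ]
        # Move each center to the mean of its current members.
        for j in range(k):
            members = [d for d, a in zip(dists, assignment) if a == j]
            if members:
                centers[j] = mean(members)
    return assignment


# Hypothetical population: clients "a" and "b" mostly hold class 0,
# clients "c" and "d" mostly hold class 2.
clients = {
    "a": [0, 0, 0, 1],
    "b": [0, 0, 1, 0],
    "c": [2, 2, 2, 3],
    "d": [3, 2, 2, 2],
}
dists = [label_distribution(labels, num_classes=4) for labels in clients.values()]
cohorts = assign_cohorts(dists, k=2)
print(dict(zip(clients, cohorts)))
```

In this example the two clients dominated by class 0 fall into one cohort and the two dominated by class 2 into the other, so each cohort could then train its own model on more homogeneous data.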