Federated learning (FL) enables a cluster of decentralized mobile devices at the edge to collaboratively train a shared machine learning model while keeping all raw training samples on device. This decentralized training approach has been demonstrated to be a practical solution for mitigating the risk of privacy leakage. However, enabling efficient FL deployment at the edge is challenging because of non-IID training data distributions, wide system heterogeneity, and stochastically varying runtime effects in the field. This paper jointly optimizes the time-to-convergence and energy efficiency of state-of-the-art FL use cases while taking into account the stochastic nature of edge execution. We propose AutoFL, a tailor-designed reinforcement learning algorithm that learns to select the K participant devices and the per-device execution targets for each FL model aggregation round in the presence of stochastic runtime variance and system and data heterogeneity. By judiciously considering the unique characteristics of FL edge deployment, AutoFL achieves 3.6 times faster model convergence and 4.7 and 5.2 times higher energy efficiency for local clients and globally over the cluster of K participants, respectively.
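The paper's RL agent is substantially richer than any toy model, but the core idea of learning which K devices to select each aggregation round can be sketched with a simple epsilon-greedy bandit. Everything below (function names, the reward model rewarding "fast" devices, the cluster size) is an illustrative assumption, not AutoFL's actual algorithm; in practice the reward would combine measured round latency and per-device energy.

```python
import random

def select_participants(q_values, K, epsilon, rng):
    """Epsilon-greedy choice of K device indices for one aggregation round."""
    if rng.random() < epsilon:                    # explore: uniform random sample
        return rng.sample(range(len(q_values)), K)
    # exploit: pick the K devices with the highest estimated reward
    return sorted(range(len(q_values)), key=q_values.__getitem__, reverse=True)[:K]

def update_estimates(q_values, counts, chosen, rewards):
    """Incremental-mean update of the per-device reward estimates."""
    for i, r in zip(chosen, rewards):
        counts[i] += 1
        q_values[i] += (r - q_values[i]) / counts[i]

# Toy cluster: devices 0-4 are fast/energy-efficient (reward 1), 5-9 are slow (reward 0).
rng = random.Random(0)
q, counts = [0.0] * 10, [0] * 10
for _ in range(200):                              # 200 simulated aggregation rounds
    chosen = select_participants(q, K=3, epsilon=0.2, rng=rng)
    rewards = [1.0 if i < 5 else 0.0 for i in chosen]  # stand-in for latency/energy feedback
    update_estimates(q, counts, chosen, rewards)

top = select_participants(q, K=3, epsilon=0.0, rng=rng)  # pure exploitation after learning
```

After enough rounds the pure-exploitation selection concentrates on the well-performing devices, which is the behavior the abstract describes, here under a stationary reward; the paper's setting additionally handles stochastic runtime variance and non-IID data.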