Federated learning (FL) is a promising technique that enables many edge devices to collaboratively train a machine learning model over wireless networks. By exploiting the superposition property of wireless waveforms, over-the-air computation (AirComp) can accelerate model aggregation and hence facilitate communication-efficient FL. Due to channel fading, power control is crucial in AirComp. Prior works assume that the signals to be aggregated from each device, i.e., the local gradients, have identical statistics. In FL, however, gradient statistics vary over both training iterations and feature dimensions, and are unknown in advance. This paper studies the power control problem for over-the-air FL by taking gradient statistics into account. The goal is to minimize the aggregation error by optimizing the transmit power at each device subject to peak power constraints. We obtain the optimal power-control policy in closed form when the gradient statistics are given. Notably, we show that the optimal transmit power is continuous and monotonically decreases with the squared multivariate coefficient of variation (SMCV) of the gradient vector. We then propose a method to estimate gradient statistics with negligible communication cost. Experimental results demonstrate that the proposed gradient-statistics-aware power control achieves higher test accuracy than existing schemes across a wide range of scenarios.
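To make the SMCV quantity concrete, the sketch below estimates per-dimension gradient mean and variance from the devices' local gradients and computes one common form of the SMCV (total variance divided by total squared mean across feature dimensions). The function names, the specific SMCV form, and the use of NumPy are illustrative assumptions for exposition only; they are not the paper's closed-form policy or its low-cost estimation method.

```python
import numpy as np

def estimate_gradient_statistics(local_gradients):
    """Estimate per-dimension gradient statistics at one training iteration.

    local_gradients: array of shape (num_devices, model_dim), one local
    gradient vector per device. Returns per-dimension mean and variance.
    """
    mean = local_gradients.mean(axis=0)   # first moment per feature dimension
    var = local_gradients.var(axis=0)     # second central moment per dimension
    return mean, var

def squared_multivariate_cv(mean, var, eps=1e-12):
    """Squared multivariate coefficient of variation (SMCV), assumed here as
    the ratio of total gradient variance to total squared gradient mean."""
    return var.sum() / (np.sum(mean ** 2) + eps)

# Usage: synthetic gradients from 10 devices on a 1000-dimensional model.
rng = np.random.default_rng(0)
grads = rng.normal(loc=0.1, scale=0.5, size=(10, 1000))
mu, sigma2 = estimate_gradient_statistics(grads)
print("SMCV estimate:", squared_multivariate_cv(mu, sigma2))
```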