Efficiently running federated learning (FL) on resource-constrained devices is challenging since each device must independently train a computationally intensive deep neural network (DNN). DNN partitioning-based FL (DPFL) has been proposed as one mechanism to accelerate training, in which some layers of the DNN (and their computation) are offloaded from the device to an edge server. However, this introduces significant communication overhead, since activations and gradients must be transferred between the device and the edge server during training. Current techniques reduce the communication introduced by DNN partitioning using local loss-based methods. We demonstrate that these methods adversely impact accuracy and ignore the communication cost incurred when transmitting activations from the device to the server. This paper proposes ActionFed, a communication-efficient DPFL framework for accelerating training on resource-constrained devices. ActionFed eliminates gradient transmission by, for the first time, developing a pre-trained initialization of the device-side DNN, which reduces the accuracy degradation seen in local loss-based methods. In addition, ActionFed proposes a novel replay buffer mechanism and implements a quantization-based compression technique to reduce the transmission of activations. It is experimentally demonstrated that, compared to vanilla DPFL, ActionFed reduces communication cost by up to 15.77x and accelerates training by up to 3.87x.
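To make the partitioning idea concrete, the sketch below simulates one split-training step in PyTorch with 8-bit activation quantization on the uplink and no gradient sent back to the device (the device-side layers are frozen, mirroring the idea of a pre-trained initialization). The model, split point, and the quantize/dequantize helpers are illustrative assumptions for exposition, not ActionFed's implementation.

```python
# Minimal sketch (not the authors' code) of a DPFL-style training step with
# 8-bit activation quantization. Model layers, the split point, and the
# quantize/dequantize helpers are assumptions made for illustration.
import torch
import torch.nn as nn

# Device-side layers, frozen here so no gradient needs to be sent back.
device_part = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
for p in device_part.parameters():
    p.requires_grad = False

# Server-side layers, trained as usual on the edge server.
server_part = nn.Sequential(nn.Flatten(), nn.Linear(16 * 32 * 32, 10))
optimizer = torch.optim.SGD(server_part.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

def quantize(x):
    """Uniform 8-bit quantization of an activation tensor."""
    scale = (x.max() - x.min()).clamp(min=1e-8) / 255.0
    zero_point = x.min()
    q = ((x - zero_point) / scale).round().to(torch.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Reconstruct an approximate float activation on the server."""
    return q.to(torch.float32) * scale + zero_point

# One simulated training step with random data standing in for a real batch.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

with torch.no_grad():                 # device-side forward pass
    activation = device_part(images)
q, scale, zp = quantize(activation)   # compressed payload sent uplink

recovered = dequantize(q, scale, zp)  # server reconstructs the activation
loss = criterion(server_part(recovered), labels)
optimizer.zero_grad()
loss.backward()                       # gradients stay on the server
optimizer.step()
```

Relative to vanilla DPFL, this setup halves the per-batch transfer (no downlink gradient) and shrinks the uplink payload by roughly 4x versus float32 activations; the replay buffer mechanism described in the paper would further reduce uplink traffic but is not sketched here.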