Federated learning is a powerful distributed learning scheme that allows numerous edge devices to collaboratively train a model without sharing their data. However, training is resource-intensive for edge devices, and limited network bandwidth is often the main bottleneck. Prior work often overcomes these constraints by condensing the models or messages into compact formats, e.g., via gradient compression or distillation. In contrast, we propose ProgFed, the first progressive training framework for efficient and effective federated learning. It inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models. We theoretically prove that ProgFed converges at the same asymptotic rate as standard training on full models. Extensive results on a broad range of architectures, including CNNs (VGG, ResNet, ConvNets) and U-nets, and on diverse tasks from simple classification to medical image segmentation show that our highly effective training approach saves up to $20\%$ computation and up to $63\%$ communication costs for converged models. As our approach is also complementary to prior work on compression, we can achieve a wide range of trade-offs, showing a reduction in communication of up to $50\times$ at only $0.1\%$ loss in utility.
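The savings reported above stem from clients training and exchanging only a gradually growing portion of the model, so early rounds cost a fraction of full-model rounds. The following is a minimal sketch of this idea as a toy FedAvg-style loop; the stage partitioning, the growth schedule, and the simulated local updates are illustrative assumptions, not details taken from this abstract.

```python
# Minimal sketch of progressive federated training (illustrative assumptions:
# the model is split into sequential "stages" and a new stage is activated
# every few rounds; only active stages are trained and communicated).
import numpy as np

NUM_STAGES = 4          # model partitioned into 4 sequential stages
ROUNDS_PER_STAGE = 5    # rounds before the next stage is activated
NUM_CLIENTS = 8
DIM = 16                # toy per-stage parameter dimension

# Global model: one parameter vector per stage.
global_stages = [np.zeros(DIM) for _ in range(NUM_STAGES)]

def client_update(active, client_id):
    """Simulate local training: return pseudo-updates only for the currently
    active stages (inactive stages are neither trained nor transmitted)."""
    rng = np.random.default_rng(client_id)
    return [rng.normal(size=DIM) * 0.01 for _ in range(active)]

total_floats_sent = 0
for rnd in range(NUM_STAGES * ROUNDS_PER_STAGE):
    active = min(rnd // ROUNDS_PER_STAGE + 1, NUM_STAGES)  # grow the model
    # Server broadcasts only the active prefix; clients return updates for it.
    updates = [client_update(active, c) for c in range(NUM_CLIENTS)]
    # FedAvg-style aggregation over the active stages only.
    for s in range(active):
        global_stages[s] += np.mean([u[s] for u in updates], axis=0)
    # Two-way traffic scales with the active stages, not the full model.
    total_floats_sent += 2 * NUM_CLIENTS * active * DIM

full_training_floats = 2 * NUM_CLIENTS * NUM_STAGES * DIM * (NUM_STAGES * ROUNDS_PER_STAGE)
print(f"traffic ratio vs. full-model training: {total_floats_sent / full_training_floats:.2f}")
```

In this toy setup, the two-way traffic of progressive training is about $62\%$ of what full-model training would send over the same number of rounds; the actual savings reported in the paper depend on the architecture, schedule, and task.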