In order to achieve the dual goals of privacy and learning across distributed data, Federated Learning (FL) systems rely on frequent exchanges of large files (model updates) between a set of clients and a server. As such, FL systems are exposed to, and indeed can cause, congestion across a wide set of network resources. Lossy compression can be used to reduce the size of exchanged files, and the associated delays, at the cost of adding noise to model updates. By judiciously adapting clients' compression to varying network congestion, an FL application can reduce wall-clock training time. To that end, we propose a Network Adaptive Compression (NAC-FL) policy, which dynamically adapts clients' lossy compression choices to variations in network congestion. We prove, under appropriate assumptions, that NAC-FL is asymptotically optimal in terms of directly minimizing the expected wall-clock training time. Further, we show via simulation that NAC-FL achieves robust performance improvements, with higher gains in settings where delays are positively correlated across time.