We propose Adaptive Compressed Gradient Descent (AdaCGD), a novel optimization algorithm for communication-efficient training of supervised machine learning models that adapts its compression level during training. Our approach is inspired by the recently proposed three point compressor (3PC) framework of Richtarik et al. (2022), which includes error feedback (EF21), lazily aggregated gradients (LAG), and their combination as special cases, and offers the current state-of-the-art rates for these methods under weak assumptions. While the above mechanisms offer a fixed compression level, or adapt between two extremes only, our proposal is to perform a much finer adaptation. In particular, we allow the user to choose any number of arbitrarily chosen contractive compression mechanisms, such as Top-K sparsification with a user-defined selection of sparsification levels K, or quantization with a user-defined selection of quantization levels, or their combination. AdaCGD chooses the appropriate compressor and compression level adaptively during the optimization process. Besides i) proposing a theoretically grounded multi-adaptive communication compression mechanism, we further ii) extend the 3PC framework to bidirectional compression, i.e., we allow the server to compress as well, and iii) provide sharp convergence bounds in the strongly convex, convex and nonconvex settings. The convex regime results are new even for several key special cases of our general mechanism, including 3PC and EF21. In all regimes, our rates are superior compared to all existing adaptive compression methods.
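To make the idea of choosing among several contractive compressors concrete, the following is a minimal illustrative sketch in Python. It implements standard Top-K sparsification and a simple error-based rule that picks the smallest user-supplied level K whose relative compression error is below a threshold. The function names, the `tol` parameter, and the selection criterion are assumptions introduced for illustration only; they are not the AdaCGD selection rule, which is defined in the paper.

```python
import numpy as np

def top_k(x, k):
    """Top-K sparsification: keep the k largest-magnitude entries, zero the rest.
    A standard contractive compressor with contraction parameter k / len(x)."""
    out = np.zeros_like(x)
    if k <= 0:
        return out
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

def adaptive_top_k(g, levels, tol=0.5):
    """Illustrative adaptive selection among user-chosen sparsification levels.

    Picks the smallest (cheapest) level K whose relative compression error
    ||g - TopK(g)||^2 / ||g||^2 falls below `tol`. This criterion is a
    hypothetical stand-in for the paper's adaptive rule.
    """
    g_norm2 = np.dot(g, g)
    for k in sorted(levels):                     # try the cheapest level first
        c = top_k(g, k)
        err2 = np.dot(g - c, g - c)
        if g_norm2 == 0.0 or err2 <= tol * g_norm2:
            return c, k
    k = max(levels)
    return top_k(g, k), k                        # fall back to the finest level

# Example: compress a gradient with candidate levels K in {10, 50, 200}.
rng = np.random.default_rng(0)
grad = rng.standard_normal(1000)
compressed, chosen_k = adaptive_top_k(grad, levels=[10, 50, 200])
print(f"chosen K = {chosen_k}")
```

In this toy version the worker sends only the `chosen_k` nonzero entries per round, spending more bandwidth only when a coarse compressor would distort the gradient too much; the same template accommodates quantizers or mixed compressor families by replacing `top_k` with any contractive operator.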