We propose and study a new class of gradient communication mechanisms for communication-efficient training -- three point compressors (3PC) -- as well as efficient distributed nonconvex optimization algorithms that can take advantage of them. Unlike most established approaches, which rely on a static compressor choice (e.g., Top-$K$), our class allows the compressors to {\em evolve} throughout the training process, with the aim of improving the theoretical communication complexity and practical efficiency of the underlying methods. We show that our general approach can recover the recently proposed state-of-the-art error feedback mechanism EF21 (Richt\'arik et al., 2021) and its theoretical properties as a special case, but also leads to a number of new efficient methods. Notably, our approach allows us to improve upon the state of the art in the algorithmic and theoretical foundations of the {\em lazy aggregation} literature (Chen et al., 2018). As a by-product that may be of independent interest, we provide a new and fundamental link between the lazy aggregation and error feedback literature. A special feature of our work is that we do not require the compressors to be unbiased.
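To make the evolving-compressor idea concrete, the following is a minimal Python sketch of the EF21 special case mentioned above, in which the 3PC mechanism reduces to shifting a Top-$K$ compressor by a state $h$ that tracks past gradients. The function names (\texttt{topk}, \texttt{ef21\_step}) and the exact update are our illustrative reconstruction under that reading, not code from the paper.

\begin{verbatim}
import numpy as np

def topk(v, k):
    # Top-K contractive compressor: keep the k largest-magnitude entries.
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef21_step(h, grad, k):
    # One EF21-style update, read as a three point compressor
    # C_{h,y}(x) = h + topk(x - h): the output depends on the point x
    # being compressed and on the state h (in this special case, not
    # on the third point y), and h evolves throughout training.
    message = topk(grad - h, k)   # only k nonzeros are communicated
    h_new = h + message           # worker and server update h identically
    return h_new, message
\end{verbatim}

Because $h$ is updated at every step, the effective compressor changes as training progresses, which is the evolving behavior the 3PC class is meant to capture; by contrast, static Top-$K$ compression of the raw gradient corresponds to resetting $h$ to zero before every step.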