S2 减少: 快速分配深层学习的高性能差分通信 (S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning) - 专知论文

会员服务 ·

0

可约的 · 稀疏 · SGD · state-of-the-art · Extensibility ·

2021 年 11 月 26 日

S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning

翻译：S2 减少: 快速分配深层学习的高性能差分通信

Keshi Ge,Yongquan Fu,Zhiquan Lai,Xiaoge Deng,Dongsheng Li

from arxiv, 10 pages, 8 figures

Distributed stochastic gradient descent (SGD) approach has been widely used in large-scale deep learning, and the gradient collective method is vital to ensure the training scalability of the distributed deep learning system. Collective communication such as AllReduce has been widely adopted for the distributed SGD process to reduce the communication time. However, AllReduce incurs large bandwidth resources while most gradients are sparse in many cases since many gradient values are zeros and should be efficiently compressed for bandwidth saving. To reduce the sparse gradient communication overhead, we propose Sparse-Sketch Reducer (S2 Reducer), a novel sketch-based sparse gradient aggregation method with convergence guarantees. S2 Reducer reduces the communication cost by only compressing the non-zero gradients with count-sketch and bitmap, and enables the efficient AllReduce operators for parallel SGD training. We perform extensive evaluation against four state-of-the-art methods over five training models. Our results show that S2 Reducer converges to the same accuracy, reduces 81\% sparse communication overhead, and achieves 1.8$ \times $ speedup compared to state-of-the-art approaches.

翻译：在大型深层学习中广泛采用分布式梯度梯度下降(SGD)方法,而梯度集体方法对于确保分布式深层学习系统的培训可扩展性至关重要; 分散式 SGD 进程广泛采用AllRedue等集体通信,以减少通信时间; 然而, AllReduce 产生大型带宽资源,而由于许多梯度值为零,在许多情况下,大多数梯度是稀疏的,应当为节省带宽而有效压缩; 为了减少稀薄的梯度通信间接费用,我们建议采用基于草图的稀释式稀释梯度汇总法(S2递减器),这是一种具有趋同保证的新颖的、基于草图的稀释梯度汇总法。 S2 降低通信成本,只需用点数和位图压缩非零梯度梯度来压缩非零梯度梯度的通信费用,使高效的Alluceuse操作员能够进行平行SGD培训。我们对五个培训模式的四种最先进的方法进行了广泛的评价。我们的结果表明,S2 降低器与相同的精度接近于相同的精度,减少了81 ⁇ 稀释式通信间接费用,减少了81 ⁇ 稀释式通信间接费用,并达到1.8美元。

0

相关内容

可约的

【图与几何深度学习，53页ppt】Graph and geometric deep learning

专知会员服务

90+阅读 · 2021年6月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【UCSD-MIT】深度学习隐私综述论文，Privacy in Deep Learning: A Survey

【UCSD-MIT】深度学习隐私综述论文，Privacy in Deep Learning: A Survey

专知会员服务

68+阅读 · 2020年4月28日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

专知会员服务

32+阅读 · 2019年10月12日

Federated Learning: 架构

Federated Learning: 架构

AINLP

4+阅读 · 2020年9月20日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知

6+阅读 · 2020年1月16日

人工智能 | ACCV 2020等国际会议信息5条

人工智能 | ACCV 2020等国际会议信息5条

Call4Papers

6+阅读 · 2019年6月21日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Communication-Efficient Consensus Mechanism for Federated Reinforcement Learning

Arxiv

0+阅读 · 2022年1月30日

FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients

Arxiv

0+阅读 · 2022年1月28日

A Deep Reinforcement Learning Approach for Service Migration in MEC-enabled Vehicular Networks

Arxiv

0+阅读 · 2022年1月27日

OneFlow: Redesign the Distributed Deep Learning Framework from Scratch

Arxiv

0+阅读 · 2022年1月27日

Data-Aware Device Scheduling for Federated Edge Learning

Arxiv

0+阅读 · 2022年1月27日

Data-Quality Based Scheduling for Federated Edge Learning

Arxiv

0+阅读 · 2022年1月27日

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

Arxiv

4+阅读 · 2021年6月18日

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Arxiv

11+阅读 · 2020年2月18日

Accelerated Methods for Deep Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning

Arxiv

6+阅读 · 2019年1月10日

Accelerated Reinforcement Learning

Arxiv

6+阅读 · 2018年4月24日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

【图与几何深度学习，53页ppt】Graph and geometric deep learning

专知会员服务

90+阅读 · 2021年6月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【UCSD-MIT】深度学习隐私综述论文，Privacy in Deep Learning: A Survey

【UCSD-MIT】深度学习隐私综述论文，Privacy in Deep Learning: A Survey

专知会员服务

68+阅读 · 2020年4月28日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

59+阅读 · 2020年1月25日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

【Pieter Abbeel 报告@CMU】元学习与深度强化学习机器人应用，Deep Learning to Learn，84页ppt

专知会员服务

32+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

《复杂工程系统模型驱动设计决策支持系统：早期设计阶段挑战》最新138页

《日本陆上自卫队2040年作战方式与未来作战研究》最新23页slides

人工智能作为战争武器

《后勤保障》最新23页

相关资讯

Federated Learning: 架构

Federated Learning: 架构

AINLP

4+阅读 · 2020年9月20日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知

6+阅读 · 2020年1月16日

人工智能 | ACCV 2020等国际会议信息5条

人工智能 | ACCV 2020等国际会议信息5条

Call4Papers

6+阅读 · 2019年6月21日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

相关论文

Communication-Efficient Consensus Mechanism for Federated Reinforcement Learning

Arxiv

0+阅读 · 2022年1月30日

FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients

Arxiv

0+阅读 · 2022年1月28日

A Deep Reinforcement Learning Approach for Service Migration in MEC-enabled Vehicular Networks

Arxiv

0+阅读 · 2022年1月27日

OneFlow: Redesign the Distributed Deep Learning Framework from Scratch

Arxiv

0+阅读 · 2022年1月27日

Data-Aware Device Scheduling for Federated Edge Learning

Arxiv

0+阅读 · 2022年1月27日

Data-Quality Based Scheduling for Federated Edge Learning

Arxiv

0+阅读 · 2022年1月27日

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

Arxiv

4+阅读 · 2021年6月18日

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Arxiv

11+阅读 · 2020年2月18日

Accelerated Methods for Deep Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning

Arxiv

6+阅读 · 2019年1月10日

Accelerated Reinforcement Learning

Arxiv

6+阅读 · 2018年4月24日

微信扫码咨询专知VIP会员