This paper proposes DisCo, an automatic deep learning compilation module for data-parallel distributed training. Unlike most deep learning compilers that focus on training or inference on a single device, DisCo optimizes a DNN model for distributed training over multiple GPU machines. Existing single-device compilation strategies do not work well in distributed training, mainly due to the communication inefficiency they incur. DisCo jointly generates optimized computation operator fusion and communication tensor fusion strategies to enable highly efficient distributed training. A GNN-based simulator is built to effectively estimate the per-iteration training time achieved by candidate operator/tensor fusion strategies. Driven by the simulator, a backtracking search algorithm navigates the large strategy space efficiently to identify good operator/tensor fusion strategies that minimize distributed training time. We compare DisCo with existing DL fusion schemes and show that it achieves training speed-up close to the ideal case of full computation-communication overlap.
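To make the simulator-driven backtracking search concrete, the following is a minimal Python sketch, not DisCo's implementation: the cost function `estimate_iter_time` is a toy stand-in for the GNN-based simulator, and the per-group fuse/no-fuse decisions are a simplification of the joint operator/tensor fusion strategy space described above.

```python
# Hypothetical sketch of a simulator-guided backtracking search over fusion
# decisions. All names here are illustrative, not DisCo's actual API.
from typing import List

def estimate_iter_time(decisions: List[int]) -> float:
    """Toy stand-in for the GNN-based simulator: predicts per-iteration
    training time for a (possibly partial) fusion strategy."""
    # Assume each un-fused group adds overhead that hurts overlap.
    return 1.0 + 0.1 * sum(1 for d in decisions if d == 0)

def backtrack(decisions: List[int], num_groups: int,
              best: List[float], best_plan: List[List[int]]) -> None:
    """Enumerate fuse/no-fuse choices per group, pruning branches whose
    simulated time already exceeds the best complete plan found so far."""
    est = estimate_iter_time(decisions)
    if est >= best[0]:
        return  # prune: this partial plan is already worse than the incumbent
    if len(decisions) == num_groups:
        best[0], best_plan[0] = est, decisions[:]  # record new incumbent
        return
    for choice in (1, 0):  # 1 = fuse this group, 0 = keep it separate
        backtrack(decisions + [choice], num_groups, best, best_plan)

best, best_plan = [float("inf")], [[]]
backtrack([], num_groups=4, best=best, best_plan=best_plan)
print(best_plan[0], best[0])
```

The pruning step relies on the estimated time of a partial plan being a lower bound on the time of any completion, which holds for this toy cost model; the actual system's simulator and search would handle a far richer strategy space.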