Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning


October 20, 2021


Ningning Xie, Tamara Norman, Dominik Grewe, Dimitrios Vytiniotis

From arXiv; submitted to the 5th MLSys Conference

We present a novel characterization of the mapping of multiple parallelism forms (e.g. data and model parallelism) onto hierarchical accelerator systems that is hierarchy-aware and greatly reduces the space of software-to-hardware mapping. We experimentally verify the substantial effect of these mappings on all-reduce performance (up to 448x). We offer a novel syntax-guided program synthesis framework that is able to decompose reductions over one or more parallelism axes to sequences of collectives in a hierarchy- and mapping-aware way. For 69% of parallelism placements and user requested reductions, our framework synthesizes programs that outperform the default all-reduce implementation when evaluated on different GPU hierarchies (max 2.04x, average 1.27x). We complement our synthesis tool with a simulator exceeding 90% top-10 accuracy, which therefore reduces the need for massive evaluations of synthesis results to determine a small set of optimal programs and mappings.
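
To make the idea of decomposing a reduction into a sequence of collectives concrete, here is a minimal sketch in plain Python. It is not the paper's synthesis framework or one of its synthesized programs: it simulates one well-known decomposition (reduce-scatter within a node, all-reduce across nodes, all-gather within a node) on an assumed two-level hierarchy, and checks that it produces the same result as a single flat all-reduce. All function names and the `data[node][gpu]` layout are hypothetical and chosen only for illustration; the sketch models results, not communication cost.

```python
from typing import List

Vector = List[float]


def all_reduce(group: List[Vector]) -> List[Vector]:
    """Every member ends up with the element-wise sum over the group."""
    total = [sum(vals) for vals in zip(*group)]
    return [list(total) for _ in group]


def reduce_scatter(group: List[Vector]) -> List[Vector]:
    """Member i ends up with only the i-th chunk of the element-wise sum."""
    total = [sum(vals) for vals in zip(*group)]
    chunk = len(total) // len(group)  # assumes length divisible by group size
    return [total[i * chunk:(i + 1) * chunk] for i in range(len(group))]


def all_gather(group: List[Vector]) -> List[Vector]:
    """Every member ends up with the concatenation of all members' chunks."""
    full = [x for shard in group for x in shard]
    return [list(full) for _ in group]


def flat_all_reduce(data: List[List[Vector]]) -> List[List[Vector]]:
    """Baseline: one all-reduce over every GPU, ignoring the hierarchy."""
    gpus = len(data[0])
    reduced = all_reduce([vec for node in data for vec in node])
    return [reduced[n * gpus:(n + 1) * gpus] for n in range(len(data))]


def hierarchical_all_reduce(data: List[List[Vector]]) -> List[List[Vector]]:
    """Hierarchy-aware variant; data[n][g] is the vector on GPU g of node n."""
    num_nodes, gpus = len(data), len(data[0])
    # Step 1: reduce-scatter inside each node (the assumed fast local links).
    shards = [reduce_scatter(node) for node in data]
    # Step 2: all-reduce across nodes, one group per local GPU index.
    for g in range(gpus):
        reduced = all_reduce([shards[n][g] for n in range(num_nodes)])
        for n in range(num_nodes):
            shards[n][g] = reduced[n]
    # Step 3: all-gather inside each node to rebuild the full vector.
    return [all_gather(node) for node in shards]


# Assumed toy system: 2 nodes with 2 GPUs each, one length-4 vector per GPU.
data = [
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
    [[9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]],
]
result = hierarchical_all_reduce(data)
print(result[0][0])  # [28.0, 32.0, 36.0, 40.0]
print(result == flat_all_reduce(data))  # True
```

Both variants compute the same sums; they differ only in which collectives run over which links. In this sketch the cross-node step moves a shard of the vector rather than the whole vector, which is the usual motivation for such a decomposition.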

