Synthesizing Optimal Collective Algorithms - 专知 (Zhuanzhi) Papers

January 4, 2021

Synthesizing Optimal Collective Algorithms

Zixian Cai, Zhengyang Liu, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi

From arXiv. Zixian Cai and Zhengyang Liu contributed equally to the paper. The work was done during internships at Microsoft Research. To appear at PPoPP 2021.

Collective communication algorithms are an important component of distributed computation. Indeed, in the case of deep learning, collective communication is the Amdahl's bottleneck of data-parallel training. This paper introduces SCCL (for Synthesized Collective Communication Library), a systematic approach to synthesizing collective communication algorithms that are explicitly tailored to a particular hardware topology. SCCL synthesizes algorithms along the Pareto frontier spanning from latency-optimal to bandwidth-optimal implementations of a collective. The paper demonstrates how to encode SCCL's synthesis as a quantifier-free SMT formula which can be discharged to a theorem prover. We further demonstrate how to scale our synthesis by exploiting symmetries in topologies and collectives. We synthesize and introduce novel latency-optimal and bandwidth-optimal algorithms not seen in the literature on two popular hardware topologies. We also show how SCCL efficiently lowers algorithms to implementations on two hardware architectures (NVIDIA and AMD) and demonstrate competitive performance with hand-optimized collective communication libraries.
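
To make "tailored to a particular hardware topology" concrete, the following is a minimal sketch, not SCCL's SMT encoding. It assumes a topology given as a list of directed links and computes the graph diameter, which is a simple lower bound on the number of communication steps any allgather-style collective needs on that topology, since data from every node must reach every other node. The function names `diameter` and `bidirectional_ring` are hypothetical and chosen only for illustration.

```python
from collections import deque


def diameter(links, num_nodes):
    """Return the longest shortest-path length (in hops) between any two nodes.

    `links` is a list of directed (src, dst) pairs. Raises ValueError if some
    node cannot reach every other node.
    """
    adjacency = {node: [] for node in range(num_nodes)}
    for src, dst in links:
        adjacency[src].append(dst)

    longest = 0
    for source in range(num_nodes):
        # Breadth-first search gives hop counts from `source` to all nodes.
        hops = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in hops:
                    hops[neighbor] = hops[node] + 1
                    queue.append(neighbor)
        if len(hops) < num_nodes:
            raise ValueError("topology is not strongly connected")
        longest = max(longest, max(hops.values()))
    return longest


def bidirectional_ring(num_nodes):
    """Illustrative topology: each node links to both of its ring neighbors."""
    forward = [(i, (i + 1) % num_nodes) for i in range(num_nodes)]
    backward = [(i, (i - 1) % num_nodes) for i in range(num_nodes)]
    return forward + backward


if __name__ == "__main__":
    # Step lower bound for a bidirectional ring of 8 nodes.
    print(diameter(bidirectional_ring(8), 8))
```

A synthesizer in the spirit of the abstract would start from a bound like this and search for an actual schedule of sends; the sketch covers only the bound, not the search.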
