Deep learning frameworks such as TensorFlow and PyTorch provide a productive interface for expressing and training a deep neural network (DNN) model on a single device or with data parallelism. Still, they may not be flexible or efficient enough for training emerging large models on distributed devices, which requires more sophisticated parallelism beyond data parallelism. Plugins and wrappers have been developed to strengthen these frameworks for model or pipeline parallelism, but they complicate the usage and implementation of distributed deep learning. Aiming at a simple, clean redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework built on an SBP (split, broadcast and partial-value) abstraction and the actor model. SBP enables much easier programming of data parallelism and model parallelism than existing frameworks do, and the actor model provides a succinct runtime mechanism to manage the complex dependencies imposed by resource constraints, data movement and computation in distributed deep learning. We demonstrate the general applicability and efficiency of OneFlow for training various large DNN models through case studies and extensive experiments. The results show that OneFlow outperforms many well-known customized libraries built on top of state-of-the-art frameworks. The code of OneFlow is available at: https://github.com/Oneflow-Inc/oneflow.
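To make the SBP abstraction concrete, the following is a minimal NumPy sketch (not OneFlow's API, just an illustration of the semantics) showing how the three SBP signatures describe the distribution of a matmul Y = X @ W across two hypothetical devices: split(0)/broadcast yields data parallelism, broadcast/split(1) yields column-sharded model parallelism, and split(1)/split(0) leaves each device with a partial-value of Y that must be summed.

```python
# Illustration of SBP (split, broadcast, partial-value) semantics with NumPy.
# "Devices" are simulated by holding each shard in a separate array.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))   # activations
W = rng.standard_normal((8, 3))   # weights
Y = X @ W                         # single-device reference result

# Data parallelism: X is split(0) (row shards), W is broadcast (replicated).
# Each device computes a row shard of Y; Y is split(0), recovered by concatenation.
X0, X1 = np.split(X, 2, axis=0)
Y_data_parallel = np.concatenate([X0 @ W, X1 @ W], axis=0)
assert np.allclose(Y, Y_data_parallel)

# Model parallelism (column sharding): X is broadcast, W is split(1).
# Y is split(1), recovered by concatenating column shards.
W0, W1 = np.split(W, 2, axis=1)
Y_model_parallel = np.concatenate([X @ W0, X @ W1], axis=1)
assert np.allclose(Y, Y_model_parallel)

# Model parallelism (row sharding): X is split(1), W is split(0).
# Each device holds a partial-value of Y; the result is recovered by an
# element-wise sum of the partials (an all-reduce in a real system).
Xa, Xb = np.split(X, 2, axis=1)
Wa, Wb = np.split(W, 2, axis=0)
Y_partial_sum = Xa @ Wa + Xb @ Wb
assert np.allclose(Y, Y_partial_sum)
```

In OneFlow itself, these signatures are attached to global tensors as placement and sbp attributes (e.g., flow.sbp.split(0), flow.sbp.broadcast, flow.sbp.partial_sum in recent releases), and the framework derives the required communication between signatures automatically; see the repository linked above for the exact API.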