Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism


April 22, 2023


Xin Chen, Hengheng Zhang, Xiaotao Gu, Kaifeng Bi, Lingxi Xie, Qi Tian

From arXiv. A novel framework for MoE models. Work in progress.

The Mixture of Experts (MoE) model has become an important choice for large language models because it scales with sublinear computational complexity in both training and inference. However, existing MoE models suffer from two critical drawbacks: 1) tremendous intra-node and inter-node communication overhead introduced by all-to-all dispatching and gathering, and 2) limited scalability of the backbone, because data parallelism and expert parallelism are bound together when scaling along the expert dimension. In this paper, we systematically analyze these drawbacks in terms of training efficiency from the perspective of the parallel framework, and propose a novel MoE architecture called Pipeline MoE (PPMoE) to tackle them. PPMoE builds expert parallelism on top of tensor parallelism and replaces the communication-intensive all-to-all dispatching and gathering with simple tensor index slicing and an intra-node all-reduce. Moreover, its flexible parallel architecture makes it convenient for PPMoE to integrate pipeline parallelism to further scale the backbone. Extensive experiments show that PPMoE not only achieves a speedup of more than $1.75\times$ over existing MoE architectures but also reaches $90\%$ of the throughput of its corresponding backbone model, which is $20\times$ smaller.
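The idea of replacing all-to-all dispatch with index slicing plus an all-reduce can be illustrated with a small single-process simulation. This is only a sketch based on the abstract, not the authors' implementation: the function names (`expert_ffn`, `ppmoe_layer_simulated`), the toy scalar "experts", and the assumption that every rank already holds the full token batch and the routing decisions are all hypothetical. The all-reduce is stood in for by an element-wise sum over per-rank partial outputs.

```python
def expert_ffn(token, scale):
    # Toy expert (assumption): scale each feature, then apply ReLU.
    return [max(v * scale, 0.0) for v in token]


def ppmoe_layer_simulated(tokens, expert_ids, expert_scales, num_ranks):
    """Simulate one MoE layer whose experts are split across ranks.

    Every rank sees all tokens, picks out the tokens routed to its local
    experts by index, and writes results into a zero-initialized buffer.
    Summing the buffers plays the role of the all-reduce.
    """
    num_experts = len(expert_scales)
    assert num_experts % num_ranks == 0, "experts must divide evenly"
    per_rank = num_experts // num_ranks
    width = len(tokens[0])

    partials = []
    for rank in range(num_ranks):
        out = [[0.0] * width for _ in tokens]
        for e in range(rank * per_rank, (rank + 1) * per_rank):
            # Index slicing: positions of tokens routed to local expert e.
            idx = [i for i, eid in enumerate(expert_ids) if eid == e]
            for i in idx:
                out[i] = expert_ffn(tokens[i], expert_scales[e])
        partials.append(out)

    # Stand-in for an all-reduce (sum) across ranks.
    return [
        [sum(p[i][j] for p in partials) for j in range(width)]
        for i in range(len(tokens))
    ]


# Usage: 4 tokens, 4 experts, 2 simulated ranks.
tokens = [[1.0, -2.0], [0.5, 3.0], [-1.0, 4.0], [2.0, 2.0]]
expert_ids = [0, 3, 1, 3]            # routing decision per token
expert_scales = [1.0, 2.0, -1.0, 0.5]
combined = ppmoe_layer_simulated(tokens, expert_ids, expert_scales, 2)
reference = [expert_ffn(t, expert_scales[e]) for t, e in zip(tokens, expert_ids)]
print(combined == reference)
```

Because each token is handled by exactly one expert in this sketch, each row is non-zero on at most one rank, so the sum reproduces the result of applying every token's expert directly. No token ever has to be sent to another rank.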
