Training large deep neural network models is highly challenging due to their tremendous computational and memory requirements. Blockwise distillation provides a promising method for faster convergence by splitting a large model into multiple smaller models. In state-of-the-art blockwise distillation methods, training is performed block-by-block in a data-parallel manner across multiple GPUs. To produce inputs for the student blocks, the teacher model is executed from its first block up to the block currently being trained. However, this results in high overhead from redundant teacher execution, low GPU utilization, and extra data loading. To address these problems, we propose Pipe-BD, a novel parallelization method for blockwise distillation. Pipe-BD aggressively exploits pipeline parallelism for blockwise distillation, eliminating redundant teacher block execution and increasing the per-device batch size for better resource utilization. We further extend Pipe-BD to hybrid parallelism for efficient workload balancing. As a result, Pipe-BD achieves significant acceleration without modifying the mathematical formulation of blockwise distillation. We implement Pipe-BD in PyTorch, and experiments show that it is effective across multiple scenarios, models, and datasets.
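For concreteness, the following is a minimal PyTorch sketch (not the paper's implementation) of the baseline blockwise-distillation loop described above, in which the teacher prefix is re-executed to produce inputs and targets for the student block being trained. The block partitioning, the MSE objective, and names such as `teacher_blocks` and `student_blocks` are illustrative assumptions.

```python
# Minimal sketch of baseline blockwise distillation, assuming teacher and
# student are split into the same number of sequential blocks (nn.ModuleList)
# and each student block is trained to mimic its teacher counterpart with MSE.
import torch
import torch.nn as nn

def train_block(block_idx, teacher_blocks, student_blocks, loader, epochs=1, lr=1e-3):
    """Train student block `block_idx` against teacher block `block_idx`."""
    student = student_blocks[block_idx]
    optimizer = torch.optim.SGD(student.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                # Redundant teacher execution: the entire teacher prefix is
                # re-run just to produce the input to the current block.
                h = x
                for t in teacher_blocks[:block_idx]:
                    h = t(h)
                target = teacher_blocks[block_idx](h)  # teacher output as target
            out = student(h)
            loss = criterion(out, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

Because each block is trained independently in this baseline, every device repeats the teacher prefix for its assigned block; this repeated prefix execution is the overhead that Pipe-BD's pipelined scheduling removes.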