Distributed deep learning is becoming increasingly popular as the demand for computing resources grows with the number of parameters in deep learning models. Unlike traditional training approaches, data-parallel training allows multiple compute nodes to train a large deep learning model simultaneously, boosting training efficiency. In this paper, we present and qualitatively compare six strategies for data-parallel training with PyTorch on GPT-2, a language model with 100M parameters. These strategies are Single GPU, Single Parameter Server, Distributed Parameter Server, Horovod, Distributed Parameter Server with the Apex mixed-precision strategy, and Horovod with the Apex mixed-precision strategy. We also analyze the quantitative experimental results of each strategy. We conclude that the Distributed Parameter Server with the Apex mixed-precision strategy performs best for single-node training, while Horovod with Apex is the most robust approach for both single-node and multi-node training.
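To make the best-performing combination concrete, the following is a minimal sketch of how distributed data-parallel training with Apex mixed precision is typically wired together in PyTorch, assuming the "Distributed Parameter Server" strategy corresponds to torch.nn.parallel.DistributedDataParallel. This is not the authors' exact configuration: build_gpt2_model and get_dataloader are hypothetical placeholders, and the optimizer, learning rate, and opt_level are illustrative assumptions.

```python
# Minimal sketch (not the paper's exact setup): DistributedDataParallel + NVIDIA Apex
# mixed precision, launched with one process per GPU (e.g. via torchrun).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from apex import amp  # assumes NVIDIA Apex is installed


def main():
    # Rank and device assignment come from the launcher's environment variables.
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    model = build_gpt2_model().cuda()      # hypothetical helper for a ~100M-parameter GPT-2
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed hyperparameters

    # Apex mixed precision: O1 patches selected ops to run in FP16, keeping FP32 master weights.
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    # Wrap the model so gradients are all-reduced across processes during backward().
    model = DDP(model, device_ids=[local_rank])

    for batch in get_dataloader():         # hypothetical loader using a DistributedSampler
        optimizer.zero_grad()
        loss = model(batch)                # placeholder model is assumed to return the loss
        # Scale the loss so FP16 gradients do not underflow before the all-reduce.
        with amp.scale_loss(loss, optimizer) as scaled_loss:
            scaled_loss.backward()
        optimizer.step()


if __name__ == "__main__":
    main()
```

Under this sketch, each process holds a full replica of the model and a shard of the data; adding nodes only changes the launcher configuration, not the training loop itself.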