Distributed sparse deep learning is widely used in many Internet-scale applications, and network communication is one of the major bottlenecks for training performance. In-network gradient aggregation on programmable switches is a promising way to speed up training. However, existing in-network aggregation solutions are designed for distributed dense deep training and fall short when applied to sparse deep training. To address this gap, we present Libra, which builds on our key observation that parameter update frequencies in distributed sparse deep training are extremely skewed. Specifically, Libra offloads onto programmable switches only the aggregation of "hot" parameters that are updated frequently. To enable this offloading and achieve high aggregation throughput, we propose solutions to the challenges of hot-parameter identification, parameter orchestration, floating-point summation on switches, and system reliability. We implemented Libra on Intel Tofino switches and integrated it with PS-lite. Finally, we evaluate Libra's performance through extensive experiments and show that Libra speeds up gradient aggregation by 1.5 to 4 times.
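To illustrate the core idea, the following is a minimal Python sketch of partitioning parameters by observed update frequency and routing only the "hot" ones toward an in-network aggregator. The names `switch_aggregator` and `param_server` are hypothetical stand-ins, not Libra's actual API; the frequency threshold is likewise an assumption made for illustration.

```python
from collections import Counter

def partition_parameters(update_log, hot_fraction=0.01):
    """Split parameter IDs into 'hot' (frequently updated) and 'cold' sets.

    update_log: iterable of parameter IDs touched by sparse gradient updates
    (e.g. embedding rows accessed over recent iterations).
    hot_fraction: assumed fraction of distinct parameters to treat as hot.
    """
    freq = Counter(update_log)
    ranked = [pid for pid, _ in freq.most_common()]
    cutoff = max(1, int(len(ranked) * hot_fraction))
    return set(ranked[:cutoff]), set(ranked[cutoff:])

def route_gradient(param_id, grad, hot_set, switch_aggregator, param_server):
    """Send hot-parameter gradients to the in-network aggregator; send the
    long tail of cold parameters to an ordinary parameter server."""
    if param_id in hot_set:
        switch_aggregator.push(param_id, grad)   # aggregated on the switch
    else:
        param_server.push(param_id, grad)        # aggregated at the server
```

The sketch only captures the hot/cold split and routing decision; the paper's actual mechanisms for on-switch floating-point summation and fault handling are beyond this example.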