Deep neural networks (DNNs) are the de facto standard for essential use cases, such as image classification, computer vision, and natural language processing. As DNNs and datasets grow larger, they require distributed training on increasingly large clusters. A major bottleneck is then the resulting communication overhead, as workers exchange model updates (i.e., gradients) every round. To address this bottleneck and accelerate training, a widely deployed approach is gradient compression. However, previous deployments often construct bi-directional compression schemes by simply applying a uni-directional gradient compression scheme in each direction. This results in significant computational overhead at the parameter server (PS) and increased compression error, leading to longer training and lower accuracy. We introduce Tensor Homomorphic Compression (THC), a novel bi-directional compression framework that enables the direct aggregation of compressed values while optimizing the bandwidth-to-accuracy tradeoff, thus eliminating the aforementioned overheads. Moreover, THC is compatible with in-network aggregation (INA), which allows for further acceleration. Evaluation on a testbed shows that THC improves time-to-accuracy compared with alternatives by up to 1.32x with a software PS and up to 1.51x using INA. Finally, we demonstrate that THC is scalable and tolerant to acceptable packet-loss rates.
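The abstract's key property is that compressed gradients can be summed directly at the PS (or a switch), avoiding a decompress-aggregate-recompress round trip. The sketch below is a minimal illustration of that homomorphic property using a shared uniform quantization grid; it is not THC's actual scheme, and all function names (`uniform_quantize`, `dequantize`) are illustrative assumptions.

```python
# Minimal sketch, assuming a shared uniform quantization grid across workers.
# Because every worker uses the same scale and grid, integer codes are additive:
# sum(q_i) decodes to (approximately) sum(g_i), so a PS or switch can aggregate
# the compressed values directly. This is NOT THC's actual algorithm.
import numpy as np

def uniform_quantize(grad: np.ndarray, scale: float, levels: int = 256) -> np.ndarray:
    """Stochastically round a gradient onto a shared grid of `levels` values."""
    x = np.clip(grad / scale, -1.0, 1.0)                   # normalize to [-1, 1]
    pos = (x + 1.0) / 2.0 * (levels - 1)                    # map to [0, levels-1]
    low = np.floor(pos)
    q = low + (np.random.rand(*pos.shape) < (pos - low))    # unbiased rounding
    return q.astype(np.int32)

def dequantize(q_sum: np.ndarray, scale: float, levels: int, num_workers: int) -> np.ndarray:
    """Recover the approximate gradient sum from the summed integer codes."""
    return (q_sum / (levels - 1) * 2.0 - num_workers) * scale

# Toy run with 4 workers and a shared normalization scale.
workers = [np.random.randn(10).astype(np.float32) for _ in range(4)]
scale = max(np.abs(g).max() for g in workers)
codes = [uniform_quantize(g, scale) for g in workers]       # compress at workers
summed_codes = np.sum(codes, axis=0)                        # aggregate compressed values
approx_sum = dequantize(summed_codes, scale, 256, len(workers))

err = np.abs(approx_sum - np.sum(workers, axis=0)).max()
print(f"max aggregation error: {err:.4f} (grid step = {2 * scale / 255:.4f})")
```

The essential design point is that aggregation happens entirely in the compressed (integer) domain, which is also what makes the approach compatible with in-network aggregation on programmable switches.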