In this paper, we consider hybrid parallelism -- a paradigm that employs both Data Parallelism (DP) and Model Parallelism (MP) -- to scale distributed training of large recommendation models. We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training. DCT filters the entities to be communicated across the network through a simple hard-thresholding function, allowing only the most relevant information to pass through. For communication efficient DP, DCT compresses the parameter gradients sent to the parameter server during model synchronization. The threshold is updated only once every few thousand iterations to reduce the computational overhead of compression. For communication efficient MP, DCT incorporates a novel technique to compress the activations and gradients sent across the network during the forward and backward propagation, respectively. This is done by identifying and updating only the most relevant neurons of the neural network for each training sample in the data. We evaluate DCT on publicly available natural language processing and recommender models and datasets, as well as recommendation systems used in production at Facebook. DCT reduces communication by at least $100\times$ and $20\times$ during DP and MP, respectively. The algorithm has been deployed in production, and it improves end-to-end training time for a state-of-the-art industrial recommender model by 37\%, without any loss in performance.
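Below is a minimal sketch of the hard-thresholding idea described above, written in PyTorch. The function and parameter names (`hard_threshold_compress`, `update_threshold`, `keep_fraction`) are illustrative assumptions, not the paper's implementation; the sketch only shows how a magnitude threshold can sparsify a gradient before communication, and how that threshold could be refreshed infrequently to amortize its cost.

```python
import torch


def update_threshold(grad: torch.Tensor, keep_fraction: float = 0.01) -> float:
    """Pick a threshold tau so roughly `keep_fraction` of entries survive.

    Hypothetical helper: in DCT the threshold is recomputed only once every
    few thousand iterations, so the cost of this top-k scan is amortized.
    """
    k = max(1, int(keep_fraction * grad.numel()))
    return grad.abs().flatten().topk(k).values.min().item()


def hard_threshold_compress(grad: torch.Tensor, tau: float):
    """Keep only entries whose magnitude reaches the threshold tau.

    Returns the surviving values and their indices -- the sparse payload
    that would actually be sent over the network.
    """
    mask = grad.abs() >= tau
    values = grad[mask]
    indices = mask.nonzero(as_tuple=False)
    return values, indices


# Usage sketch: refresh tau occasionally, compress every iteration.
grad = torch.randn(1_000_000)
tau = update_threshold(grad, keep_fraction=0.01)
values, indices = hard_threshold_compress(grad, tau)
```

The same thresholding pattern applies, per the abstract, to activations and activation gradients in the model-parallel case, where the surviving entries correspond to the most relevant neurons for a given training sample.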