Large-scale machine learning and data mining methods routinely distribute computations across multiple agents to parallelize processing. The time required for the computations at the agents is affected by the availability of local resources, giving rise to the "straggler problem", in which the computation results are held back by unresponsive agents. For this problem, linear coding of the matrix sub-blocks can be used to introduce resilience against straggling. The Parameter Server (PS) applies a channel code and distributes the matrices to the workers for multiplication. It then produces an approximation of the desired matrix multiplication from the results of the computations received by a given deadline. In this paper, we propose to employ Unequal Error Protection (UEP) codes to alleviate the straggler problem. The level of protection assigned to each sub-block is chosen according to its norm, since blocks with larger norms have a greater effect on the result of the matrix multiplication. We validate the effectiveness of our scheme both theoretically and through numerical evaluations. We derive a theoretical characterization of the performance of UEP with random linear codes and compare it to the case of Equal Error Protection (EEP). We also apply the proposed coding strategy to the computation of the back-propagation step in the training of a Deep Neural Network (DNN), for which we investigate the fundamental trade-off between the precision of the computation and the time required to complete it.
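As a rough illustration of the setting described above, the following Python sketch distributes a matrix-vector product over randomly coded sub-blocks, assigns more redundancy to the sub-blocks with larger norms, and lets the PS recover a least-squares approximation from whichever coded results arrive before the deadline. The two-class importance split, the per-class redundancy levels, the biased generator rows, and the straggler rate are illustrative assumptions, not the exact UEP construction analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem setup: approximate y = A @ x, with A split into row sub-blocks.
m, n, n_blocks = 120, 40, 6
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
blocks = np.split(A, n_blocks, axis=0)           # sub-blocks A_1, ..., A_K

# Rank sub-blocks by norm: larger-norm blocks receive more protection.
norms = np.array([np.linalg.norm(B) for B in blocks])
important = set(np.argsort(-norms)[: n_blocks // 2])   # assumed two-class split
redundancy = {True: 3, False: 1}                        # assumed coded copies per class

# Random linear encoding: each worker receives one random combination of the
# sub-blocks; important blocks appear in more coded combinations (UEP).
workers = []
for i in range(n_blocks):
    for _ in range(redundancy[i in important]):
        g = rng.standard_normal(n_blocks)        # random generator row
        g[i] += n_blocks                          # bias the row toward block i
        coded = sum(gk * Bk for gk, Bk in zip(g, blocks))
        workers.append((g, coded))

# Each worker computes its coded block times x; stragglers miss the deadline.
results = [(g, C @ x) for g, C in workers]
received = [r for r in results if rng.random() > 0.4]   # ~40% stragglers (assumed)

# PS decoding: least-squares estimates of the per-block products A_i @ x from
# the coded results that arrived, stacked into an approximation of A @ x.
G = np.stack([g for g, _ in received])                   # (n_received, n_blocks)
Y = np.stack([y for _, y in received])                   # (n_received, m // n_blocks)
block_products = np.linalg.lstsq(G, Y, rcond=None)[0]    # row i estimates A_i @ x
y_hat = block_products.reshape(-1)

print("relative error:", np.linalg.norm(y_hat - A @ x) / np.linalg.norm(A @ x))
```

If enough coded results arrive for the generator matrix to have full column rank, the decoding is exact; otherwise the least-squares step yields an approximation whose error is smaller for the heavily protected, large-norm blocks, which is the qualitative behavior the UEP scheme is designed to exploit.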