We develop a family of parallel algorithms for the SpKAdd operation, which adds a collection of k sparse matrices. SpKAdd is a much-needed operation in many applications, including distributed-memory sparse matrix-matrix multiplication (SpGEMM), streaming accumulation of graphs, and algorithmic sparsification of gradient updates in deep learning. While adding two sparse matrices is a common operation in Matlab, Python, Intel MKL, and various GraphBLAS libraries, these implementations do not perform well when adding a large collection of sparse matrices. We develop a series of algorithms using tree merging, heap, sparse accumulator, hash table, and sliding hash table data structures. Among them, the hash-based algorithms attain the theoretical lower bounds on both the computational and I/O complexities and perform the best in practice. The newly developed hash SpKAdd makes the computation of a distributed-memory SpGEMM algorithm at least 2x faster than the previous state-of-the-art algorithms.
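To make the hash-table idea concrete, the sketch below sums k sparse CSR matrices using a per-row hash table (a Python dict) as the accumulator. This is an illustrative sketch of the general hash-based accumulation technique, not the paper's parallel implementation; the function name `spkadd_hash` and the use of SciPy are assumptions for demonstration.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random

def spkadd_hash(matrices):
    """Sum k sparse CSR matrices of identical shape using a per-row
    hash-table (dict) accumulator. Illustrative sketch only, not the
    paper's parallel algorithm."""
    n_rows, n_cols = matrices[0].shape
    indptr = [0]
    indices, data = [], []
    for i in range(n_rows):
        acc = {}  # hash table: column index -> accumulated value
        for A in matrices:
            start, end = A.indptr[i], A.indptr[i + 1]
            for j, v in zip(A.indices[start:end], A.data[start:end]):
                acc[j] = acc.get(j, 0.0) + v
        # Emit the accumulated row in sorted column order (CSR convention).
        for j in sorted(acc):
            indices.append(j)
            data.append(acc[j])
        indptr.append(len(indices))
    return csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))

# Usage: add k random sparse matrices and verify against pairwise summation.
rng = np.random.default_rng(0)
mats = [sparse_random(50, 40, density=0.05, format="csr", random_state=rng)
        for _ in range(8)]
result = spkadd_hash(mats)
expected = sum(mats[1:], mats[0])
assert np.allclose(result.toarray(), expected.toarray())
```

A dict gives O(1) expected insertion per nonzero, so the total work is proportional to the combined number of nonzeros across all k inputs, which is the flavor of lower bound the abstract refers to; the pairwise `sum` baseline instead pays for repeated intermediate merges.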