Computing the product of two sparse matrices (SpGEMM) is a fundamental operation in various combinatorial and graph algorithms, as well as in various bioinformatics and data analytics applications for computing inner-product similarities. For an important class of algorithms, only a subset of the output entries is needed, and the resulting operation is known as Masked SpGEMM since a subset of the output entries is considered to be "masked out". Existing algorithms for Masked SpGEMM usually do not consider the mask as part of the multiplication: they either first compute a regular SpGEMM and then apply the mask, or perform a sparse inner product only for the output entries that are not masked out. In this work, we investigate various novel algorithms and data structures for this rather challenging and important computation, and provide guidelines on how to design a fast Masked SpGEMM for shared-memory architectures. Our evaluations show that factors such as matrix and mask density, mask structure, and cache behavior play a vital role in attaining high performance for Masked SpGEMM. We evaluate our algorithms on a large number of matrices using several real-world benchmarks and show that, in most cases, our algorithms significantly outperform state-of-the-art Masked SpGEMM implementations.
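To make the two baseline strategies mentioned above concrete, here is a minimal illustrative sketch (not the paper's implementation) that computes C = (A @ B) restricted to a mask M. Sparse matrices are represented as plain dicts of rows, a simplifying assumption for exposition; real implementations would use CSR or similar formats.

```python
# Hypothetical sketch of Masked SpGEMM: C = (A @ B) restricted to mask M.
# A sparse matrix is a dict of rows: {row_index: {col_index: value}}.

def masked_spgemm(A, B, M):
    """Mask-driven inner-product formulation: for every entry (i, j)
    present in the mask, accumulate sum_k A[i][k] * B[k][j]."""
    C = {}
    for i, mask_row in M.items():
        a_row = A.get(i, {})
        for j in mask_row:
            dot = sum(a_ik * B.get(k, {}).get(j, 0.0)
                      for k, a_ik in a_row.items())
            if dot != 0.0:
                C.setdefault(i, {})[j] = dot
    return C

def spgemm_then_mask(A, B, M):
    """The other baseline: compute an unmasked row-by-row SpGEMM
    (Gustavson-style expansion), then discard entries not in the mask."""
    C = {}
    for i, a_row in A.items():
        acc = {}
        for k, a_ik in a_row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        kept = {j: v for j, v in acc.items()
                if j in M.get(i, {}) and v != 0.0}
        if kept:
            C[i] = kept
    return C
```

The first routine does work proportional to the mask, which pays off when the mask is sparse; the second wastes effort computing entries that are immediately discarded, which is exactly the inefficiency the masked formulation aims to avoid.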