Computing the product of two sparse matrices (SpGEMM) is a fundamental operation in various combinatorial and graph algorithms, as well as in various bioinformatics and data analytics applications for computing inner-product similarities. For an important class of algorithms, only a subset of the output entries is needed, and the resulting operation is known as Masked SpGEMM since a subset of the output entries is considered to be "masked out". Existing algorithms for Masked SpGEMM usually do not consider the mask as part of the multiplication: they either first compute a regular SpGEMM and then apply the mask, or perform a sparse inner product only for the output entries that are not masked out. In this work, we investigate various novel algorithms and data structures for this rather challenging and important computation, and provide guidelines on how to design a fast Masked SpGEMM for shared-memory architectures. Our evaluations show that factors such as matrix and mask density, mask structure, and cache behavior play a vital role in attaining high performance for Masked SpGEMM. We evaluate our algorithms on a large number of matrices using several real-world benchmarks and show that, in most cases, our algorithms significantly outperform state-of-the-art Masked SpGEMM implementations.
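To make the two baseline strategies mentioned above concrete, here is a minimal illustrative sketch (not the paper's implementation) that computes C = (A @ B) restricted to a mask M. Sparse matrices are represented as plain dicts of rows, a simplifying assumption for exposition; real implementations would use CSR or similar formats.

```python
# Hypothetical sketch of Masked SpGEMM: C = (A @ B) restricted to mask M.
# A sparse matrix is a dict of rows: {row_index: {col_index: value}}.

def masked_spgemm(A, B, M):
    """Mask-driven inner-product formulation: for every entry (i, j)
    present in the mask, accumulate sum_k A[i][k] * B[k][j]."""
    C = {}
    for i, mask_row in M.items():
        a_row = A.get(i, {})
        for j in mask_row:
            dot = sum(a_ik * B.get(k, {}).get(j, 0.0)
                      for k, a_ik in a_row.items())
            if dot != 0.0:
                C.setdefault(i, {})[j] = dot
    return C

def spgemm_then_mask(A, B, M):
    """The other baseline: compute an unmasked row-by-row SpGEMM
    (Gustavson-style expansion), then discard entries not in the mask."""
    C = {}
    for i, a_row in A.items():
        acc = {}
        for k, a_ik in a_row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        kept = {j: v for j, v in acc.items()
                if j in M.get(i, {}) and v != 0.0}
        if kept:
            C[i] = kept
    return C
```

The first routine does work proportional to the mask, which pays off when the mask is sparse; the second wastes effort computing entries that are immediately discarded, which is exactly the inefficiency the masked formulation aims to avoid.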