Sparse GEneral Matrix-Matrix multiplication (SpGEMM), $C = A \times B$, is a fundamental routine extensively used in domains such as machine learning and graph analytics. Despite its relevance, the efficient execution of SpGEMM on vector architectures is a relatively unexplored topic. The most recent algorithm for running SpGEMM on these architectures is based on the SParse Accumulator (SPA) approach. Because it computes the columns of $C$ one by one, it is relatively efficient for sparse matrices with several tens of non-zero coefficients per column. However, when matrices contain just a few non-zero coefficients per column, this state-of-the-art algorithm cannot fully exploit long vector architectures. To overcome this issue we propose the SPA paRallel with Sorting (SPARS) algorithm, which, among other optimizations, computes several columns of $C$ in parallel, and the HASH algorithm, which uses dynamically sized hash tables to store intermediate output values. To combine the efficiency of SPA for relatively dense matrix blocks with the high performance that SPARS and HASH deliver for very sparse matrix blocks, we propose H-SPA(t) and H-HASH(t), which dynamically switch between different algorithms. Over a set of 40 sparse matrices obtained from the SuiteSparse Matrix Collection, H-SPA(t) and H-HASH(t) obtain average speed-ups of 1.24$\times$ and 1.57$\times$, respectively, with respect to SPA. For the 22 sparsest matrices, they deliver average speed-ups of 1.42$\times$ and 1.99$\times$, respectively.
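To make the column-by-column SPA idea concrete, the following is a minimal scalar sketch in Python. It is an illustration under stated assumptions, not the vectorized implementation evaluated in this work: it assumes both inputs are stored in CSC format (column pointers, row indices, values), and the function name `spgemm_spa_csc` and its parameters are hypothetical.

```python
def spgemm_spa_csc(n_rows, a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Compute C = A * B with a sparse accumulator (SPA), one C column at a time.

    Assumption: A and B are given in CSC form as (ptr, idx, val) triples,
    and n_rows is the number of rows of A (and therefore of C).
    Returns C in the same CSC form.
    """
    n_cols_b = len(b_ptr) - 1
    spa_val = [0.0] * n_rows      # dense accumulator for the current C column
    spa_flag = [False] * n_rows   # marks rows already touched in this column
    c_ptr, c_idx, c_val = [0], [], []

    for j in range(n_cols_b):
        touched = []  # row indices that are non-zero in column j of C
        # Column j of C is a linear combination of A columns,
        # weighted by the non-zeros of column j of B.
        for p in range(b_ptr[j], b_ptr[j + 1]):
            k = b_idx[p]
            b_kj = b_val[p]
            for q in range(a_ptr[k], a_ptr[k + 1]):
                i = a_idx[q]
                if not spa_flag[i]:
                    spa_flag[i] = True
                    touched.append(i)
                spa_val[i] += a_val[q] * b_kj
        # Flush the accumulator into C and reset only the touched entries.
        for i in sorted(touched):
            c_idx.append(i)
            c_val.append(spa_val[i])
            spa_val[i] = 0.0
            spa_flag[i] = False
        c_ptr.append(len(c_idx))

    return c_ptr, c_idx, c_val


# Usage: A = [[1, 0], [2, 3]], B = [[4, 0], [0, 5]], both in CSC.
c_ptr, c_idx, c_val = spgemm_spa_csc(
    2,
    [0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0],
    [0, 1, 2], [0, 1], [4.0, 5.0],
)
```

In this sketch the inner loop over an $A$ column is the part a vector unit would process, so its length is bounded by the number of non-zeros in that column. This is consistent with the observation above that the approach suits matrices with tens of non-zeros per column and underuses long vectors when columns hold only a few.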