We develop a family of parallel algorithms for the SpKAdd operation, which adds a collection of k sparse matrices. SpKAdd is a much-needed operation in many applications, including distributed-memory sparse matrix-matrix multiplication (SpGEMM), streaming accumulation of graphs, and algorithmic sparsification of gradient updates in deep learning. While adding two sparse matrices is a common operation in Matlab, Python, Intel MKL, and various GraphBLAS libraries, these implementations do not perform well when adding a large collection of sparse matrices. We develop a series of algorithms using tree merging, heap, sparse accumulator, hash table, and sliding hash table data structures. Among them, the hash-based algorithms attain the theoretical lower bounds on both the computational and I/O complexities and perform the best in practice. The newly developed hash SpKAdd makes the computation of a distributed-memory SpGEMM algorithm at least 2x faster than the previous state-of-the-art algorithms.
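To make the hash-table idea concrete, the sketch below sums k sparse CSR matrices using a per-row hash table (a Python dict) as the accumulator. This is an illustrative sketch of the general hash-based accumulation technique, not the paper's parallel implementation; the function name `spkadd_hash` and the use of SciPy are assumptions for demonstration.

```python
import numpy as np
from scipy.sparse import csr_matrix, random as sparse_random

def spkadd_hash(matrices):
    """Sum k sparse CSR matrices of identical shape using a per-row
    hash-table (dict) accumulator. Illustrative sketch only, not the
    paper's parallel algorithm."""
    n_rows, n_cols = matrices[0].shape
    indptr = [0]
    indices, data = [], []
    for i in range(n_rows):
        acc = {}  # hash table: column index -> accumulated value
        for A in matrices:
            start, end = A.indptr[i], A.indptr[i + 1]
            for j, v in zip(A.indices[start:end], A.data[start:end]):
                acc[j] = acc.get(j, 0.0) + v
        # Emit the accumulated row in sorted column order (CSR convention).
        for j in sorted(acc):
            indices.append(j)
            data.append(acc[j])
        indptr.append(len(indices))
    return csr_matrix((data, indices, indptr), shape=(n_rows, n_cols))

# Usage: add k random sparse matrices and verify against pairwise summation.
rng = np.random.default_rng(0)
mats = [sparse_random(50, 40, density=0.05, format="csr", random_state=rng)
        for _ in range(8)]
result = spkadd_hash(mats)
expected = sum(mats[1:], mats[0])
assert np.allclose(result.toarray(), expected.toarray())
```

A dict gives O(1) expected insertion per nonzero, so the total work is proportional to the combined number of nonzeros across all k inputs, which is the flavor of lower bound the abstract refers to; the pairwise `sum` baseline instead pays for repeated intermediate merges.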