Graph analytics techniques based on spectral methods process extremely large sparse matrices with millions or even billions of non-zero values. Behind these algorithms lies the Top-K sparse eigenproblem: the computation of the K largest eigenvalues and their associated eigenvectors. In this work, we leverage GPUs to scale the Top-K sparse eigenproblem to bigger matrices than previously achieved while also providing state-of-the-art execution times. We can transparently partition the computation across multiple GPUs, process out-of-core matrices, and trade off precision against execution time using mixed-precision floating-point arithmetic. Overall, we are 67 times faster than the highly optimized ARPACK library running on a 104-thread CPU and 1.9 times faster than a recent FPGA hardware design. We also show that mixed-precision floating-point arithmetic improves execution time by 50% over double precision and is 12 times more accurate than single-precision floating-point arithmetic.
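To make the Top-K sparse eigenproblem concrete, the following is a minimal CPU-only sketch, not the GPU solver described above. It assumes a symmetric matrix stored as coordinate (COO) triples and swaps in plain power iteration with deflation as the simplest technique that yields the K eigenvalues of largest magnitude. The names `spmv`, `deflate`, and `top_k_eigen` are hypothetical helpers introduced only for illustration, and the method assumes the top eigenvalues have distinct magnitudes.

```python
import math
import random


def spmv(n, entries, x):
    """Multiply an n x n sparse matrix, given as (row, col, value) triples, by x."""
    y = [0.0] * n
    for i, j, v in entries:
        y[i] += v * x[j]
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def deflate(x, pairs):
    """Remove from x its components along the eigenvectors found so far."""
    for _, u in pairs:
        c = dot(x, u)
        x = [xi - c * ui for xi, ui in zip(x, u)]
    return x


def normalize(x):
    norm = math.sqrt(dot(x, x))
    if norm == 0.0:
        raise ValueError("iterate collapsed to zero; matrix rank may be below k")
    return [xi / norm for xi in x]


def top_k_eigen(n, entries, k, iters=300, seed=0):
    """Return k (eigenvalue, eigenvector) pairs of a symmetric sparse matrix.

    Illustrative assumption: power iteration with deflation, which needs
    the leading eigenvalues to have distinct magnitudes.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        x = normalize(deflate([rng.uniform(-1.0, 1.0) for _ in range(n)], pairs))
        lam = 0.0
        for _ in range(iters):
            y = spmv(n, entries, x)          # the sparse matrix-vector product dominates the cost
            lam = dot(x, y)                  # Rayleigh quotient estimate of the eigenvalue
            x = normalize(deflate(y, pairs))  # stay orthogonal to earlier eigenvectors
        pairs.append((lam, x))
    return pairs


# Small symmetric example whose exact eigenvalues are 5, 3 and 1.
entries = [(0, 0, 2.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 2.0), (2, 2, 5.0)]
pairs = top_k_eigen(3, entries, k=2)
print([round(lam, 6) for lam, _ in pairs])
```

Each iteration is dominated by one sparse matrix-vector product, which is the kind of operation that can be partitioned across devices or computed in reduced precision. A production solver would typically use a Krylov-subspace method rather than this simple loop.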