Sparse matrix multiplication operators (i.e., SpMM and SDDMM) are widely used in deep learning and scientific computing. Modern accelerators commonly provide both Tensor Core Units (TCUs) and CUDA cores, and either can be used to accelerate sparse operators. TCUs excel at structured matrix computations, whereas CUDA cores offer greater programming flexibility. However, how to combine the two resources to maximize sparse-operator performance remains unclear. In this work, we first identify the source of performance gains in hybrid computation and systematically analyze the complementary strengths of the two resources. Motivated by this analysis, we propose Libra, a holistic framework that efficiently leverages heterogeneous computing resources to accelerate both SpMM and SDDMM. Specifically, Libra introduces a 2D-aware workload distribution method, which accounts for both locality and utilization, to precisely identify the optimal task mapping. The mapping simultaneously exploits the data reuse capabilities of TCUs and the flexibility of CUDA cores to minimize computational redundancy. Libra further incorporates hybrid load balancing, occupancy-aware task scheduling, and efficient kernel implementations to maximize execution efficiency. Extensive experiments on H100 and RTX 4090 GPUs demonstrate that Libra significantly outperforms all 12 state-of-the-art baselines, e.g., with average speedups of 1.77x over FlashSparse, 1.73x over RoDe, and 2.9x over DGL for end-to-end GNN applications. Libra opens up a new perspective on sparse operator acceleration by making full use of heterogeneous GPU resources.
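For readers unfamiliar with the two operators named in the abstract, the following is a minimal reference sketch in plain Python, not Libra's kernels. It assumes the sparse matrix A is stored in CSR form (`indptr`, `indices`, `data`); the function names `spmm_csr` and `sddmm_csr` are hypothetical. The SDDMM variant shown scales each sampled dot product by the stored value of A, which is one common convention and an assumption here, since the abstract does not define the operator.

```python
def spmm_csr(indptr, indices, data, B):
    """SpMM: C = A @ B, with A sparse (CSR) and B dense (list of rows)."""
    n_rows = len(indptr) - 1
    n_cols = len(B[0])
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Visit only the stored nonzeros of row i of A.
        for p in range(indptr[i], indptr[i + 1]):
            k, a = indices[p], data[p]
            for j in range(n_cols):
                C[i][j] += a * B[k][j]
    return C


def sddmm_csr(indptr, indices, data, X, Y):
    """SDDMM: for each stored nonzero (i, j) of A, compute
    data[p] * dot(X[i], Y[j]). The result reuses A's sparsity pattern.
    Assumption: the dense product is scaled by A's stored value."""
    out = [0.0] * len(data)
    for i in range(len(indptr) - 1):
        for p in range(indptr[i], indptr[i + 1]):
            j = indices[p]
            out[p] = data[p] * sum(x * y for x, y in zip(X[i], Y[j]))
    return out


# A = [[1, 0, 2],
#      [0, 0, 3]]  stored in CSR form
indptr, indices, data = [0, 2, 3], [0, 2, 2], [1.0, 2.0, 3.0]

B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # dense, 3 x 2
C = spmm_csr(indptr, indices, data, B)      # dense, 2 x 2

X = [[1.0, 0.0], [0.0, 1.0]]                # dense, 2 x 2 (one row per row of A)
Y = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # dense, 3 x 2 (one row per column of A)
S = sddmm_csr(indptr, indices, data, X, Y)  # one value per nonzero of A
```

In both loops the work per row depends on how many nonzeros that row holds and where they fall, which is the irregularity that makes the choice between structured units such as TCUs and flexible CUDA cores non-trivial.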


