Sparse matrices and linear algebra are at the heart of scientific simulations. Over the years, more than 70 sparse matrix storage formats have been developed, targeting a wide range of hardware architectures and matrix types. Each format exploits the particular strengths of an architecture or the specific sparsity patterns of the matrices. In this work, we explore the suitability of storage formats such as COO, CSR and DIA for emerging architectures such as AArch64 CPUs and FPGAs. In addition, we detail hardware-specific optimisations for these targets and evaluate the potential of each contribution to be integrated into Morpheus, a modern library that currently provides an abstraction of sparse matrices across x86 CPUs and NVIDIA/AMD GPUs. Finally, we validate our work by comparing the performance of the Morpheus-enabled HPCG benchmark against vendor-optimised implementations.