As hardware architectures evolve in the push towards exascale, developing Computational Science and Engineering (CSE) applications depends on performance-portable approaches for sustainable software development. This paper describes one aspect of performance portability: developing a portable library of kernels that serves the needs of several CSE applications and software frameworks. We describe Kokkos Kernels, a library of sparse linear algebra, dense linear algebra, and graph kernels. We describe the design principles of such a library and demonstrate its portable performance using selected kernels. Specifically, we demonstrate the performance of four sparse kernels, three dense batched kernels, two graph kernels, and one team-level algorithm.