This paper lays out insights and opportunities for implementing higher-precision matrix-matrix multiplication (GEMM) in terms of lower-precision high-performance GEMM. The driving case study approximates double-double precision (FP64x2) GEMM in terms of double precision (FP64) GEMM, leveraging how the BLAS-like Library Instantiation Software (BLIS) framework refactors the Goto Algorithm. Building on this, it is shown how approximate FP64x2 accuracy can be attained by casting the computation as ten ``cascading'' FP64 GEMMs. Preliminary performance and accuracy experiments give promising results. The demonstrated techniques open up new research directions for more general cascading of higher-precision computation in terms of lower-precision computation for GEMM-like functionality.
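To make the idea of obtaining higher precision from FP64 GEMMs concrete, the following is a minimal sketch of one slice-based scheme. It is not the algorithm of the paper. The slice width `BITS`, the slice count `SLICES`, the rule for which slice pairs to keep, and every function name are assumptions made for illustration. Each double-double operand is cut into slices with few enough bits that an FP64 GEMM on two slices incurs no rounding for small inner dimensions, and the partial products are then accumulated into a (hi, lo) pair. With four slices, keeping the pairs with `i + j < SLICES` happens to give ten FP64 GEMMs. Whether the paper arrives at its ten GEMMs in this way is not something the abstract states.

```python
from fractions import Fraction
import math

BITS = 20    # bits kept per slice (assumption of this sketch)
SLICES = 4   # slices per operand (assumption of this sketch)

def gemm(A, B):
    """Plain FP64 matrix product; stands in for an optimized FP64 GEMM."""
    C = [[0.0] * len(B[0]) for _ in A]
    for i, row in enumerate(A):
        for p, a in enumerate(row):
            for j, b in enumerate(B[p]):
                C[i][j] += a * b
    return C

def two_sum(a, b):
    """Knuth's error-free addition: returns (s, e) with s + e == a + b exactly."""
    s = a + b
    t = s - a
    return s, (a - (s - t)) + (b - t)

def to_dd(x):
    """Approximate a rational by a double-double pair (hi, lo)."""
    hi = float(x)
    return hi, float(x - Fraction(hi))

def slice_matrix(X):
    """Split a matrix of (hi, lo) pairs into SLICES FP64 matrices.

    Each slice entry is a small integer (about BITS bits) times a shared power
    of two. Slicing uses exact rationals for clarity, not speed.
    """
    R = [[Fraction(hi) + Fraction(lo) for hi, lo in row] for row in X]
    mu = max(abs(r) for row in R for r in row)
    e = math.frexp(float(mu))[1]              # every |entry| < 2**e
    out = []
    for i in range(1, SLICES + 1):
        q = Fraction(2) ** (e - i * BITS)     # grid spacing of slice i
        S = [[round(r / q) * q for r in row] for row in R]
        R = [[r - s for r, s in zip(rr, ss)] for rr, ss in zip(R, S)]
        out.append([[float(s) for s in row] for row in S])
    return out

def cascaded_gemm(A, B):
    """Approximate a double-double product using FP64 GEMMs on slices."""
    As, Bs = slice_matrix(A), slice_matrix(B)
    m, n = len(A), len(B[0])
    hi = [[0.0] * n for _ in range(m)]
    lo = [[0.0] * n for _ in range(m)]
    count = 0
    # Keep only slice pairs with i + j < SLICES; the rest are much smaller.
    for i in range(SLICES):
        for j in range(SLICES - i):
            P = gemm(As[i], Bs[j])
            count += 1
            for r in range(m):
                for c in range(n):
                    hi[r][c], err = two_sum(hi[r][c], P[r][c])
                    lo[r][c] += err
    C = [[two_sum(hi[r][c], lo[r][c]) for c in range(n)] for r in range(m)]
    return C, count

def max_rel_error(C, exact):
    """Largest relative error of C (floats or (hi, lo) pairs) vs exact rationals."""
    worst = Fraction(0)
    for row, erow in zip(C, exact):
        for v, ex in zip(row, erow):
            val = sum(map(Fraction, v)) if isinstance(v, tuple) else Fraction(v)
            worst = max(worst, abs(val - ex) / abs(ex))
    return float(worst)

def demo():
    A = [[to_dd(Fraction(1, 3)), to_dd(Fraction(2, 7))],
         [to_dd(Fraction(5, 11)), to_dd(Fraction(1, 13))]]
    B = [[to_dd(Fraction(3, 17)), to_dd(Fraction(1, 19))],
         [to_dd(Fraction(7, 23)), to_dd(Fraction(2, 29))]]
    val = lambda X: [[Fraction(h) + Fraction(l) for h, l in row] for row in X]
    Ax, Bx = val(A), val(B)
    exact = [[sum(a * b for a, b in zip(row, col)) for col in zip(*Bx)] for row in Ax]
    plain = gemm([[h for h, _ in row] for row in A], [[h for h, _ in row] for row in B])
    casc, count = cascaded_gemm(A, B)
    return max_rel_error(plain, exact), max_rel_error(casc, exact), count

if __name__ == "__main__":
    print(demo())
```

Calling `demo()` returns the relative error of a plain FP64 GEMM on the high parts, the relative error of the cascaded result, and the number of FP64 GEMMs issued. In this sketch the dropped slice pairs and the unsliced remainder are roughly 2^-80 relative to the operand scales, which sets the accuracy that can be expected. A production version would slice in floating point and hand each product to an optimized GEMM kernel.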