A Hardware-aware and Stable Orthogonalization Framework

Orthogonalization is an essential building block of Krylov space methods and accounts for a large portion of their computational time. Commonly used methods, such as the Gram-Schmidt method, treat projection and normalization separately and store the orthogonal basis explicitly. We treat orthogonalization and normalization together as a QR decomposition problem, to which we apply known algorithms, namely CholeskyQR and TSQR. This leads to methods that solve the orthogonalization problem with reduced communication costs while maintaining stability, and that store the orthogonal basis in a locally orthogonal representation. Furthermore, we present the novel method as a framework that allows us to combine different orthogonalization algorithms and use the best algorithm for each part of the hardware. After formulating the methods, we show their advantageous performance properties using a performance model that takes into account data transfers within compute nodes as well as message passing between compute nodes. The theoretical results are validated by numerical experiments.
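For readers unfamiliar with the two QR algorithms named above, the following is a minimal single-process sketch of their textbook forms in NumPy. It is not the formulation developed in the paper: the function names (`cholesky_qr`, `cholesky_qr2`, `tsqr`), the two-pass CholeskyQR variant, the block count, and the explicit assembly of Q are all assumptions made for illustration, and no actual inter-node communication takes place.

```python
import numpy as np


def cholesky_qr(X):
    """One CholeskyQR pass: X = Q R with R upper triangular."""
    G = X.T @ X                      # Gram matrix; in a distributed run this is the only global reduction
    R = np.linalg.cholesky(G).T      # G = R^T R, so R is upper triangular
    Q = np.linalg.solve(R.T, X.T).T  # Q = X R^{-1}, computed from R^T Q^T = X^T
    return Q, R


def cholesky_qr2(X):
    """Two CholeskyQR passes; the second pass improves the orthogonality of Q."""
    Q1, R1 = cholesky_qr(X)
    Q, R2 = cholesky_qr(Q1)
    return Q, R2 @ R1


def tsqr(X, num_blocks=4):
    """Single-level TSQR: a local QR per row block, then a QR of the stacked R factors.

    Assumes every row block has at least as many rows as X has columns.
    """
    n = X.shape[1]
    blocks = np.array_split(X, num_blocks, axis=0)
    local = [np.linalg.qr(B) for B in blocks]   # reduced QR, independent per block
    R_stack = np.vstack([R for _, R in local])  # shape (num_blocks * n, n)
    Q2, R = np.linalg.qr(R_stack)
    # Q is naturally held in factored form (local Q factors plus Q2);
    # it is assembled explicitly here only so the result can be checked.
    Q = np.vstack([Qi @ Q2[i * n:(i + 1) * n] for i, (Qi, _) in enumerate(local)])
    return Q, R
```

In both sketches the only step that couples all rows of `X` acts on small `n`-column matrices (the Gram matrix, or the stacked R factors), which is where the communication savings relative to a column-by-column Gram-Schmidt process come from. The factored form of Q in `tsqr` is one possible reading of a "locally orthogonal representation"; the paper's own definition may differ.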
