Krylov subspace methods are extensively used in scientific computing to solve large-scale linear systems. However, the performance of these iterative Krylov solvers on modern supercomputers is limited by expensive communication costs. The $s$-step strategy generates a series of $s$ Krylov vectors at a time to avoid communication. Asymptotically, the $s$-step approach can reduce communication latency by a factor of $s$. Unfortunately, in finite-precision arithmetic, the step size has to be kept small for stability. In this work, we tackle the numerical instabilities encountered in the $s$-step GMRES algorithm. By choosing an appropriate polynomial basis and block orthogonalization schemes, we construct a communication-avoiding $s$-step GMRES algorithm that automatically selects the optimal step size to ensure numerical stability. To further maximize communication savings, we introduce scaled Newton polynomials that can increase the step size $s$ to a few hundred for many problems. An initial step size estimator is also developed to efficiently choose the optimal step size for stability. The guaranteed stability of the proposed algorithm is demonstrated using numerical experiments. In the process, we also evaluate how the choice of polynomial and preconditioning affect the stability limit of the algorithm. Finally, we show parallel scalability on more than 14,000 cores in a distributed-memory setting. Perfectly linear scaling has been observed in both strong and weak scaling studies with negligible communication costs.
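To illustrate why the choice of polynomial basis matters for the step size, the following minimal sketch builds $s+1$ Krylov vectors with a monomial basis and with a Newton basis, then compares their condition numbers. It is not the algorithm of this work. The function names, the diagonal test matrix, the use of Leja-ordered Chebyshev points as shifts (in practice shifts are usually derived from Ritz values), and the per-vector normalization used as the "scaling" are all assumptions made for illustration.

```python
import numpy as np

def monomial_basis(A, v, s):
    """Columns span K_{s+1}(A, v) via v_{j+1} = A v_j, each column normalized."""
    V = np.empty((v.size, s + 1))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(s):
        w = A @ V[:, j]
        V[:, j + 1] = w / np.linalg.norm(w)
    return V

def newton_basis(A, v, shifts):
    """Columns span the same space via v_{j+1} = (A - theta_j I) v_j, normalized.

    The normalization is a generic scaling chosen for this sketch (assumption).
    """
    V = np.empty((v.size, len(shifts) + 1))
    V[:, 0] = v / np.linalg.norm(v)
    for j, theta in enumerate(shifts):
        w = A @ V[:, j] - theta * V[:, j]
        V[:, j + 1] = w / np.linalg.norm(w)
    return V

def leja_order(points):
    """Greedy Leja ordering: each new point maximizes the product of
    distances to the points already chosen."""
    pts = [float(p) for p in points]
    ordered = [max(pts, key=abs)]
    pts.remove(ordered[0])
    while pts:
        nxt = max(pts, key=lambda z: np.prod([abs(z - t) for t in ordered]))
        ordered.append(nxt)
        pts.remove(nxt)
    return ordered

def chebyshev_points(a, b, s):
    """s Chebyshev points on the interval [a, b]."""
    k = np.arange(s)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * k + 1) * np.pi / (2 * s))

if __name__ == "__main__":
    n, s = 200, 12
    A = np.diag(np.linspace(1.0, 100.0, n))  # hypothetical test matrix
    v = np.ones(n)
    shifts = leja_order(chebyshev_points(1.0, 100.0, s))
    print("cond(monomial):", np.linalg.cond(monomial_basis(A, v, s)))
    print("cond(Newton):  ", np.linalg.cond(newton_basis(A, v, shifts)))
```

A better-conditioned basis is what allows a larger step size $s$ before the block orthogonalization loses accuracy. In this sketch the spectral interval is assumed to be known in advance, which a real solver would have to estimate.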