This paper presents a parallel preconditioning approach based on incomplete LU (ILU) factorizations in the framework of domain decomposition (DD) for general sparse linear systems. We focus on distributed-memory parallel architectures, specifically those equipped with graphics processing units (GPUs). In addition to block Jacobi, we present general-purpose two-level ILU approaches based on the Schur complement, and we consider different strategies for solving the coarse-level reduced system. These strategies are combined with modified ILU methods in the construction of the coarse-level operator, in order to remove smooth errors effectively. We leverage available GPU-based sparse matrix kernels to accelerate both the setup and the solve phases of the proposed ILU preconditioners. We evaluate the efficiency of the proposed methods as smoothers for algebraic multigrid (AMG) and as preconditioners for Krylov subspace methods, on challenging anisotropic diffusion problems and on a collection of general sparse matrices.
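To make the simplest of these ideas concrete, the following is a minimal serial sketch of a block Jacobi ILU preconditioner applied inside GMRES. It is an illustration under stated assumptions, not the implementation described in the paper: it runs on the CPU, SciPy's `spilu` stands in for the GPU ILU kernels, the contiguous index-range partition replaces a real domain decomposition, and the names `anisotropic_laplacian` and `block_jacobi_ilu`, the grid size, and the drop tolerance are all hypothetical choices. The two-level Schur complement variants are not covered here.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def anisotropic_laplacian(nx, ny, eps=1e-2):
    """5-point discretization of -eps*u_xx - u_yy on an nx-by-ny grid (assumed test problem)."""
    Tx = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(nx, nx))
    Ty = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(ny, ny))
    return (eps * sp.kron(sp.identity(ny), Tx) + sp.kron(Ty, sp.identity(nx))).tocsr()


def block_jacobi_ilu(A, n_blocks, drop_tol=1e-4):
    """Block Jacobi preconditioner: one independent ILU factorization per diagonal block.

    Blocks are contiguous index ranges, a stand-in for the subdomains of a
    real domain decomposition. Couplings between blocks are simply ignored.
    """
    A = sp.csr_matrix(A)
    n = A.shape[0]
    bounds = np.linspace(0, n, n_blocks + 1).astype(int)
    factors = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        local = A[lo:hi, lo:hi].tocsc()  # diagonal block owned by one "subdomain"
        factors.append((lo, hi, spla.spilu(local, drop_tol=drop_tol)))

    def apply(r):
        # Each block solve is independent, which is what makes the method parallel.
        r = np.asarray(r, dtype=float).ravel()
        z = np.empty_like(r)
        for lo, hi, ilu in factors:
            z[lo:hi] = ilu.solve(r[lo:hi])
        return z

    return spla.LinearOperator(A.shape, matvec=apply, dtype=float)


A = anisotropic_laplacian(32, 32)
b = np.ones(A.shape[0])
M = block_jacobi_ilu(A, n_blocks=4)

# Preconditioned restarted GMRES with SciPy's default tolerance.
x, info = spla.gmres(A, b, M=M, restart=50, maxiter=200)
print("converged:", info == 0, "relative residual:",
      np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

Because each block is factored and solved without reference to its neighbours, the preconditioner needs no communication, but it also discards the coupling across block boundaries. Recovering that coupling is what motivates a second level built on the interface unknowns, as in the Schur complement approaches mentioned above.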