Algorithms and data structures for matrix-free finite element operators with MPI-parallel sparse multi-vectors

Traditional solution approaches for problems in quantum mechanics scale as $\mathcal O(M^3)$, where $M$ is the number of electrons. Various methods have been proposed to address this issue and obtain linear scaling $\mathcal O(M)$. One promising formulation is the direct minimization of energy. Such methods take advantage of the physical localization of the solution, namely that the solution can be sought in terms of non-orthogonal orbitals with local support. In this work, a numerically efficient implementation of sparse parallel vectors within the open-source finite element library deal.II is proposed. The main algorithmic ingredient is the matrix-free evaluation of the Hamiltonian operator by cell-wise quadrature. Based on an a priori chosen support for each vector, we develop algorithms and data structures to perform (i) matrix-free sparse matrix-multivector products (SpMM), (ii) the projection of an operator onto a sparse sub-space (inner products), and (iii) post-multiplication of a sparse multivector with a square matrix. The node-level performance is analyzed using a roofline model. Our matrix-free implementation of finite element operators with sparse multivectors achieves a performance of 157 GFlop/s on the Intel Cascade Lake architecture. Strong and weak scaling results are reported for a typical benchmark problem using quadratic and quartic finite element bases.
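
The three operations (i) to (iii) can be illustrated with a minimal serial sketch. It is not the deal.II implementation and does not use its API. It assumes a 1D three-point stencil `tridiag(-1, 2, -1)` as a stand-in for the Hamiltonian, stores each column as a dictionary over its a priori chosen support, and truncates the results of (i) and (iii) to that support. The function names `h_row`, `spmm`, `project` and `post_multiply` are hypothetical, and the truncation rule is an assumption made for illustration.

```python
# Illustrative sketch only: names, stencil and truncation rule are assumptions,
# not the data structures or API of deal.II.

def h_row(col, r):
    """Row r of H @ col for H = tridiag(-1, 2, -1), evaluated without storing H."""
    return 2.0 * col.get(r, 0.0) - col.get(r - 1, 0.0) - col.get(r + 1, 0.0)

def spmm(X):
    """(i) Y = H X, each column truncated to the support of the input column."""
    return [{r: h_row(col, r) for r in col} for col in X]

def project(X):
    """(ii) M = X^T H X; entry (i, j) sums only over the support of column i."""
    return [[sum(xi[r] * h_row(xj, r) for r in xi) for xj in X] for xi in X]

def post_multiply(X, C):
    """(iii) Z = X C, column j truncated to the support of column j of X."""
    n = len(X)
    return [
        {r: sum(X[k].get(r, 0.0) * C[k][j] for k in range(n)) for r in X[j]}
        for j in range(n)
    ]

# Two localized columns with overlapping supports {0, 1, 2} and {2, 3, 4}.
X = [
    {0: 1.0, 1: 2.0, 2: 1.0},
    {2: 1.0, 3: 1.0, 4: 1.0},
]
C = [[1.0, 1.0],
     [0.0, 2.0]]

Y = spmm(X)
M = project(X)
Z = post_multiply(X, C)
```

In this sketch the cost of each operation depends only on the support sizes, not on the global number of rows, which is the property that localized orbitals are meant to exploit. The projected matrix `M` is computed exactly and is symmetric, whereas `Y` and `Z` are approximations whenever the exact result has entries outside the prescribed support.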
